daft.functions.length_bytes

length_bytes

length_bytes(expr: Expression) -> Expression

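Returns the length, in bytes, of each string in a UTF-8 string column. Characters that encode to more than one byte in UTF-8 contribute every one of those bytes, so the result can exceed the number of characters.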

Returns:

Name        Type        Description
Expression  Expression  A UInt64 expression with the byte length of each string.

Examples:

>>> import daft
>>> from daft.functions import length_bytes
>>> df = daft.from_pydict({"x": ["😉test", "hey̆", "baz"]})
>>> df = df.select(length_bytes(df["x"]))
>>> df.show()
╭────────╮
│ x      │
│ ---    │
│ UInt64 │
╞════════╡
│ 8      │
├╌╌╌╌╌╌╌╌┤
│ 5      │
├╌╌╌╌╌╌╌╌┤
│ 3      │
╰────────╯
(Showing first 3 of 3 rows)
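
The byte counts in the example can be cross-checked without Daft. The sketch below is plain Python, not part of the Daft API; the helper name `utf8_len` is hypothetical and exists only for illustration. It assumes the first string starts with the emoji U+1F609 and that the second ends with the combining breve U+0306, which is what the 8 and 5 in the output imply.

```python
# Plain-Python cross-check of the byte lengths shown above (Daft is not used here).
# Escapes are used so the exact code points are unambiguous.
values = [
    "\U0001F609test",  # emoji (4 bytes in UTF-8) + "test" (4 bytes)
    "hey\u0306",       # "hey" (3 bytes) + combining breve (2 bytes)
    "baz",             # ASCII only, 1 byte per character
]


def utf8_len(s: str) -> int:
    """Hypothetical helper: number of bytes in the UTF-8 encoding of s."""
    return len(s.encode("utf-8"))


byte_lengths = [utf8_len(s) for s in values]
char_lengths = [len(s) for s in values]  # Python counts code points

print(byte_lengths)  # [8, 5, 3]
print(char_lengths)  # [5, 4, 3]
```

The gap between the two lists is why a byte length and a character length are not interchangeable for non-ASCII data.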
Source code in daft/functions/str.py
def length_bytes(expr: Expression) -> Expression:
    """Retrieves the length for a UTF-8 string column in bytes.

    Returns:
        Expression: a UInt64 expression with the length of each string

    Examples:
        >>> import daft
        >>> from daft.functions import length_bytes
        >>> df = daft.from_pydict({"x": ["😉test", "hey̆", "baz"]})
        >>> df = df.select(length_bytes(df["x"]))
        >>> df.show()
        ╭────────╮
        │ x      │
        │ ---    │
        │ UInt64 │
        ╞════════╡
        │ 8      │
        ├╌╌╌╌╌╌╌╌┤
        │ 5      │
        ├╌╌╌╌╌╌╌╌┤
        │ 3      │
        ╰────────╯
        <BLANKLINE>
        (Showing first 3 of 3 rows)

    """
    return Expression._call_builtin_scalar_fn("length_bytes", expr)