daft.functions.slice

slice

slice(expr: Expression, start: int | Expression, end: int | Expression | None = None) -> Expression

Get a subset of each list or binary value.

Parameters:

expr (Expression, required): List or binary expression to slice.

start (int | Expression, required): Index or column of indices. The slice includes elements starting from this index. If start is negative, it is an offset from the end of the value.

end (int | Expression | None, default None): Index or column of indices. The slice excludes elements from this index onwards. If end is negative, it is an offset from the end of the value. If not provided, the slice includes elements up to the end of the value. If start > end, an empty slice is produced.

Returns:

Expression: An expression with the same type as the input.
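
The start and end rules above can be sketched with a small pure-Python model. This is only a mental model, not Daft's implementation: `model_slice` and `_per_row` are hypothetical helpers, and the sketch assumes each row follows Python's own slicing rules, which agrees with the documented examples on this page. Null rows and out-of-range indices are not modeled.

```python
from typing import List, Optional, Sequence, Union

Row = Union[list, bytes]
Index = Union[int, Sequence[int], None]


def _per_row(index: Index, n: int) -> List[Optional[int]]:
    """Broadcast a scalar index to n rows, or pass a per-row column through."""
    if index is None or isinstance(index, int):
        return [index] * n
    if len(index) != n:
        raise ValueError("a column of indices needs one entry per row")
    return list(index)


def model_slice(rows: Sequence[Row], start: Index, end: Index = None) -> List[Row]:
    """Hypothetical model of slice(): applies row[start:end] to every row.

    Assumption: per-row behavior matches Python slicing, so negative
    indices count from the end and start > end yields an empty value.
    """
    starts = _per_row(start, len(rows))
    ends = _per_row(end, len(rows))
    # Slicing a list gives a list and slicing bytes gives bytes,
    # so each output row has the same type as its input row.
    return [row[s:e] for row, s, e in zip(rows, starts, ends)]


print(model_slice([[1, 2, 3], [4, 5, 6, 7], [8]], 1, -1))  # [[2], [5, 6], []]
print(model_slice([[1, 2, 3]], 2, 1))                      # [[]]
```

Passing a list for `start` or `end` plays the role of a column of indices in this model: each row is sliced with its own bounds.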

Note

expr[start:end] is equivalent to expr.slice(start, end).
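
To illustrate how bracket syntax can map onto a `slice` method, here is a toy sketch. `ToyColumn` is a hypothetical stand-in written for this example; it is not a Daft class and says nothing about how Daft wires this up internally. It only shows the general Python mechanism, where `obj[a:b]` reaches `__getitem__` as a built-in `slice` object.

```python
class ToyColumn:
    """Hypothetical stand-in for a column of list or binary values (not Daft)."""

    def __init__(self, rows):
        self.rows = list(rows)

    def slice(self, start, end=None):
        # Apply the same bounds to every row.
        return ToyColumn(row[start:end] for row in self.rows)

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("this toy only models slice syntax")
        if key.step is not None:
            raise ValueError("a step has no counterpart in slice(start, end)")
        # An omitted start, as in col[:2], is treated as 0 in this toy.
        start = 0 if key.start is None else key.start
        return self.slice(start, key.stop)


col = ToyColumn([[1, 2, 3], [4, 5, 6, 7], [8]])
print(col[1:-1].rows)         # [[2], [5, 6], []]
print(col.slice(1, -1).rows)  # [[2], [5, 6], []]
```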

Examples:

Slicing a list expression:

>>> import daft
>>> df = daft.from_pydict({"x": [[1, 2, 3], [4, 5, 6, 7], [8]]})
>>> df = df.select(df["x"].slice(1, -1))
>>> df.show()
╭─────────────╮
│ x           │
│ ---         │
│ List[Int64] │
╞═════════════╡
│ [2]         │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [5, 6]      │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ []          │
╰─────────────╯
(Showing first 3 of 3 rows)

Slicing a binary expression:

>>> df = daft.from_pydict({"x": [b"Hello World", b"\xff\xfe\x00", b"empty"]})
>>> df = df.select(df["x"].slice(1, -2))
>>> df.show()
╭─────────────╮
│ x           │
│ ---         │
│ Binary      │
╞═════════════╡
│ b"ello Wor" │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ b""         │
├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ b"mp"       │
╰─────────────╯
(Showing first 3 of 3 rows)
Source code in daft/functions/misc.py
def slice(expr: Expression, start: int | Expression, end: int | Expression | None = None) -> Expression:
    r"""Get a subset of each list or binary value.

    Args:
        expr: List or binary expression to slice.
        start: Index or column of indices. The slice will include elements starting from this index. If `start` is negative, it represents an offset from the end.
        end: Index or column of indices. The slice will not include elements from this index onwards. If `end` is negative, it represents an offset from the end. If not provided, the slice will include elements up to the end of the list. If start > end, an empty slice is produced.

    Returns:
        Expression: an expression with the same type as the input.

    Note:
        `expr[start:stop]` is also equivalent to `expr.slice(start, stop)`

    Examples:
        Slicing a list expression:
        >>> import daft
        >>> df = daft.from_pydict({"x": [[1, 2, 3], [4, 5, 6, 7], [8]]})
        >>> df = df.select(df["x"].slice(1, -1))
        >>> df.show()
        ╭─────────────╮
        │ x           │
        │ ---         │
        │ List[Int64] │
        ╞═════════════╡
        │ [2]         │
        ├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
        │ [5, 6]      │
        ├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
        │ []          │
        ╰─────────────╯
        <BLANKLINE>
        (Showing first 3 of 3 rows)

        Slicing a binary expression:
        >>> df = daft.from_pydict({"x": [b"Hello World", b"\xff\xfe\x00", b"empty"]})
        >>> df = df.select(df["x"].slice(1, -2))
        >>> df.show()
        ╭─────────────╮
        │ x           │
        │ ---         │
        │ Binary      │
        ╞═════════════╡
        │ b"ello Wor" │
        ├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
        │ b""         │
        ├╌╌╌╌╌╌╌╌╌╌╌╌╌┤
        │ b"mp"       │
        ╰─────────────╯
        <BLANKLINE>
        (Showing first 3 of 3 rows)
    """
    return Expression._call_builtin_scalar_fn("slice", expr, start, end=end)