Get an index from a list expression or a field from a struct expression.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| expr | List or Struct Expression | Expression to get the value from. | required |
| key | int \| str \| Expression | Integer index (or an expression of integer indices) for a list, or a string field name for a struct. A list index can be negative to index from the end of the list. | required |
| default | Any | Default value returned if the index is out of bounds. Only supported for list access. | None |
Returns:
| Type | Description |
| --- | --- |
| Expression | An expression with the inner type of the input expression. |
Note
expr.get(x) can also be written as expr[x]
Note
expr.get("*") is equivalent to expr.unnest()
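The parameter rules above can be modeled for a single row value in plain Python. This is only a sketch of the documented behavior, not Daft's implementation: `get_model` is a hypothetical helper, and returning `None` for an out-of-bounds index when no default is given is an assumption based on the documented default of `None`.

```python
from typing import Any


def get_model(value: Any, key: Any, default: Any = None) -> Any:
    """Plain-Python model of `get` on one row value (hypothetical helper, not Daft code)."""
    if isinstance(key, int):
        # List access: negative keys count from the end of the list.
        if -len(value) <= key < len(value):
            return value[key]
        # Out of bounds: fall back to the default (assumed None unless one is given).
        return default
    if isinstance(key, str):
        # Struct access: a default is rejected, mirroring the documented restriction.
        if default is not None:
            raise ValueError("default is only supported for list access")
        return value[key]
    raise TypeError(f"unsupported key type: {type(key)}")


print(get_model([1, 2, 3], -1))               # 3
print(get_model([3], 2, default=-1))          # -1
print(get_model({"name": "Alice"}, "name"))   # Alice
```

The real function builds an expression that is evaluated over every row of a column; the model only illustrates what each row would produce.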
Examples:
Getting elements from a list by index:
>>> import daft
>>> df = daft.from_pydict({"lists": [[1, 2, 3], [4, 5], [6]]})
>>> df = df.select(df["lists"].get(0).alias("first"), df["lists"].get(-1).alias("last"))
>>> df.show()
╭───────┬───────╮
│ first ┆ last  │
│ ---   ┆ ---   │
│ Int64 ┆ Int64 │
╞═══════╪═══════╡
│ 1     ┆ 3     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 4     ┆ 5     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 6     ┆ 6     │
╰───────┴───────╯
(Showing first 3 of 3 rows)
Getting elements from a list with default value:
>>> df = daft.from_pydict({"lists": [[1, 2], [3], []]})
>>> df = df.select(df["lists"].get(2, default=-1))
>>> df.show()
╭───────╮
│ lists │
│ ---   │
│ Int64 │
╞═══════╡
│ -1    │
├╌╌╌╌╌╌╌┤
│ -1    │
├╌╌╌╌╌╌╌┤
│ -1    │
╰───────╯
(Showing first 3 of 3 rows)
Getting fields from a struct:
>>> df = daft.from_pydict({"structs": [{"name": "Alice", "age": 25}, {"name": "Bob", "age": 30}]})
>>> df = df.select(df["structs"].get("name"), df["structs"].get("age"))
>>> df.show()
╭────────┬───────╮
│ name   ┆ age   │
│ ---    ┆ ---   │
│ String ┆ Int64 │
╞════════╪═══════╡
│ Alice  ┆ 25    │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ Bob    ┆ 30    │
╰────────┴───────╯
(Showing first 2 of 2 rows)
Using variable indices:
>>> df = daft.from_pydict({"lists": [[1, 2, 3], [4, 5, 6]], "indices": [0, 2]})
>>> df = df.select(df["lists"].get(df["indices"]))
>>> df.show()
╭───────╮
│ lists │
│ ---   │
│ Int64 │
╞═══════╡
│ 1     │
├╌╌╌╌╌╌╌┤
│ 6     │
╰───────╯
(Showing first 2 of 2 rows)
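When the key is itself a column, each list is paired with the index on the same row. The following plain-Python sketch models that pairing; `get_per_row` is a hypothetical helper used for illustration, not part of Daft, and the out-of-bounds fallback to `default` is an assumption carried over from the parameter description.

```python
from typing import Any, List, Sequence


def get_per_row(
    lists: Sequence[Sequence[Any]], indices: Sequence[int], default: Any = None
) -> List[Any]:
    """Pair each list with the index on the same row (hypothetical helper, not Daft code)."""
    out = []
    for values, i in zip(lists, indices):
        # Negative indices count from the end; anything else out of range uses the default.
        in_bounds = -len(values) <= i < len(values)
        out.append(values[i] if in_bounds else default)
    return out


print(get_per_row([[1, 2, 3], [4, 5, 6]], [0, 2]))  # [1, 6]
```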
Unnesting all fields from a struct (equivalent to .unnest()):
>>> df = daft.from_pydict({"structs": [{"x": 1, "y": 2}, {"x": 3, "y": 4}]})
>>> df = df.select(df["structs"].get("*"))
>>> df.show()
╭───────┬───────╮
│ x     ┆ y     │
│ ---   ┆ ---   │
│ Int64 ┆ Int64 │
╞═══════╪═══════╡
│ 1     ┆ 2     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 3     ┆ 4     │
╰───────┴───────╯
(Showing first 2 of 2 rows)
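Conceptually, unnesting turns one column of structs into one column per field. A minimal plain-Python sketch of that reshaping follows; `unnest_model` is a hypothetical helper, and it assumes every struct in the column has the same fields, as a typed struct column would.

```python
from typing import Any, Dict, List, Sequence


def unnest_model(structs: Sequence[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Turn a column of structs into one column per field (hypothetical helper, not Daft code)."""
    columns: Dict[str, List[Any]] = {}
    for row in structs:
        for field, value in row.items():
            # Each field name becomes its own output column, in first-seen order.
            columns.setdefault(field, []).append(value)
    return columns


print(unnest_model([{"x": 1, "y": 2}, {"x": 3, "y": 4}]))  # {'x': [1, 3], 'y': [2, 4]}
```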
Source code in daft/functions/misc.py
def get(expr: Expression, key: int | str | Expression, default: Any = None) -> Expression:
    """Get an index from a list expression or a field from a struct expression.

    Args:
        expr (List or Struct Expression): to get value from
        key: integer index for list or string field for struct. List index can be negative to index from the end of the list.
        default: default value if out of bounds. Only supported for list get

    Returns:
        An expression with the inner type of the input expression.

    Note:
        `expr.get(x)` can also be written as `expr[x]`

    Note:
        `expr.get("*")` is equivalent to `expr.unnest()`

    Examples:
        Getting elements from a list by index:

        >>> import daft
        >>> df = daft.from_pydict({"lists": [[1, 2, 3], [4, 5], [6]]})
        >>> df = df.select(df["lists"].get(0).alias("first"), df["lists"].get(-1).alias("last"))
        >>> df.show()
        ╭───────┬───────╮
        │ first ┆ last  │
        │ ---   ┆ ---   │
        │ Int64 ┆ Int64 │
        ╞═══════╪═══════╡
        │ 1     ┆ 3     │
        ├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
        │ 4     ┆ 5     │
        ├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
        │ 6     ┆ 6     │
        ╰───────┴───────╯
        <BLANKLINE>
        (Showing first 3 of 3 rows)

        Getting elements from a list with default value:

        >>> df = daft.from_pydict({"lists": [[1, 2], [3], []]})
        >>> df = df.select(df["lists"].get(2, default=-1))
        >>> df.show()
        ╭───────╮
        │ lists │
        │ ---   │
        │ Int64 │
        ╞═══════╡
        │ -1    │
        ├╌╌╌╌╌╌╌┤
        │ -1    │
        ├╌╌╌╌╌╌╌┤
        │ -1    │
        ╰───────╯
        <BLANKLINE>
        (Showing first 3 of 3 rows)

        Getting fields from a struct:

        >>> df = daft.from_pydict({"structs": [{"name": "Alice", "age": 25}, {"name": "Bob", "age": 30}]})
        >>> df = df.select(df["structs"].get("name"), df["structs"].get("age"))
        >>> df.show()
        ╭────────┬───────╮
        │ name   ┆ age   │
        │ ---    ┆ ---   │
        │ String ┆ Int64 │
        ╞════════╪═══════╡
        │ Alice  ┆ 25    │
        ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
        │ Bob    ┆ 30    │
        ╰────────┴───────╯
        <BLANKLINE>
        (Showing first 2 of 2 rows)

        Using variable indices:

        >>> df = daft.from_pydict({"lists": [[1, 2, 3], [4, 5, 6]], "indices": [0, 2]})
        >>> df = df.select(df["lists"].get(df["indices"]))
        >>> df.show()
        ╭───────╮
        │ lists │
        │ ---   │
        │ Int64 │
        ╞═══════╡
        │ 1     │
        ├╌╌╌╌╌╌╌┤
        │ 6     │
        ╰───────╯
        <BLANKLINE>
        (Showing first 2 of 2 rows)

        Unnesting all fields from a struct (equivalent to .unnest()):

        >>> df = daft.from_pydict({"structs": [{"x": 1, "y": 2}, {"x": 3, "y": 4}]})
        >>> df = df.select(df["structs"].get("*"))
        >>> df.show()
        ╭───────┬───────╮
        │ x     ┆ y     │
        │ ---   ┆ ---   │
        │ Int64 ┆ Int64 │
        ╞═══════╪═══════╡
        │ 1     ┆ 2     │
        ├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
        │ 3     ┆ 4     │
        ╰───────┴───────╯
        <BLANKLINE>
        (Showing first 2 of 2 rows)
    """
    if isinstance(key, (int, Expression)):
        return Expression._call_builtin_scalar_fn("list_get", expr, key, default)
    elif isinstance(key, str):
        if default is not None:
            raise ValueError("`daft.functions.get` does not support default values for getting a struct field")
        return Expression._from_pyexpr(expr._expr.struct_get(key))
    else:
        raise TypeError(
            f"Argument {key} of type {type(key)} is not supported in `daft.functions.get`. Only int and string types are supported."
        )