Counts the occurrences of each distinct value in the list.
Parameters:
| Name | Type | Description | Default |
| list_expr | List Expression | List expression whose distinct values are counted. | required |
Returns:
| Name | Type | Description |
| Expression | Map Expression | A Map<X, UInt64> expression whose keys are the distinct elements of the original list (of type X) and whose values are UInt64 counts of how many times each element appears in the list. |
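The per-row behaviour described above can be modelled in plain Python. This is only an analogy for the semantics, not how Daft implements the function: `value_counts_py` is a hypothetical helper name, and the key order shown here comes from Python's `Counter`, so it should not be taken as a guarantee about the order of keys in Daft's Map output.

```python
from collections import Counter
from typing import Hashable, Sequence


def value_counts_py(values: Sequence[Hashable]) -> dict[Hashable, int]:
    """Hypothetical helper: count each distinct element of one list."""
    return dict(Counter(values))


# Each row of a list column is counted independently of the other rows.
rows = [[1, 2, 1], [3, 3, 3]]
per_row_counts = [value_counts_py(row) for row in rows]
print(per_row_counts)  # [{1: 2, 2: 1}, {3: 3}]
```

The point of the sketch is that counting happens within each list, so every input row yields exactly one map.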
Note
This function does not work for nested types. For example, it will not produce a map with lists as keys.
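A loose plain-Python analogy for this restriction: Python lists cannot be dictionary keys either. The sketch below shows that failure and one possible workaround, which is to encode each nested element as a flat string before counting. The JSON encoding is an assumption chosen for illustration; it is not a Daft feature, and Daft's own reason for the restriction may differ.

```python
import json
from collections import Counter

nested = [[1, 2], [3], [1, 2]]

# Lists are unhashable in Python, so they cannot be used as dict keys directly.
try:
    Counter(nested)
except TypeError as exc:
    error_message = str(exc)

# Illustrative workaround (an assumption, not part of Daft): encode each
# nested element as a string, then count the encoded values.
encoded = [json.dumps(item) for item in nested]
counts = dict(Counter(encoded))
print(counts)  # {'[1, 2]': 2, '[3]': 1}
```

The keys of the resulting map are strings rather than lists, so they would need to be decoded again if the original nested values are required.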
Examples:
>>> import daft
>>> from daft.functions import value_counts
>>> df = daft.from_pydict({"letters": [["a", "b", "a"], ["b", "c", "b", "c"]]})
>>> df.with_column("value_counts", value_counts(df["letters"])).collect()
╭──────────────┬─────────────────────╮
│ letters      ┆ value_counts        │
│ ---          ┆ ---                 │
│ List[String] ┆ Map[String: UInt64] │
╞══════════════╪═════════════════════╡
│ [a, b, a]    ┆ {"a": 2, "b": 1}    │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [b, c, b, c] ┆ {"b": 2, "c": 2}    │
╰──────────────┴─────────────────────╯
(Showing first 2 of 2 rows)
Source code in daft/functions/list.py
def value_counts(list_expr: Expression) -> Expression:
    """Counts the occurrences of each distinct value in the list.

    Args:
        list_expr (List Expression): expression to count the occurrences of each distinct value in.

    Returns:
        Expression (Map Expression):
            A Map<X, UInt64> expression where the keys are distinct elements from the
            original list of type X, and the values are UInt64 counts representing
            the number of times each element appears in the list.

    Note:
        This function does not work for nested types. For example, it will not produce a map
        with lists as keys.

    Examples:
        >>> import daft
        >>> from daft.functions import value_counts
        >>> df = daft.from_pydict({"letters": [["a", "b", "a"], ["b", "c", "b", "c"]]})
        >>> df.with_column("value_counts", value_counts(df["letters"])).collect()
        ╭──────────────┬─────────────────────╮
        │ letters      ┆ value_counts        │
        │ ---          ┆ ---                 │
        │ List[String] ┆ Map[String: UInt64] │
        ╞══════════════╪═════════════════════╡
        │ [a, b, a]    ┆ {"a": 2, "b": 1}    │
        ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
        │ [b, c, b, c] ┆ {"b": 2, "c": 2}    │
        ╰──────────────┴─────────────────────╯
        <BLANKLINE>
        (Showing first 2 of 2 rows)
    """
    return Expression._call_builtin_scalar_fn("list_value_counts", list_expr)