daft.functions.json_object_keys

json_object_keys

json_object_keys(expr: Expression) -> Expression

Returns the top-level keys of a JSON object as a list of strings.

Returns NULL when the input is NULL, cannot be parsed as JSON, or the parsed JSON is not an object. Returns an empty list for empty objects.
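The NULL-handling rules above can be modelled in plain Python. This is only a sketch of the documented semantics using the standard-library `json` module, not Daft's implementation; the helper name `reference_json_object_keys` is hypothetical, and Python's parser may accept or reject some edge-case inputs differently from Daft's.

```python
import json
from typing import List, Optional


def reference_json_object_keys(value: Optional[str]) -> Optional[List[str]]:
    """Hypothetical pure-Python model of the documented behavior (not Daft code)."""
    if value is None:
        return None  # NULL input -> NULL
    try:
        parsed = json.loads(value)
    except ValueError:
        return None  # not parseable as JSON -> NULL
    if not isinstance(parsed, dict):
        return None  # valid JSON, but not an object (array, string, number, ...) -> NULL
    # The documentation describes sorted key order; an empty object gives [].
    return sorted(parsed)


rows = ['{"a": 1, "b": 2}', "{}", "[1, 2]", None, "not json"]
results = [reference_json_object_keys(r) for r in rows]
print(results)
```

The first four inputs mirror the rows of the example on this page; the fifth, `"not json"`, is an added illustration of the unparseable case.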

Note

Keys are returned in sorted alphabetical order, not source insertion order. This differs from Spark's json_object_keys, which preserves insertion order.
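To make the ordering difference concrete, the following pure-Python illustration contrasts the two orderings for the same input. It calls neither Daft nor Spark: `sorted_order` stands in for the behavior this note describes for Daft, and `insertion_order` for the behavior the note attributes to Spark. The sample string is made up for the example.

```python
import json

# Hypothetical sample whose keys are deliberately not in alphabetical order.
raw = '{"b": 1, "a": 2, "c": 3}'
parsed = json.loads(raw)

# Python dicts keep the order in which keys appear in the source text.
insertion_order = list(parsed)

# Sorting the keys gives the order described in the note above.
sorted_order = sorted(parsed)

print(insertion_order)  # ['b', 'a', 'c']
print(sorted_order)     # ['a', 'b', 'c']
```

Code that relies on positional access into the key list (for example, treating the first key as the "primary" one) may therefore behave differently when ported between the two engines.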

Parameters:

Name  Type        Description                           Default
expr  Expression  A string expression containing JSON.  required

Returns:

Name        Type        Description
Expression  Expression  A List[String] expression with the object's keys.

Examples:

>>> import daft
>>> from daft.functions import json_object_keys
>>>
>>> df = daft.from_pydict({"col": ['{"a": 1, "b": 2}', "{}", "[1, 2]", None]})
>>> df.with_column("keys", json_object_keys(df["col"])).collect()
╭──────────────────┬──────────────╮
│ col              ┆ keys         │
│ ---              ┆ ---          │
│ String           ┆ List[String] │
╞══════════════════╪══════════════╡
│ {"a": 1, "b": 2} ┆ [a, b]       │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ {}               ┆ []           │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ [1, 2]           ┆ None         │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ None             ┆ None         │
╰──────────────────┴──────────────╯
(Showing first 4 of 4 rows)
Source code in daft/functions/str.py
def json_object_keys(expr: Expression) -> Expression:
    """Returns the top-level keys of a JSON object as a list of strings.

    Returns ``NULL`` when the input is ``NULL``, cannot be parsed as JSON,
    or the parsed JSON is not an object. Returns an empty list for empty
    objects.

    Note:
        Keys are returned in **sorted alphabetical order**, not source
        insertion order. This differs from Spark's ``json_object_keys``,
        which preserves insertion order.

    Args:
        expr: A string expression containing JSON.

    Returns:
        Expression: A ``List[String]`` expression with the object's keys.

    Examples:
        >>> import daft
        >>> from daft.functions import json_object_keys
        >>>
        >>> df = daft.from_pydict({"col": ['{"a": 1, "b": 2}', "{}", "[1, 2]", None]})
        >>> df.with_column("keys", json_object_keys(df["col"])).collect()
        ╭──────────────────┬──────────────╮
        │ col              ┆ keys         │
        │ ---              ┆ ---          │
        │ String           ┆ List[String] │
        ╞══════════════════╪══════════════╡
        │ {"a": 1, "b": 2} ┆ [a, b]       │
        ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
        │ {}               ┆ []           │
        ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
        │ [1, 2]           ┆ None         │
        ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
        │ None             ┆ None         │
        ╰──────────────────┴──────────────╯
        <BLANKLINE>
        (Showing first 4 of 4 rows)
    """
    return Expression._call_builtin_scalar_fn("json_object_keys", expr)