
daft.functions.to_datetime

to_datetime(expr: Expression, format: str, timezone: str | None = None) -> Expression

Converts a string to a datetime using the specified format and timezone.

Returns:

Expression: a DateTime expression parsed from the input string using the given format and timezone.

Note

The format must be a valid datetime format string. See: https://docs.rs/chrono/latest/chrono/format/strftime/index.html
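Because the format string follows the chrono conventions linked above rather than Python's, the fractional-seconds specifier can look unfamiliar. The sketch below is a comparison only: it uses Python's standard `datetime.strptime` (not Daft) to parse the same sample value, where the fraction is written as a literal `.` followed by `%f` instead of the `%.3f` used in the Daft examples. The name `PYTHON_FORMAT` is illustrative.

```python
from datetime import datetime

# Python's strptime spells fractional seconds as a literal "." followed by %f.
# The Daft examples on this page use the chrono-style "%.3f" instead.
PYTHON_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

parsed = datetime.strptime("2021-01-01 00:00:00.123", PYTHON_FORMAT)

# %f right-pads the parsed digits to microseconds: ".123" becomes 123000 us.
print(parsed.microsecond)                          # 123000
print(parsed.isoformat(timespec="milliseconds"))   # 2021-01-01T00:00:00.123
```

A format string written for Python's `strptime` should therefore be checked against the chrono specifier table before being passed to `to_datetime`.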

Examples:

>>> import daft
>>> from daft.functions import to_datetime
>>> df = daft.from_pydict({"x": ["2021-01-01 00:00:00.123", "2021-01-02 12:30:00.456", None]})
>>> df = df.with_column("datetime", to_datetime(df["x"], "%Y-%m-%d %H:%M:%S%.3f"))
>>> df.show()
╭─────────────────────────┬─────────────────────────╮
│ x                       ┆ datetime                │
│ ---                     ┆ ---                     │
│ String                  ┆ Timestamp[ms]           │
╞═════════════════════════╪═════════════════════════╡
│ 2021-01-01 00:00:00.123 ┆ 2021-01-01 00:00:00.123 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2021-01-02 12:30:00.456 ┆ 2021-01-02 12:30:00.456 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ None                    ┆ None                    │
╰─────────────────────────┴─────────────────────────╯
(Showing first 3 of 3 rows)

If a timezone is provided, the datetime is parsed in that timezone:

>>> df = daft.from_pydict({"x": ["2021-01-01 00:00:00.123 +0800", "2021-01-02 12:30:00.456 +0800", None]})
>>> df = df.with_column("datetime", to_datetime(df["x"], "%Y-%m-%d %H:%M:%S%.3f %z", timezone="Asia/Shanghai"))
>>> df.show()
╭───────────────────────────────┬──────────────────────────────╮
│ x                             ┆ datetime                     │
│ ---                           ┆ ---                          │
│ String                        ┆ Timestamp[ms; Asia/Shanghai] │
╞═══════════════════════════════╪══════════════════════════════╡
│ 2021-01-01 00:00:00.123 +0800 ┆ 2021-01-01 00:00:00.123 CST  │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2021-01-02 12:30:00.456 +0800 ┆ 2021-01-02 12:30:00.456 CST  │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ None                          ┆ None                         │
╰───────────────────────────────┴──────────────────────────────╯
(Showing first 3 of 3 rows)
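
To make concrete which instant a string with a `+0800` offset denotes, here is a small sketch using only Python's standard library (not Daft). It parses the first sample value with Python's own `%z` specifier and converts it to UTC. It assumes that the `CST` shown in the output above is China Standard Time (UTC+8), which is the abbreviation for Asia/Shanghai.

```python
from datetime import datetime, timezone

raw = "2021-01-01 00:00:00.123 +0800"

# Python's %z reads the "+0800" suffix and produces a timezone-aware datetime.
aware = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S.%f %z")
offset = aware.utcoffset()

# Converting to UTC shows the underlying instant, eight hours earlier.
as_utc = aware.astimezone(timezone.utc)

print(offset)                                      # 8:00:00
print(as_utc.isoformat(timespec="milliseconds"))   # 2020-12-31T16:00:00.123+00:00
```

The wall-clock time in the Daft output matches the input string because the input offset and the requested timezone are both eight hours ahead of UTC.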
Source code in daft/functions/datetime.py
def to_datetime(expr: Expression, format: str, timezone: str | None = None) -> Expression:
    """Converts a string to a datetime using the specified format and timezone.

    Returns:
        Expression: a DateTime expression which is parsed by given format and timezone

    Note:
        The format must be a valid datetime format string. See: https://docs.rs/chrono/latest/chrono/format/strftime/index.html

    Examples:
        >>> import daft
        >>> from daft.functions import to_datetime
        >>> df = daft.from_pydict({"x": ["2021-01-01 00:00:00.123", "2021-01-02 12:30:00.456", None]})
        >>> df = df.with_column("datetime", to_datetime(df["x"], "%Y-%m-%d %H:%M:%S%.3f"))
        >>> df.show()
        ╭─────────────────────────┬─────────────────────────╮
        │ x                       ┆ datetime                │
        │ ---                     ┆ ---                     │
        │ String                  ┆ Timestamp[ms]           │
        ╞═════════════════════════╪═════════════════════════╡
        │ 2021-01-01 00:00:00.123 ┆ 2021-01-01 00:00:00.123 │
        ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
        │ 2021-01-02 12:30:00.456 ┆ 2021-01-02 12:30:00.456 │
        ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
        │ None                    ┆ None                    │
        ╰─────────────────────────┴─────────────────────────╯
        <BLANKLINE>
        (Showing first 3 of 3 rows)

        If a timezone is provided, the datetime will be parsed in that timezone

        >>> df = daft.from_pydict({"x": ["2021-01-01 00:00:00.123 +0800", "2021-01-02 12:30:00.456 +0800", None]})
        >>> df = df.with_column("datetime", to_datetime(df["x"], "%Y-%m-%d %H:%M:%S%.3f %z", timezone="Asia/Shanghai"))
        >>> df.show()
        ╭───────────────────────────────┬──────────────────────────────╮
        │ x                             ┆ datetime                     │
        │ ---                           ┆ ---                          │
        │ String                        ┆ Timestamp[ms; Asia/Shanghai] │
        ╞═══════════════════════════════╪══════════════════════════════╡
        │ 2021-01-01 00:00:00.123 +0800 ┆ 2021-01-01 00:00:00.123 CST  │
        ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
        │ 2021-01-02 12:30:00.456 +0800 ┆ 2021-01-02 12:30:00.456 CST  │
        ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
        │ None                          ┆ None                         │
        ╰───────────────────────────────┴──────────────────────────────╯
        <BLANKLINE>
        (Showing first 3 of 3 rows)

    """
    return Expression._call_builtin_scalar_fn("to_datetime", expr, format=format, timezone=timezone)