Returns the number of months between two dates or timestamps.
Mirrors Spark's months_between. If both inputs share the same day-of-month, or both are the last day of their respective months, the result is a whole number of months. Otherwise the result is months_diff + (day1 - day2 + (time1 - time2)/86400) / 31, rounded to eight decimal places, where subscript 1 refers to end and subscript 2 to start.
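The rule above can be traced by hand with a small pure-Python sketch. This is only an illustration of the stated formula, not Daft's implementation: the helper name `months_between_sketch` is hypothetical, and it assumes that months_diff is computed from the year and month fields alone and that time1 and time2 are seconds since midnight.

```python
import calendar
from datetime import datetime


def months_between_sketch(end: datetime, start: datetime) -> float:
    """Hypothetical reference for the documented rule (not Daft's code)."""
    # Assumption: months_diff uses only the year and month fields.
    months_diff = (end.year - start.year) * 12 + (end.month - start.month)

    def is_last_day(d: datetime) -> bool:
        # monthrange returns (weekday of first day, number of days in month).
        return d.day == calendar.monthrange(d.year, d.month)[1]

    # Same day-of-month, or both on a month end: whole number of months.
    if end.day == start.day or (is_last_day(end) and is_last_day(start)):
        return float(months_diff)

    def seconds_since_midnight(d: datetime) -> float:
        return d.hour * 3600 + d.minute * 60 + d.second + d.microsecond / 1e6

    # Fractional part: day difference plus time difference, over a 31-day month.
    day_part = (end.day - start.day) + (
        seconds_since_midnight(end) - seconds_since_midnight(start)
    ) / 86400
    return round(months_diff + day_part / 31, 8)


# Dates used in the Examples section: 4 months minus 2/31 of a month.
print(months_between_sketch(datetime(1997, 2, 28), datetime(1996, 10, 30)))  # 3.93548387
# Both inputs fall on the last day of their month, so the result is whole.
print(months_between_sketch(datetime(2024, 2, 29), datetime(2024, 1, 31)))  # 1.0
```

Under these assumptions, adding a time of 10:30:00 to the end value in the first call gives 3.94959677, because the extra 37800 seconds contribute 0.4375 of a day to the numerator.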
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| end | Expression | The end Date or Timestamp expression. | required |
| start | Expression | The start Date or Timestamp expression. | required |
Returns:
| Name | Type | Description |
| --- | --- | --- |
| Expression | Expression | A Float64 expression with the number of months (end - start). |
Examples:
>>> import daft
>>> from daft.functions import months_between
>>> df = daft.from_pydict({"a": ["1997-02-28"], "b": ["1996-10-30"]})
>>> df = df.with_column("a", df["a"].cast(daft.DataType.date()))
>>> df = df.with_column("b", df["b"].cast(daft.DataType.date()))
>>> df = df.with_column("diff", months_between(df["a"], df["b"]))
>>> df.schema()["diff"].dtype == daft.DataType.float64()
True
Source code in daft/functions/datetime.py
def months_between(end: Expression, start: Expression) -> Expression:
    """Returns the number of months between two dates or timestamps.

    Mirrors Spark's ``months_between``: returns an integer when both inputs share the
    same day-of-month or are both the last day of their respective months; otherwise
    returns ``months_diff + (day1 - day2 + (time1 - time2)/86400) / 31`` rounded to
    eight decimal places.

    Args:
        end: The end Date or Timestamp expression.
        start: The start Date or Timestamp expression.

    Returns:
        Expression: a Float64 expression with the number of months (end - start).

    Examples:
        >>> import daft
        >>> from daft.functions import months_between
        >>> df = daft.from_pydict({"a": ["1997-02-28"], "b": ["1996-10-30"]})
        >>> df = df.with_column("a", df["a"].cast(daft.DataType.date()))
        >>> df = df.with_column("b", df["b"].cast(daft.DataType.date()))
        >>> df = df.with_column("diff", months_between(df["a"], df["b"]))
        >>> df.schema()["diff"].dtype == daft.DataType.float64()
        True
    """
    return Expression._call_builtin_scalar_fn("months_between", end, start)