daft.functions.random_int

random_int

random_int(low: int, high: int, seed: int | None = None) -> Expression

Generates a column of random integer values.

Values are generated uniformly and independently in the closed interval [low, high].
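To make the closed-interval semantics concrete, here is a minimal sketch that uses Python's standard `random` module rather than Daft. `random.Random.randint(a, b)` also includes both endpoints, so it mirrors the documented range, but it says nothing about how Daft generates values internally. The helper name `sample_closed_interval` is hypothetical.

```python
import random
from typing import List, Optional


def sample_closed_interval(low: int, high: int, n: int, seed: Optional[int] = None) -> List[int]:
    """Draw n integers uniformly from [low, high], both ends included.

    Illustration only: this is stdlib Python, not Daft's generator.
    """
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(n)]


samples = sample_closed_interval(10, 20, n=1000, seed=7)

# Every value lies inside the closed interval.
in_range = all(10 <= v <= 20 for v in samples)

# With enough draws, both endpoints show up, since neither is excluded.
hits_both_ends = 10 in samples and 20 in samples
```

The practical consequence is that `random_int(low=10, high=20)` can return 20, unlike half-open conventions such as Python's `range(10, 20)`.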

Passing a seed makes generation best-effort stable for the same evaluated row layout, but it is not guaranteed to be reproducible across repartitioning or other layout changes.
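One way to picture why a seed might be stable only for a fixed row layout is the toy model below. It is an assumption made purely for illustration, not a description of Daft's implementation: each partition gets its own stdlib generator derived from the seed and the partition index, so the same partition sizes reproduce the same values, while a different split of the same rows may not. The function `generate` and its seed-mixing rule are hypothetical.

```python
import random
from typing import List


def generate(partition_sizes: List[int], low: int, high: int, seed: int) -> List[int]:
    """Toy model (assumption): one RNG per partition, keyed by (seed, partition index)."""
    values: List[int] = []
    for index, size in enumerate(partition_sizes):
        # Hypothetical mixing rule; chosen only so each partition gets a distinct stream.
        rng = random.Random(seed * 1_000_003 + index)
        values.extend(rng.randint(low, high) for _ in range(size))
    return values


# Same seed and same layout (two partitions of three rows): identical output.
first_run = generate([3, 3], low=10, high=20, seed=7)
second_run = generate([3, 3], low=10, high=20, seed=7)

# Same seed and same six rows, but split 2 + 4: values are still in range,
# yet they are not guaranteed to match the runs above row for row.
repartitioned = generate([2, 4], low=10, high=20, seed=7)
```

Under this model, code that needs exactly reproducible values would have to fix the layout as well as the seed, or materialize the generated column once and reuse it.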

Parameters:

Name  Type        Description                                       Default
low   int         Inclusive lower bound.                            required
high  int         Inclusive upper bound.                            required
seed  int | None  Optional seed for best-effort stable generation.  None

Returns:

Name        Type              Description
Expression  Int64 Expression  An expression that generates random integers.

Examples:

>>> import daft
>>> from daft.functions import random_int
>>> df = daft.from_pydict({"a": [1, 2, 3]}).with_column("r", random_int(low=10, high=20, seed=7))
>>> df.schema()["r"].dtype == daft.DataType.int64()
True
>>> vals = df.to_pydict()["r"]
>>> all(10 <= v <= 20 for v in vals)
True

Source code in daft/functions/misc.py
def random_int(low: int, high: int, seed: int | None = None) -> Expression:
    """Generates a column of random integer values.

    Values are generated uniformly and independently in the closed interval ``[low, high]``.

    Passing a ``seed`` makes generation best-effort stable for the same evaluated row layout,
    but it is not guaranteed to be reproducible across repartitioning or other layout changes.

    Args:
        low: Inclusive lower bound.
        high: Inclusive upper bound.
        seed: Optional seed for best-effort stable generation.

    Returns:
        Expression (Int64 Expression): An expression that generates random integers.

    Examples:
        >>> import daft
        >>> from daft.functions import random_int
        >>> df = daft.from_pydict({"a": [1, 2, 3]}).with_column("r", random_int(low=10, high=20, seed=7))
        >>> df.schema()["r"].dtype == daft.DataType.int64()
        True
        >>> vals = df.to_pydict()["r"]
        >>> all(10 <= v <= 20 for v in vals)
        True
    """
    return Expression._call_builtin_scalar_fn("random_int", low, high, seed=seed)