daft.functions.llm_generate

llm_generate

llm_generate(text: Expression, model: str = 'facebook/opt-125m', provider: Literal['vllm', 'openai'] = 'vllm', concurrency: int = 1, batch_size: int | None = None, num_cpus: int | None = None, num_gpus: int | None = None, **generation_config: dict[str, Any]) -> Expression

A UDF for running LLM inference over an input column of strings.

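This UDF provides a single interface for text generation across LLM providers. By default, it uses vLLM for local inference. According to the source listing on this page, the function is deprecated and will be removed in v0.8.0; use daft.functions.prompt instead.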

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | String Expression | The input text column to generate from. | required |
| model | str | The model identifier to use for generation. | 'facebook/opt-125m' |
| provider | str | The LLM provider to use for generation. Supported values: "vllm", "openai". | 'vllm' |
| concurrency | int | The number of concurrent instances of the model to run. | 1 |
| batch_size | int | The batch size for the UDF. If None, a provider-specific default is used: 1024 for "vllm" and 128 for "openai". | None |
| num_cpus | int | The number of CPUs to use for the UDF. | None |
| num_gpus | int | The number of GPUs to use for the UDF. | None |
| generation_config | dict | Additional keyword arguments, collected as configuration parameters for text generation (e.g., temperature, max_tokens). | {} |
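The way the arguments are resolved may be easier to follow in isolation. The sketch below mirrors the provider check and batch-size defaults visible in the source listing on this page, and shows how extra keyword arguments end up in `generation_config`. The helper name `resolve_call` and its return shape are hypothetical and not part of Daft.

```python
from typing import Any, Dict, Optional

# Provider defaults as they appear in the source listing on this page.
_DEFAULT_BATCH_SIZES = {"vllm": 1024, "openai": 128}


def resolve_call(
    provider: str = "vllm",
    batch_size: Optional[int] = None,
    **generation_config: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: resolve arguments the way llm_generate appears to."""
    if provider not in _DEFAULT_BATCH_SIZES:
        raise ValueError(f"Unsupported provider: {provider}")
    # An explicit batch_size wins; otherwise fall back to the provider default.
    if batch_size is None:
        batch_size = _DEFAULT_BATCH_SIZES[provider]
    # Any keyword not named in the signature is collected into generation_config.
    return {
        "provider": provider,
        "batch_size": batch_size,
        "generation_config": generation_config,
    }


resolved = resolve_call(provider="openai", temperature=0.2, max_tokens=64)
print(resolved["batch_size"], resolved["generation_config"])
```

Under these assumptions, keywords such as `temperature` or `api_key` are not validated by the wrapper itself; they are passed through to the selected provider.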

Returns:

| Name | Type | Description |
|---|---|---|
| Expression | String Expression | The generated text column |

Examples:

Use vLLM provider:

>>> import daft
>>> from daft import col
>>> from daft.functions import llm_generate, format
>>>
>>> df = daft.from_pydict({"city": ["Paris", "Tokyo", "New York"]})
>>> df = df.with_column(
...     "description",
...     llm_generate(
...         format(
...             "Describe the main attractions and unique features of this city: {}.",
...             col("city"),
...         ),
...         model="facebook/opt-125m",
...     ),
... )
>>> df.collect()

Use OpenAI provider:

>>> df = daft.from_pydict({"city": ["Paris", "Tokyo", "New York"]})
>>> df = df.with_column(
...     "description",
...     llm_generate(
...         format(
...             "Describe the main attractions and unique features of this city: {}.",
...             col("city"),
...         ),
...         model="gpt-4o",
...         api_key="xxx",
...         provider="openai",
...     ),
... )
>>> df.collect()

Note

Make sure the required provider packages are installed (e.g. vllm, transformers, openai).
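One way to check for the packages named in the note before building a query is to probe for them with the standard library. This is a minimal sketch: the helper `missing_modules` and the grouping of modules per provider in `PROVIDER_MODULES` are assumptions made for illustration, not something Daft defines.

```python
import importlib.util
from typing import Iterable, List

# Assumed grouping of the packages named in the note; adjust to your setup.
PROVIDER_MODULES = {
    "vllm": ("vllm", "transformers"),
    "openai": ("openai",),
}


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


for provider, modules in PROVIDER_MODULES.items():
    missing = missing_modules(modules)
    status = "ready" if not missing else "missing: " + ", ".join(missing)
    print(provider, status)
```

The check only confirms that a module can be located; it does not verify versions or GPU support.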

Source code in daft/functions/llm.py
def llm_generate(
    text: Expression,
    model: str = "facebook/opt-125m",
    provider: Literal["vllm", "openai"] = "vllm",
    concurrency: int = 1,
    batch_size: int | None = None,
    num_cpus: int | None = None,
    num_gpus: int | None = None,
    **generation_config: dict[str, Any],
) -> Expression:
    """A UDF for running LLM inference over an input column of strings.

    This UDF provides a flexible interface for text generation using various LLM providers.
    By default, it uses vLLM for efficient local inference.

    Args:
        text (String Expression):
            The input text column to generate from
        model (str, default="facebook/opt-125m"):
            The model identifier to use for generation
        provider (str, default="vllm"):
            The LLM provider to use for generation. Supported values: "vllm", "openai"
        concurrency (int, default=1):
            The number of concurrent instances of the model to run
        batch_size (int, default=None):
            The batch size for the UDF. If None, the batch size will be determined by defaults based on the provider.
        num_cpus (int, default=None):
            The number of CPUs to use for the UDF
        num_gpus (int, default=None):
            The number of GPUs to use for the UDF
        generation_config (dict, default={}):
            Configuration parameters for text generation (e.g., temperature, max_tokens)

    Returns:
        Expression (String Expression): The generated text column

    Examples:
        Use vLLM provider:
        >>> import daft
        >>> from daft import col
        >>> from daft.functions import llm_generate, format
        >>>
        >>> df = daft.from_pydict({"city": ["Paris", "Tokyo", "New York"]})
        >>> df = df.with_column(
        ...     "description",
        ...     llm_generate(
        ...         format(
        ...             "Describe the main attractions and unique features of this city: {}.",
        ...             col("city"),
        ...         ),
        ...         model="facebook/opt-125m",
        ...     ),
        ... )
        >>> df.collect()

        Use OpenAI provider:
        >>> df = daft.from_pydict({"city": ["Paris", "Tokyo", "New York"]})
        >>> df = df.with_column(
        ...     "description",
        ...     llm_generate(
        ...         format(
        ...             "Describe the main attractions and unique features of this city: {}.",
        ...             col("city"),
        ...         ),
        ...         model="gpt-4o",
        ...         api_key="xxx",
        ...         provider="openai",
        ...     ),
        ... )
        >>> df.collect()

    Note:
        Make sure the required provider packages are installed (e.g. vllm, transformers, openai).
    """
    warnings.warn(
        "This method is deprecated and will be removed in v0.8.0. Use daft.functions.prompt instead.",
        DeprecationWarning,
        stacklevel=2,
    )

    cls: Any = None
    if provider == "vllm":
        cls = _vLLMGenerator
        if batch_size is None:
            batch_size = 1024
    elif provider == "openai":
        cls = _OpenAIGenerator
        if batch_size is None:
            batch_size = 128
    else:
        raise ValueError(f"Unsupported provider: {provider}")

    llm_generator = udf(
        return_dtype=DataType.string(),
        batch_size=batch_size,
        num_cpus=num_cpus,
        num_gpus=num_gpus,
        concurrency=concurrency,
    )(cls).with_init_args(
        model=model,
        generation_config=generation_config,
    )

    return llm_generator(text)
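The listing above emits a DeprecationWarning before building the UDF. Python hides this warning category by default outside of `__main__`, so it can go unnoticed in library code. The sketch below uses a stand-in function, `deprecated_call`, which is hypothetical and only reproduces the warning call from the listing, to show how a caller might surface and inspect the message.

```python
import warnings
from typing import List, Tuple


def deprecated_call() -> str:
    """Stand-in for a deprecated function; reuses the message from the listing."""
    warnings.warn(
        "This method is deprecated and will be removed in v0.8.0. "
        "Use daft.functions.prompt instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this function
    )
    return "ok"


def call_and_collect_warnings() -> Tuple[str, List[str]]:
    """Run the call with all warnings enabled and return any deprecation messages."""
    with warnings.catch_warnings(record=True) as caught:
        # Override the default filters so DeprecationWarning is always recorded.
        warnings.simplefilter("always")
        result = deprecated_call()
    messages = [
        str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)
    ]
    return result, messages


result, messages = call_and_collect_warnings()
print(result, messages)
```

In a test suite, the same pattern can be turned into a check that fails while deprecated calls remain, which may help when migrating before the removal version.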