daft.functions.run_process
run_process
run_process(args: Expression | list[Expression | Any], *, shell: bool = False, on_error: Literal['raise', 'ignore', 'log'] = 'log', return_dtype: DataTypeLike = string()) -> Expression
Returns an expression that runs an external process (optionally via a shell) and exposes its stdout as a column.
This helper wraps a Python UDF around subprocess.run so it can be used inside DataFrame expressions.
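
The idea of exposing a process's stdout as a column can be sketched in plain Python. This is a minimal illustration using only the standard library, not Daft's implementation: the helper name `stdout_per_row` and the sample commands are hypothetical, and each "row" is simply a list of command-line arguments passed to `subprocess.run`.

```python
import subprocess
import sys


def stdout_per_row(rows):
    """Run one process per row and collect each process's stdout as text.

    Hypothetical helper for illustration only; it is not part of Daft.
    """
    results = []
    for args in rows:
        # capture_output=True collects stdout/stderr; text=True decodes to str.
        completed = subprocess.run(args, capture_output=True, text=True, check=True)
        results.append(completed.stdout)
    return results


# Each row holds the argument list for one process invocation.
rows = [
    [sys.executable, "-c", "print('hello world')"],
    [sys.executable, "-c", "print(6 * 7)"],
]
column = stdout_per_row(rows)
```

Each entry of `column` is the raw stdout of the matching row, including the trailing newline that `print` emits. Whether `run_process` strips that newline is not stated in this page.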
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| args | Expression \| list[Expression \| Any] | The command to execute. If | required |
| shell | bool, default=False | Whether to execute the command via the system shell (equivalent to | False |
| on_error | Literal["raise", "ignore", "log"], default="log" | How to handle an error from the process: raise it, ignore it, or log a warning and return a null. | 'log' |
| return_dtype | DataTypeLike | Desired Daft data type for the result column. Defaults to a UTF-8 string column. | string() |
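
The interaction between `shell` and `on_error` might look like the following standard-library sketch. It is an assumption about the behavior, not Daft's source: `run_with_policy` is a hypothetical name, and treating `"ignore"` as "return a null without logging" is a guess, since this page only describes logging a warning and returning a null.

```python
import logging
import subprocess
import sys

logger = logging.getLogger("run_process_sketch")


def run_with_policy(args, *, shell=False, on_error="log"):
    """Run a command and return its stdout, or None on failure.

    Hypothetical helper; the on_error semantics here are assumed.
    """
    try:
        completed = subprocess.run(
            args, shell=shell, capture_output=True, text=True, check=True
        )
    except (subprocess.CalledProcessError, OSError) as exc:
        if on_error == "raise":
            raise
        if on_error == "log":
            logger.warning("process failed: %s", exc)
        # Assumed: both "log" and "ignore" yield a null for this row.
        return None
    return completed.stdout


# shell=False: args is a list, one element per command-line argument.
ok = run_with_policy([sys.executable, "-c", "print('ok')"])

# A non-zero exit status becomes None instead of an exception.
ignored = run_with_policy(
    [sys.executable, "-c", "raise SystemExit(3)"], on_error="ignore"
)

# shell=True: args is a single command string interpreted by the system shell.
shelled = run_with_policy("echo hello", shell=True)
```

With `shell=False` no shell parsing occurs, so arguments containing spaces or metacharacters are passed through literally. With `shell=True` the string is interpreted by the shell, which is convenient for pipelines but unsafe with untrusted input.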
Returns:
| Name | Type | Description |
|---|---|---|
| Expression | Expression | An expression representing the stdout of the process converted to |
Examples:
'hello world'
[{'word_count': 12}]
Source code in daft/functions/process.py