
Sessions#

Sessions let you attach objects such as catalogs, providers, models, functions, and tables so that they can be referenced in DataFrame operations. A session also lets you create temporary objects that are accessible through both the Python and SQL APIs. Sessions hold configuration state, such as current_catalog and current_namespace, which is used in name resolution and can simplify your workflows. Learn more about Sessions in the Daft User Guide.

session #

session() -> Session

Creates a default daft session to be used with a context manager.

Examples:

>>> import daft
>>>
>>> with daft.session() as sess:
...     sess.sql("SELECT 1")
Source code in daft/session.py
def session() -> Session:
    """Creates a default daft session to be used with a context manager.

    Examples:
        >>> import daft
        >>>
        >>> with daft.session() as sess:
        >>>     sess.sql("SELECT 1")  # doctest: +SKIP
    """
    return Session()

current_session #

current_session() -> Session

Returns the currently active session.

Source code in daft/session.py
def current_session() -> Session:
    """Returns the active session's current session."""
    return _session()

Session #

Session()

Session holds a connection's state and orchestrates execution of DataFrame and SQL queries against catalogs.

Examples:

>>> import daft
>>> from daft.session import Session
>>>
>>> sess = Session()
>>>
>>> # Create a temporary table from a DataFrame
>>> sess.create_temp_table("T", daft.from_pydict({"x": [1, 2, 3]}))
>>>
>>> # Read the table as a DataFrame
>>> df = sess.read_table("T")
>>>
>>> # Execute an SQL query
>>> sess.sql("SELECT * FROM T").show()
>>>
>>> # You can also retrieve the current session without creating a new one:
>>> from daft.session import current_session
>>> sess = current_session()

Methods:

Name Description
attach

Attaches a known attachable object like a Catalog, Provider, Table, UDF, or DataFrame.

attach_catalog

Attaches an external catalog to this session.

attach_function

Attaches a Python function as a UDF in the current session.

attach_provider

Attaches a provider instance to this session.

attach_table

Attaches an external table instance to this session.

attach_view

Attaches a DataFrame as a non-materialized temporary view for SQL resolution.

create_namespace

Creates a namespace in the current catalog.

create_namespace_if_not_exists

Creates a namespace in the current catalog if it does not already exist.

create_table

Creates a table in the current catalog.

create_table_if_not_exists

Creates a table in the current catalog if it does not already exist.

create_temp_table

Creates a temp table scoped to this session's lifetime.

create_temp_view

Creates or replaces a non-materialized temporary view from a DataFrame.

current_catalog

Get the session's current catalog or None.

current_model

Get the session's current model or None.

current_namespace

Get the session's current namespace or None.

current_provider

Get the session's current provider or None.

detach_catalog

Detaches the catalog from this session or raises if the catalog does not exist.

detach_function

Detaches a UDF from the current session.

detach_provider

Detaches the provider from this session or raises if the provider does not exist.

detach_table

Detaches the table from this session or raises if the table does not exist.

drop_namespace

Drop the given namespace in the current catalog.

drop_table

Drop the given table in the current catalog.

get_aggregate_function

Returns an aggregate function expression from the current session.

get_catalog

Returns the catalog or raises an exception if it does not exist.

get_function

Returns the function from the current session or raises an exception if it does not exist.

get_provider

Returns the provider or raises an exception if it does not exist.

get_table

Returns the table or raises an exception if it does not exist.

has_catalog

Returns true if a catalog with the given identifier exists.

has_namespace

Returns true if a namespace with the given identifier exists.

has_provider

Returns true if a provider with the given identifier exists.

has_table

Returns true if a table with the given identifier exists.

list_catalogs

Returns a list of available catalogs matching the pattern.

list_namespaces

Returns a list of matching namespaces in the current catalog.

list_tables

Returns a list of available tables.

load_extension

Load a native extension by module symbol or an explicit file path.

read_table

Returns the table as a DataFrame or raises an exception if it does not exist.

set_catalog

Sets the given catalog as current_catalog, or raises an error if it does not exist.

set_model

Set the default model type.

set_namespace

Set the given namespace as current_namespace for table resolution.

set_provider

Set the default model provider with associated options.

sql

Executes the SQL statement using this session.

use

Use sets the current catalog and namespace.

write_table

Writes the DataFrame to the table specified by the identifier.

Source code in daft/session.py
def __init__(self) -> None:
    self._session = PySession.empty()
    self._token: Token[Session | None] | None = None

attach #

attach(object: Catalog | Provider | Table | UDF | DataFrame, alias: str | None = None) -> None

Attaches a known attachable object like a Catalog, Provider, Table, UDF, or DataFrame.

Parameters:

Name Type Description Default
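object Catalog | Provider | Table | UDF | DataFrame

object which is attachable to a session

required
alias str | None

optional alias for name resolution; required when attaching a DataFrame

None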

Returns:

Type Description
None

None

Source code in daft/session.py
def attach(self, object: Catalog | Provider | Table | UDF | DataFrame, alias: str | None = None) -> None:
    """Attaches a known attachable object like a Catalog, Table, UDF, or DataFrame.

    Args:
        object (Catalog|Table|UDF|DataFrame): object which is attachable to a session

    Returns:
        None
    """
    if isinstance(object, Catalog):
        self.attach_catalog(object, alias)
    elif isinstance(object, Provider):
        self.attach_provider(object, alias)
    elif isinstance(object, Table):
        self.attach_table(object, alias)
    elif isinstance(object, UDF):
        self.attach_function(object, alias)
    elif isinstance(object, DataFrame):
        if alias is None:
            raise ValueError("Cannot attach a DataFrame without an alias. Please provide `alias`.")
        self.attach_view(object, alias)
    else:
        raise ValueError(f"Cannot attach object with type {type(object)}")
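
The dispatch in `attach` can be illustrated with a small stdlib-only model. This is a sketch of the `isinstance` chain above, not Daft code: `ToyCatalog`, `ToyDataFrame`, and `ToySession` are hypothetical stand-ins, and only two of the five supported types are modelled.

```python
from __future__ import annotations


class ToyCatalog:
    """Hypothetical catalog that carries its own name."""

    def __init__(self, name: str) -> None:
        self.name = name


class ToyDataFrame:
    """Hypothetical DataFrame; it has no name, so an alias is mandatory."""


class ToySession:
    def __init__(self) -> None:
        self.catalogs: dict[str, ToyCatalog] = {}
        self.views: dict[str, ToyDataFrame] = {}

    def attach(self, obj: object, alias: str | None = None) -> None:
        # Dispatch on the runtime type, mirroring the isinstance chain.
        if isinstance(obj, ToyCatalog):
            self.catalogs[alias if alias else obj.name] = obj
        elif isinstance(obj, ToyDataFrame):
            if alias is None:
                raise ValueError("Cannot attach a DataFrame without an alias.")
            self.views[alias] = obj
        else:
            raise ValueError(f"Cannot attach object with type {type(obj)}")


sess = ToySession()
sess.attach(ToyCatalog("lake"))         # registered under its own name
sess.attach(ToyDataFrame(), alias="v")  # alias is required here

try:
    sess.attach(ToyDataFrame())
except ValueError as err:
    missing_alias_error = str(err)

print(sorted(sess.catalogs), sorted(sess.views), missing_alias_error)
```

Unsupported types fall through to the final `else` branch and raise `ValueError`, as in the source above.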

attach_catalog #

attach_catalog(catalog: Catalog | object, alias: str | None = None) -> Catalog

Attaches an external catalog to this session.

Parameters:

Name Type Description Default
catalog Catalog | object

catalog instance or supported catalog object

required
alias str | None

optional alias for name resolution

None

Returns:

Name Type Description
Catalog Catalog

new daft catalog instance

Source code in daft/session.py
def attach_catalog(self, catalog: Catalog | object, alias: str | None = None) -> Catalog:
    """Attaches an external catalog to this session.

    Args:
        catalog (object): catalog instance or supported catalog object
        alias (str|None): optional alias for name resolution

    Returns:
        Catalog: new daft catalog instance
    """
    c = catalog if isinstance(catalog, Catalog) else Catalog._from_obj(catalog)
    a = alias if alias else c.name
    self._session.attach_catalog(c, a)
    return c
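
The alias fallback `alias if alias else c.name` tests truthiness rather than `is None`, so an empty string also falls back to the object's own name. A minimal sketch of just that expression follows; `effective_alias` is a hypothetical helper, not part of Daft.

```python
from __future__ import annotations


def effective_alias(alias: str | None, name: str) -> str:
    # Same expression as in attach_catalog: any falsy alias falls back to the name.
    return alias if alias else name


print(effective_alias(None, "lake"))    # lake
print(effective_alias("", "lake"))      # lake (empty string is falsy)
print(effective_alias("prod", "lake"))  # prod
```

The source shown on this page uses the same expression in `attach_provider` and `attach_table`, whereas `attach_function` passes `alias` through unchanged.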

attach_function #

attach_function(function: UDF, alias: str | None = None) -> None

Attaches a Python function as a UDF in the current session.

Source code in daft/session.py
def attach_function(self, function: UDF, alias: str | None = None) -> None:
    """Attaches a Python function as a UDF in the current session."""
    self._session.attach_function(function, alias)

attach_provider #

attach_provider(provider: Provider, alias: str | None = None) -> Provider

Attaches a provider instance to this session.

Parameters:

Name Type Description Default
provider Provider

provider instance

required
alias str | None

optional alias for name resolution

None

Returns:

Name Type Description
Provider Provider

the provider instance

Source code in daft/session.py
def attach_provider(self, provider: Provider, alias: str | None = None) -> Provider:
    """Attaches a provider instance to this session.

    Args:
        provider (Provider): provider instance
        alias (str | None): optional alias for name resolution

    Returns:
        Provider: the provider instance
    """
    p = provider  # TODO: support attaching provider-like objects e.g. OpenAI client.
    a = alias if alias else p.name
    self._session.attach_provider(p, a)
    return p

attach_table #

attach_table(table: Table | object, alias: str | None = None) -> Table

Attaches an external table instance to this session.

Parameters:

Name Type Description Default
table Table | object

table instance or supported table object

required
alias str | None

optional alias for name resolution

None

Returns:

Name Type Description
Table Table

new daft table instance

Source code in daft/session.py
def attach_table(self, table: Table | object, alias: str | None = None) -> Table:
    """Attaches an external table instance to this session.

    Args:
        table (Table | object): table instance or supported table object
        alias (str | None): optional alias for name resolution

    Returns:
        Table: new daft table instance
    """
    t = table if isinstance(table, Table) else Table._from_obj(table)
    a = alias if alias else t.name
    self._session.attach_table(t, a)
    return t

attach_view #

attach_view(view: DataFrame, alias: str) -> Table

Attaches a DataFrame as a non-materialized temporary view for SQL resolution.

Unlike attach_table(Table.from_df(...)), this does not materialize data into a MemoryTable.

Source code in daft/session.py
def attach_view(self, view: DataFrame, alias: str) -> Table:
    """Attaches a DataFrame as a non-materialized temporary view for SQL resolution.

    Unlike ``attach_table(Table.from_df(...))``, this does not materialize data into a MemoryTable.
    """
    py_source = self._to_py_table_source(view)
    return self._create_temp_table_with_source(alias, py_source, replace=False)

create_namespace #

create_namespace(identifier: Identifier | str) -> None

Creates a namespace in the current catalog.

Source code in daft/session.py
def create_namespace(self, identifier: Identifier | str) -> None:
    """Creates a namespace in the current catalog."""
    if not (catalog := self.current_catalog()):
        raise ValueError("Cannot create a namespace without a current catalog")
    return catalog.create_namespace(identifier)

create_namespace_if_not_exists #

create_namespace_if_not_exists(identifier: Identifier | str) -> None

Creates a namespace in the current catalog if it does not already exist.

Source code in daft/session.py
def create_namespace_if_not_exists(self, identifier: Identifier | str) -> None:
    """Creates a namespace in the current catalog if it does not already exist."""
    if not (catalog := self.current_catalog()):
        raise ValueError("Cannot create a namespace without a current catalog")
    return catalog.create_namespace_if_not_exists(identifier)
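
Namespace operations delegate to the current catalog and raise `ValueError` when none is set. The guard can be sketched with a toy model; `ToyCatalog` and `ToySession` are hypothetical stand-ins, and the set-based namespace store is an assumption made only for illustration.

```python
from __future__ import annotations


class ToyCatalog:
    """Hypothetical catalog that stores namespace names in a set."""

    def __init__(self) -> None:
        self.namespaces: set[str] = set()

    def create_namespace_if_not_exists(self, name: str) -> None:
        self.namespaces.add(name)  # adding an existing name is a no-op


class ToySession:
    def __init__(self, catalog: ToyCatalog | None = None) -> None:
        self._catalog = catalog

    def current_catalog(self) -> ToyCatalog | None:
        return self._catalog

    def create_namespace_if_not_exists(self, name: str) -> None:
        # Bind and test in one step; bail out early when no catalog is set.
        if not (catalog := self.current_catalog()):
            raise ValueError("Cannot create a namespace without a current catalog")
        catalog.create_namespace_if_not_exists(name)


try:
    ToySession().create_namespace_if_not_exists("sales")
    raised = False
except ValueError:
    raised = True

catalog = ToyCatalog()
sess_with = ToySession(catalog)
sess_with.create_namespace_if_not_exists("sales")
sess_with.create_namespace_if_not_exists("sales")  # second call changes nothing

print(raised, sorted(catalog.namespaces))
```

With the real API, this implies that a current catalog has to be set (for example with `set_catalog`) before any namespace method is called.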

create_table #

create_table(identifier: Identifier | str, source: Schema | DataFrame, **properties: Any) -> Table

Creates a table in the current catalog.

If no namespace is specified, the current namespace is used.

Returns:

Name Type Description
Table Table

the newly created table instance.

Source code in daft/session.py
def create_table(self, identifier: Identifier | str, source: Schema | DataFrame, **properties: Any) -> Table:
    """Creates a table in the current catalog.

    If no namespace is specified, the current namespace is used.

    Returns:
        Table: the newly created table instance.
    """
    if isinstance(identifier, str):
        identifier = Identifier.from_str(identifier)

    if resolved := self._resolve_catalog(identifier):
        cat, identifier = resolved
        return cat.create_table(identifier, source, properties)

    if not (catalog := self.current_catalog()):
        # TODO relax this constraint by joining with the catalog name
        raise ValueError("Cannot create a table without a current catalog")

    if len(identifier) == 1:
        if ns := self.current_namespace():
            identifier = ns + identifier

    return catalog.create_table(identifier, source, properties)
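
The resolution order in `create_table` can be traced with a stdlib-only sketch. It makes two assumptions the excerpt does not confirm: that `_resolve_catalog` matches a leading identifier part against attached catalog names, and that identifiers split naively on `.` (no quoting). `resolve_table_identifier` and its parameters are hypothetical.

```python
from __future__ import annotations


def resolve_table_identifier(
    identifier: str,
    attached_catalogs: set[str],
    current_catalog: str | None,
    current_namespace: tuple[str, ...] | None,
) -> tuple[str, tuple[str, ...]]:
    """Return (catalog name, identifier parts within that catalog)."""
    parts = tuple(identifier.split("."))  # naive split; quoting is ignored

    # Assumption: a leading part that names an attached catalog selects it.
    if len(parts) > 1 and parts[0] in attached_catalogs:
        return parts[0], parts[1:]

    if current_catalog is None:
        raise ValueError("Cannot create a table without a current catalog")

    # A bare table name is qualified with the current namespace, if any.
    if len(parts) == 1 and current_namespace:
        parts = current_namespace + parts

    return current_catalog, parts


catalogs = {"lake"}
print(resolve_table_identifier("lake.sales.orders", catalogs, "main", ("default",)))
print(resolve_table_identifier("orders", catalogs, "main", ("default",)))
print(resolve_table_identifier("sales.orders", catalogs, "main", ("default",)))
```

Only single-part identifiers pick up the current namespace; an identifier that already carries a namespace is passed to the current catalog as given.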

create_table_if_not_exists #

create_table_if_not_exists(identifier: Identifier | str, source: Schema | DataFrame, **properties: Any) -> Table

Creates a table in the current catalog if it does not already exist.

If no namespace is specified, the current namespace is used.

Returns:

Name Type Description
Table Table

the newly created instance, or the existing table instance.

Source code in daft/session.py
def create_table_if_not_exists(
    self,
    identifier: Identifier | str,
    source: Schema | DataFrame,
    **properties: Any,
) -> Table:
    """Creates a table in the current catalog if it does not already exist.

    If no namespace is specified, the current namespace is used.

    Returns:
        Table: the newly created instance, or the existing table instance.
    """
    if isinstance(identifier, str):
        identifier = Identifier.from_str(identifier)

    if resolved := self._resolve_catalog(identifier):
        cat, identifier = resolved
        return cat.create_table_if_not_exists(identifier, source, properties)

    if not (catalog := self.current_catalog()):
        # TODO relax this constraint by joining with the catalog name
        raise ValueError("Cannot create a table without a current catalog")

    if len(identifier) == 1:
        if ns := self.current_namespace():
            identifier = ns + identifier

    return catalog.create_table_if_not_exists(identifier, source, properties)

create_temp_table #

create_temp_table(identifier: str, source: Schema | DataFrame) -> Table

Creates a temp table scoped to this session's lifetime.

Parameters:

Name Type Description Default
identifier str

table identifier (name)

required
source Schema | DataFrame

table source, either a Schema or a DataFrame

required

Returns:

Name Type Description
Table Table

new table instance

Examples:

>>> import daft
>>> from daft.session import Session
>>> sess = Session()
>>> sess.create_temp_table("T", daft.from_pydict({"x": [1, 2, 3]}))
>>> sess.create_temp_table("S", daft.from_pydict({"y": [4, 5, 6]}))
>>> sess.list_tables()
[Identifier(''T''), Identifier(''S'')]

Source code in daft/session.py
def create_temp_table(self, identifier: str, source: Schema | DataFrame) -> Table:
    """Creates a temp table scoped to this session's lifetime.

    Args:
        identifier (str): table identifier (name)
        source (TableSource|object): table source like a schema or dataframe

    Returns:
        Table: new table instance

    Examples:
        >>> import daft
        >>> from daft.session import Session
        >>> sess = Session()
        >>> sess.create_temp_table("T", daft.from_pydict({"x": [1, 2, 3]}))
        >>> sess.create_temp_table("S", daft.from_pydict({"y": [4, 5, 6]}))
        >>> sess.list_tables()
        [Identifier(''T''), Identifier(''S'')]

    Args:
        identifier (str): table identifier (name)
        source (Schema | DataFrame): table source is either a Schema or Dataframe

    Returns:
        Table: new table instance
    """
    py_source = self._to_py_table_source(source)
    return self._create_temp_table_with_source(identifier, py_source, replace=True)

create_temp_view #

create_temp_view(identifier: str, view: DataFrame) -> Table

Creates or replaces a non-materialized temporary view from a DataFrame.

Source code in daft/session.py
def create_temp_view(self, identifier: str, view: DataFrame) -> Table:
    """Creates or replaces a non-materialized temporary view from a DataFrame."""
    py_source = self._to_py_table_source(view)
    return self._create_temp_table_with_source(identifier, py_source, replace=True)
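
In the source shown on this page, `create_temp_table` and `create_temp_view` pass `replace=True`, while `attach_view` passes `replace=False`. The excerpt does not show what `replace=False` does when the name is already taken; the toy registry below assumes it raises. `ToyTempTables` and `register` are hypothetical names, and this is a sketch rather than Daft's behaviour.

```python
from __future__ import annotations


class ToyTempTables:
    """Hypothetical per-session registry of temporary tables and views."""

    def __init__(self) -> None:
        self._tables: dict[str, object] = {}

    def register(self, name: str, source: object, *, replace: bool) -> None:
        # Assumption: replace=False rejects a name that is already taken.
        if name in self._tables and not replace:
            raise ValueError(f"Temporary table {name!r} already exists")
        self._tables[name] = source

    def get(self, name: str) -> object:
        return self._tables[name]


registry = ToyTempTables()
registry.register("v", "first", replace=True)
registry.register("v", "second", replace=True)  # silently overwrites

try:
    registry.register("v", "third", replace=False)
    conflict = False
except ValueError:
    conflict = True

print(registry.get("v"), conflict)
```

Under that assumption, `create_temp_view` is the call to use when redefining a view under an existing name.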

current_catalog #

current_catalog() -> Catalog | None

Get the session's current catalog or None.

Returns:

Name Type Description
Catalog Catalog | None

current catalog or None if one is not set

Source code in daft/session.py
def current_catalog(self) -> Catalog | None:
    """Get the session's current catalog or None.

    Returns:
        Catalog: current catalog or None if one is not set
    """
    return self._session.current_catalog()

current_model #

current_model() -> str | None

Get the session's current model or None.

Returns:

Name Type Description
str str | None

the session's default model identifier

Source code in daft/session.py
def current_model(self) -> str | None:
    """Get the session's current model or None.

    Returns:
        str: the session's default model identifier
    """
    return self._session.current_model()

current_namespace #

current_namespace() -> Identifier | None

Get the session's current namespace or None.

Returns:

Name Type Description
Identifier Identifier | None

current namespace or None if one is not set

Source code in daft/session.py
def current_namespace(self) -> Identifier | None:
    """Get the session's current namespace or None.

    Returns:
        Identifier: current namespace or none if one is not set
    """
    ident = self._session.current_namespace()
    return Identifier._from_pyidentifier(ident) if ident else None

current_provider #

current_provider() -> Provider | None

Get the session's current provider or None.

Returns:

Name Type Description
Provider Provider | None

the session's default provider, or None if one is not set

Source code in daft/session.py
def current_provider(self) -> Provider | None:
    """Get the session's current provider or None.

    Returns:
        str: the session's default provider identifier
    """
    return self._session.current_provider()

detach_catalog #

detach_catalog(alias: str) -> None

Detaches the catalog from this session or raises if the catalog does not exist.

Parameters:

Name Type Description Default
alias str

catalog alias to detach

required
Source code in daft/session.py
def detach_catalog(self, alias: str) -> None:
    """Detaches the catalog from this session or raises if the catalog does not exist.

    Args:
        alias (str): catalog alias to detach
    """
    return self._session.detach_catalog(alias)

detach_function #

detach_function(alias: str) -> None

Detaches a UDF from the current session by alias.

Source code in daft/session.py
def detach_function(self, alias: str) -> None:
    """Detaches a Python function as a UDF in the current session."""
    self._session.detach_function(alias)

detach_provider #

detach_provider(alias: str) -> None

Detaches the provider from this session or raises if the provider does not exist.

Parameters:

Name Type Description Default
alias str

provider alias to detach

required
Source code in daft/session.py
def detach_provider(self, alias: str) -> None:
    """Detaches the provider from this session or raises if the provider does not exist.

    Args:
        alias (str): provider alias to detach
    """
    return self._session.detach_provider(alias)

detach_table #

detach_table(alias: str) -> None

Detaches the table from this session or raises if the table does not exist.

Parameters:

Name Type Description Default
alias str

table alias to detach

required
Source code in daft/session.py
def detach_table(self, alias: str) -> None:
    """Detaches the table from this session or raises if the table does not exist.

    Args:
        alias (str): catalog alias to detach
    """
    return self._session.detach_table(alias)

drop_namespace #

drop_namespace(identifier: Identifier | str) -> None

Drop the given namespace in the current catalog.

Parameters:

Name Type Description Default
identifier Identifier | str

namespace identifier

required
Source code in daft/session.py
def drop_namespace(self, identifier: Identifier | str) -> None:
    """Drop the given namespace in the current catalog.

    Args:
        identifier (Identifier|str): table identifier
    """
    if not (catalog := self.current_catalog()):
        raise ValueError("Cannot drop a namespace without a current catalog")
    return catalog.drop_namespace(identifier)

drop_table #

drop_table(identifier: Identifier | str) -> None

Drop the given table in the current catalog.

Parameters:

Name Type Description Default
identifier Identifier | str

table identifier

required
Source code in daft/session.py
def drop_table(self, identifier: Identifier | str) -> None:
    """Drop the given table in the current catalog.

    Args:
        identifier (Identifier|str): table identifier
    """
    if isinstance(identifier, str):
        identifier = Identifier.from_str(identifier)

    if resolved := self._resolve_catalog(identifier):
        cat, identifier = resolved
        return cat.drop_table(identifier)

    if not (catalog := self.current_catalog()):
        raise ValueError("Cannot drop a table without a current catalog")
    # TODO join the identifier with the current namespace
    return catalog.drop_table(identifier)

get_aggregate_function #

get_aggregate_function(name: str, *args: Expression) -> Expression

Returns an aggregate function expression from the current session.

Parameters:

Name Type Description Default
name str

aggregate function name as registered by an extension

required
*args Expression

Expression arguments to pass to the aggregate function

()

Returns:

Name Type Description
Expression Expression

aggregate result expression

Source code in daft/session.py
def get_aggregate_function(self, name: str, *args: Expression) -> Expression:
    """Returns an aggregate function expression from the current session.

    Args:
        name (str): aggregate function name as registered by an extension
        *args: Expression arguments to pass to the aggregate function

    Returns:
        Expression: aggregate result expression
    """
    return Expression._from_pyexpr(self._session.get_aggregate_function(name, *[a._expr for a in args]))

get_catalog #

get_catalog(identifier: str) -> Catalog

Returns the catalog or raises an exception if it does not exist.

Parameters:

Name Type Description Default
identifier str

catalog identifier (name)

required

Returns:

Name Type Description
Catalog Catalog

The catalog object.

Raises:

Type Description
ValueError

If the catalog does not exist.

Source code in daft/session.py
def get_catalog(self, identifier: str) -> Catalog:
    """Returns the catalog or raises an exception if it does not exist.

    Args:
        identifier (str): catalog identifier (name)

    Returns:
        Catalog: The catalog object.

    Raises:
        ValueError: If the catalog does not exist.
    """
    return self._session.get_catalog(identifier)

get_function #

get_function(name: str, *args: Expression) -> Expression

Returns the function from the current session or raises an exception if it does not exist.

Parameters:

Name Type Description Default
name str

function name as registered by an extension

required
*args Expression

Expression arguments to pass to the function

()

Returns:

Name Type Description
Expression Expression

result expression

Source code in daft/session.py
def get_function(self, name: str, *args: Expression) -> Expression:
    """Returns the function from the current session or raises an exception if it does not exist.

    Args:
        name (str): function name as registered by an extension
        *args: Expression arguments to pass to the function

    Returns:
        Expression: result expression
    """
    return Expression._from_pyexpr(self._session.get_function(name, *[a._expr for a in args]))

get_provider #

get_provider(identifier: str) -> Provider

Returns the provider or raises an exception if it does not exist.

Parameters:

Name Type Description Default
identifier str

provider identifier e.g. "openai", "anthropic", "transformers"

required

Returns:

Name Type Description
Provider Provider

The provider object.

Raises:

Type Description
ValueError

If the provider does not exist.

Source code in daft/session.py
def get_provider(self, identifier: str) -> Provider:
    """Returns the provider or raises an exception if it does not exist.

    Args:
        identifier (str): provider identifier e.g. "openai", "anthropic", "transformers"

    Returns:
        Provider: The provider object.

    Raises:
        ValueError: If the provider does not exist.
    """
    return self._session.get_provider(identifier)

get_table #

get_table(identifier: Identifier | str) -> Table

Returns the table or raises an exception if it does not exist.

Parameters:

Name Type Description Default
identifier Identifier | str

table identifier or identifier string

required

Returns:

Name Type Description
Table Table

The table object.

Raises:

Type Description
ValueError

If the table does not exist.

Source code in daft/session.py
def get_table(self, identifier: Identifier | str) -> Table:
    """Returns the table or raises an exception if it does not exist.

    Args:
        identifier (Identifier|str): table identifier or identifier string

    Returns:
        Table: The table object.

    Raises:
        ValueError: If the table does not exist.
    """
    if isinstance(identifier, str):
        identifier = Identifier.from_str(identifier)
    return self._session.get_table(identifier._ident)

has_catalog #

has_catalog(identifier: str) -> bool

Returns true if a catalog with the given identifier exists.

Source code in daft/session.py
def has_catalog(self, identifier: str) -> bool:
    """Returns true if a catalog with the given identifier exists."""
    return self._session.has_catalog(identifier)

has_namespace #

has_namespace(identifier: Identifier | str) -> bool

Returns true if a namespace with the given identifier exists.

Source code in daft/session.py
def has_namespace(self, identifier: Identifier | str) -> bool:
    """Returns true if a namespace with the given identifier exists."""
    if not (catalog := self.current_catalog()):
        raise ValueError("Cannot call has_namespace without a current catalog")
    return catalog.has_namespace(identifier)

has_provider #

has_provider(identifier: str) -> bool

Returns true if a provider with the given identifier exists.

Source code in daft/session.py
def has_provider(self, identifier: str) -> bool:
    """Returns true if a provider with the given identifier exists."""
    return self._session.has_provider(identifier)

has_table #

has_table(identifier: Identifier | str) -> bool

Returns true if a table with the given identifier exists.

Source code in daft/session.py
def has_table(self, identifier: Identifier | str) -> bool:
    """Returns true if a table with the given identifier exists."""
    if isinstance(identifier, str):
        identifier = Identifier.from_str(identifier)
    return self._session.has_table(identifier._ident)

list_catalogs #

list_catalogs(pattern: str | None = None) -> list[str]

Returns a list of available catalogs matching the pattern.

This API currently returns a list of catalog names for backwards compatibility. In 0.5.0 this API will return a list of Catalog objects.

Parameters:

Name Type Description Default
pattern str | None

catalog name pattern

None

Returns:

Type Description
list[str]

list of available catalog names

Source code in daft/session.py
def list_catalogs(self, pattern: str | None = None) -> list[str]:
    """Returns a list of available catalogs matching the pattern.

    This API currently returns a list of catalog names for backwards compatibility.
    In 0.5.0 this API will return a list of Catalog objects.

    Args:
        pattern (str): catalog name pattern

    Returns:
        list[str]: list of available catalog names
    """
    return self._session.list_catalogs(pattern)
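
Because the return type is documented to change from names to Catalog objects in 0.5.0, callers that only need names may want to normalise the result. The helper below is a hypothetical sketch: `catalog_names` is not part of Daft, and it assumes only that Catalog objects expose a `name` attribute, as used by `attach_catalog` above. `SimpleNamespace` stands in for a Catalog.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any, Iterable


def catalog_names(catalogs: Iterable[Any]) -> list[str]:
    """Accept either catalog names or objects with a `name` attribute."""
    return [c if isinstance(c, str) else c.name for c in catalogs]


# Current behaviour: a list of names.
print(catalog_names(["lake", "warehouse"]))

# Documented future behaviour, modelled with stand-in objects.
print(catalog_names([SimpleNamespace(name="lake"), SimpleNamespace(name="warehouse")]))
```

Both calls print the same list, so code written against the helper would not depend on which form `list_catalogs` returns.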

list_namespaces #

list_namespaces(pattern: str | None = None) -> list[Identifier]

Returns a list of matching namespaces in the current catalog.

Source code in daft/session.py
def list_namespaces(self, pattern: str | None = None) -> list[Identifier]:
    """Returns a list of matching namespaces in the current catalog."""
    if not (catalog := self.current_catalog()):
        raise ValueError("Cannot list namespaces without a current catalog")
    return catalog.list_namespaces(pattern)

list_tables #

list_tables(pattern: str | None = None) -> list[Identifier]

Returns a list of available tables.

Parameters:

Name Type Description Default
pattern str | None

Pattern to match table names. Pattern syntax is catalog-dependent:
- Native/Memory and Postgres catalogs: use SQL LIKE syntax (%, _, \). Supports qualified patterns like "ns1.table%".
- Other catalogs: pattern behavior varies (e.g., prefix matching for Iceberg/S3 Tables, AWS Glue expressions for Glue).

None

Returns:

Type Description
list[Identifier]

list of available tables

Examples:

>>> sess.list_tables()  # List all tables
>>> sess.list_tables("table%")  # Tables starting with "table" (native catalog)
>>> sess.list_tables("ns1.%")  # All tables in namespace "ns1" (native catalog)
Source code in daft/session.py
def list_tables(self, pattern: str | None = None) -> list[Identifier]:
    r"""Returns a list of available tables.

    Args:
        - pattern (str, optional): Pattern to match table names. Pattern syntax is catalog-dependent:
            - Native/Memory and Postgres catalogs: Use SQL LIKE syntax (`%`, `_`, `\`). Supports qualified patterns like `"ns1.table%"`.
            - Other catalogs: Pattern behavior varies (e.g., prefix matching for Iceberg/S3 Tables, AWS Glue expressions for Glue).

    Returns:
        list[Identifier]: list of available tables

    Examples:
        >>> sess.list_tables()  # List all tables
        >>> sess.list_tables("table%")  # Tables starting with "table" (native catalog)
        >>> sess.list_tables("ns1.%")  # All tables in namespace "ns1" (native catalog)
    """
    return [Identifier._from_pyidentifier(i) for i in self._session.list_tables(pattern)]
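
The SQL LIKE syntax mentioned for the native and Postgres catalogs can be illustrated without Daft. The sketch below is a minimal pure-Python model of LIKE matching (`%` for any run of characters, `_` for exactly one character, `\` as the escape). It is not Daft's implementation; `like_to_regex`, `filter_names`, and the table names are hypothetical and exist only for illustration.

```python
import re
from typing import Optional


def like_to_regex(pattern: str, escape: str = "\\") -> "re.Pattern[str]":
    """Translate a SQL LIKE pattern into a regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            # The escape character makes the next character literal.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")  # any run of characters, including none
        elif ch == "_":
            parts.append(".")  # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def filter_names(names: list[str], pattern: Optional[str] = None) -> list[str]:
    """Return the names matching the LIKE pattern, or all names if it is None."""
    if pattern is None:
        return list(names)
    regex = like_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]


# Hypothetical qualified table names, for illustration only.
tables = ["ns1.table_a", "ns1.tableB", "ns1.other", "ns2.table_a"]

all_tables = filter_names(tables)                        # no pattern: everything
ns1_tables = filter_names(tables, "ns1.%")               # all tables in ns1
ns1_prefixed = filter_names(tables, "ns1.table%")        # ns1 tables starting with "table"
literal_underscore = filter_names(tables, r"ns_.table\_a")  # "_" wildcard vs. escaped "\_"
```

The last call shows why the escape character matters: the first `_` matches any single character, while `\_` matches only a literal underscore.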

load_extension #

load_extension(extension: str | ModuleType | Path) -> None

Load a native extension by module symbol or an explicit file path.

Warning

This API is experimental and may change in future releases.

Parameters:

Name Type Description Default
extension str | ModuleType | Path

A module with a native library or a direct file path to the shared library.

required
Source code in daft/session.py
def load_extension(self, extension: str | types.ModuleType | Path) -> None:
    """Load a native extension by module symbol or an explicit file path.

    .. warning::
        This API is experimental and may change in future releases.

    Args:
        extension: A module with a native library or a direct file path to the shared library.
    """
    warnings.warn(
        "Native extensions are experimental and may change in future releases.",
        stacklevel=2,
    )
    if isinstance(extension, str):
        path = extension
    elif isinstance(extension, Path):
        path = str(extension)
    elif isinstance(extension, types.ModuleType):
        path = _get_shared_lib(extension)
    else:
        raise TypeError(f"Expected string, Path, or module, got {type(extension)}")

    # Load the shared library globally so that symbols are visible to other libraries.
    ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)

    # Load the extension into the session on the Rust side.
    self._session.load_extension(path)
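
Because the listing calls `warnings.warn` without a category, every call emits a `UserWarning`. The sketch below shows how a caller might record or silence that warning with the standard `warnings` module. To stay runnable without a native library, it uses `load_extension_stub`, a hypothetical stand-in that only emits the same message and loads nothing; the path is also made up.

```python
import warnings


def load_extension_stub(path: str) -> None:
    """Hypothetical stand-in: emits the same warning as the listing, loads nothing."""
    warnings.warn(
        "Native extensions are experimental and may change in future releases.",
        stacklevel=2,
    )


# Record the warning instead of letting it print to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    load_extension_stub("/hypothetical/path/libexample.so")

categories = [w.category for w in caught]
messages = [str(w.message) for w in caught]

# Silence only this message; other warnings are still recorded.
with warnings.catch_warnings(record=True) as silenced:
    warnings.simplefilter("always")
    warnings.filterwarnings("ignore", message="Native extensions are experimental")
    load_extension_stub("/hypothetical/path/libexample.so")
```

The `message` argument of `warnings.filterwarnings` is a regular expression matched against the start of the warning text, so a prefix is enough.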

read_table #

read_table(identifier: Identifier | str, **options: Any) -> DataFrame

Returns the table as a DataFrame or raises an exception if it does not exist.

Parameters:

Name Type Description Default
identifier Identifier | str

table identifier

required

Returns:

Name Type Description
DataFrame DataFrame

Raises:

Type Description
ValueError

If the table does not exist.

Source code in daft/session.py
def read_table(self, identifier: Identifier | str, **options: Any) -> DataFrame:
    """Returns the table as a DataFrame or raises an exception if it does not exist.

    Args:
        identifier (Identifier|str): table identifier

    Returns:
        DataFrame:

    Raises:
        ValueError: If the table does not exist.
    """
    return self.get_table(identifier).read(**options)

set_catalog #

set_catalog(identifier: str | None) -> None

Sets the given catalog as current_catalog, or raises an error if it does not exist.

Parameters:

Name Type Description Default
identifier str | None

identifier of the catalog to make current, or None to clear the current catalog

required

Raises:

Type Description
ValueError

If the catalog does not exist.

Source code in daft/session.py
def set_catalog(self, identifier: str | None) -> None:
    """Sets the given catalog as current_catalog, or raises an error if it does not exist.

    Args:
        identifier (str): sets the current catalog

    Raises:
        ValueError: If the catalog does not exist.
    """
    self._session.set_catalog(identifier)

set_model #

set_model(identifier: str | None) -> None

Set the default model type.

Parameters:

Name Type Description Default
identifier str | None

model identifier string.

required
Source code in daft/session.py
def set_model(self, identifier: str | None) -> None:
    """Set the default model type.

    Args:
        identifier (str | None): model identifier string.
    """
    self._session.set_model(identifier)

set_namespace #

set_namespace(identifier: Identifier | str | None) -> None

Set the given namespace as current_namespace for table resolution.

Parameters:

Name Type Description Default
identifier Identifier | str | None

namespace identifier, or None to clear the current namespace

required
Source code in daft/session.py
def set_namespace(self, identifier: Identifier | str | None) -> None:
    """Set the given namespace as current_namespace for table resolution.

    Args:
        identifier (Identifier | str): namespace identifier
    """
    if isinstance(identifier, str):
        identifier = Identifier.from_str(identifier)
    self._session.set_namespace(identifier._ident if identifier else None)

set_provider #

set_provider(identifier: str | None, **options: Any) -> None

Set the default model provider with associated options.

Parameters:

Name Type Description Default
identifier str | None

provider identifier string or None.

required
**options Any

provider specific options such as an API key or retry limit.

{}
Note

If the named provider is not yet attached and you give a known provider identifier like "openai", then the provider is created with the given options and attached automatically. For example, daft.set_provider("openai") works.

Source code in daft/session.py
def set_provider(self, identifier: str | None, **options: Any) -> None:
    """Set the default model provider with associated options.

    Args:
        identifier (str | None): provider identifier string or None.
        **options (Any): provider specific options such as an API key or retry limit.

    Note:
        If there are no providers, and you give a known provider identifier
        like "openai", then we will create and attach this known provider.
        For example, `daft.set_provider("openai")` works.
    """
    # consider using @overload on known providers for better type hints
    if identifier is not None and not self._session.has_provider(identifier) and identifier in PROVIDERS:
        # upsert semantic for known providers e.g. daft.set_provider("openai")
        provider = load_provider(identifier, name=None, **options)
        self.attach_provider(provider)
    self._session.set_provider(identifier)
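
The upsert behaviour in the listing can be modelled with a small toy class. This is a sketch of the control flow only, not Daft's implementation: `ToySession`, `KNOWN_PROVIDERS`, and the option names are hypothetical, and the toy does not model whatever the underlying session does when given an unknown, unattached identifier.

```python
from typing import Any, Optional

# Hypothetical registry of known provider names; Daft's PROVIDERS may differ.
KNOWN_PROVIDERS = {"openai"}


class ToySession:
    """A toy stand-in that models only the provider bookkeeping."""

    def __init__(self) -> None:
        self.providers: dict[str, dict[str, Any]] = {}
        self.current_provider: Optional[str] = None

    def set_provider(self, identifier: Optional[str], **options: Any) -> None:
        if (
            identifier is not None
            and identifier not in self.providers
            and identifier in KNOWN_PROVIDERS
        ):
            # Known but not yet attached: create it from the options and attach it.
            self.providers[identifier] = dict(options)
        # The real session may reject unknown names here; the toy does not model that.
        self.current_provider = identifier


sess = ToySession()
sess.set_provider("openai", api_key="hypothetical-key")  # created and attached
first_options = dict(sess.providers["openai"])

sess.set_provider("openai", api_key="ignored")  # already attached: options unused
second_options = dict(sess.providers["openai"])

sess.set_provider(None)  # clears the default provider
```

One consequence visible in the listing: `**options` are only passed along when the provider is created, so options given for an already-attached provider appear to have no effect.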

sql #

sql(sql: str) -> DataFrame | None

Executes the SQL statement using this session.

Parameters:

Name Type Description Default
sql str

input SQL statement

required

Returns:

Name Type Description
DataFrame DataFrame | None

DataFrame instance if this was a data statement (DQL, DDL, DML); otherwise None.

Source code in daft/session.py
def sql(self, sql: str) -> DataFrame | None:
    """Executes the SQL statement using this session.

    Args:
        sql (str): input SQL statement

    Returns:
        DataFrame: dataframe instance if this was a data statement (DQL, DDL, DML).
    """
    py_sess = self._session
    py_config = get_context().daft_planning_config
    py_object = sql_exec(sql, py_sess, {}, py_config)
    if py_object is None:
        return None
    elif isinstance(py_object, PyBuilder):
        return DataFrame(LogicalPlanBuilder(py_object))
    else:
        raise ValueError(f"Unsupported return type from sql exec: {type(py_object)}")

use #

use(identifier: Identifier | str | None = None) -> None

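Sets the current catalog and namespace. The first part of the identifier names the catalog and any remaining parts name the namespace. A single-part identifier sets only the catalog and leaves the current namespace unchanged; None clears both.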

Source code in daft/session.py
def use(self, identifier: Identifier | str | None = None) -> None:
    """Use sets the current catalog and namespace."""
    if identifier is None:
        self.set_catalog(None)
        self.set_namespace(None)
        return
    if isinstance(identifier, str):
        identifier = Identifier.from_str(identifier)
    if len(identifier) == 1:
        self.set_catalog(str(identifier[0]))
    else:
        self.set_catalog(str(identifier[0]))
        self.set_namespace(identifier.drop(1))
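
The branching in the listing can be summarised as a small pure-Python function that reports which settings a call would change. This is a sketch under a simplifying assumption: the identifier is split on "." only, so quoted or escaped parts that `Identifier.from_str` may handle are not modelled. `plan_use` and `UNCHANGED` are hypothetical names.

```python
from typing import Optional

UNCHANGED = "<unchanged>"  # marker: this setting keeps its previous value


def plan_use(identifier: Optional[str]) -> tuple[Optional[str], Optional[str]]:
    """Return the (catalog, namespace) updates that the listing would apply."""
    if identifier is None:
        return (None, None)  # both catalog and namespace are cleared
    parts = identifier.split(".")  # simplification: no quoted identifiers
    if len(parts) == 1:
        return (parts[0], UNCHANGED)  # only the catalog is set
    # First part is the catalog; the remaining parts form the namespace.
    return (parts[0], ".".join(parts[1:]))


clear_both = plan_use(None)
catalog_only = plan_use("my_catalog")
catalog_and_namespace = plan_use("my_catalog.ns1.ns2")
```

Note the single-part case: a previously set namespace stays in effect after switching catalogs that way, which may or may not be what a caller expects.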

write_table #

write_table(identifier: Identifier | str, df: DataFrame, mode: Literal['append', 'overwrite'] = 'append', **options: dict[str, Any]) -> None

Writes the DataFrame to the table specified by the identifier.

Parameters:

Name Type Description Default
identifier Identifier | str

table identifier

required
df DataFrame

dataframe to write

required
mode 'append' | 'overwrite'

write mode, defaults to "append"

'append'
**options dict[str, Any]

additional, format-specific write options

{}
Source code in daft/session.py
def write_table(
    self,
    identifier: Identifier | str,
    df: DataFrame,
    mode: Literal["append", "overwrite"] = "append",
    **options: dict[str, Any],
) -> None:
    """Writes the DataFrame to the table specified by the identifier.

    Args:
        identifier (Identifier|str): table identifier
        df (DataFrame): dataframe to write
        mode ("append"|"overwrite"): write mode, defaults to "append"
        **options (dict[str,Any]): additional, format-specific write options
    """
    if isinstance(identifier, str):
        identifier = Identifier.from_str(identifier)

    self._session.get_table(identifier._ident).write(df, mode=mode, **options)
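
The difference between the two write modes can be shown with a toy in-memory table. This is a sketch of the usual meaning of "append" and "overwrite", not Daft's table implementation; the actual behaviour is delegated to the resolved table's `write` method and may vary by catalog. `ToyTable` is a hypothetical class, and rejecting other modes with a `ValueError` is an assumption of the toy.

```python
from typing import Any


class ToyTable:
    """A toy in-memory table that stores rows as dictionaries."""

    def __init__(self) -> None:
        self.rows: list[dict[str, Any]] = []

    def write(self, rows: list[dict[str, Any]], mode: str = "append") -> None:
        if mode == "append":
            self.rows.extend(rows)  # keep existing rows, add the new ones
        elif mode == "overwrite":
            self.rows = list(rows)  # discard existing rows first
        else:
            # Assumption: the toy rejects anything outside the two documented modes.
            raise ValueError(f"Unsupported write mode: {mode!r}")


table = ToyTable()
table.write([{"x": 1}, {"x": 2}])  # default mode is "append"
table.write([{"x": 3}], mode="append")
rows_after_append = list(table.rows)

table.write([{"x": 9}], mode="overwrite")
rows_after_overwrite = list(table.rows)
```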