Writing to Google Cloud Bigtable#
Experimental
This connector is experimental and the API may change in future releases.
Google Cloud Bigtable is a fully managed, scalable NoSQL database service. Daft can write DataFrames to Bigtable tables using df.write_bigtable().
Installing Dependencies#
Bigtable support requires the google-cloud-bigtable package, which is installable from PyPI with the command pip install google-cloud-bigtable.
Basic Usage#
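A minimal sketch of a basic write, assuming a DataFrame with a user_id column and two data columns. The helper name write_users, the project, instance, and table IDs, and the "profile" family are placeholders; only the keyword arguments come from the parameter table on this page.

```python
def write_users(df, project_id="my-project", instance_id="my-instance", table_id="users"):
    """Write a DataFrame with user_id, name, and age columns to Bigtable.

    The IDs and the "profile" family are placeholders, not values from the docs.
    """
    return df.write_bigtable(
        project_id=project_id,
        instance_id=instance_id,
        table_id=table_id,
        row_key_column="user_id",  # values of this column become the row keys
        # Assumption: only the non-key columns need a family mapping.
        column_family_mappings={"name": "profile", "age": "profile"},
    )
```

With Daft installed, a small input could be built with daft.from_pydict({"user_id": ["u1", "u2"], "name": ["Ada", "Lin"], "age": [36, 41]}) and passed to write_users.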
Key Concepts#
Row Keys#
Every Bigtable row requires a unique row key. Use row_key_column to specify which DataFrame column supplies the row key.
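Because the row key identifies a Bigtable row, two DataFrame rows that share a key would target the same row. The sketch below checks for that before writing. The helper names are hypothetical, and this page does not say how Daft itself treats duplicate keys, so the check is a precaution rather than a documented requirement.

```python
from collections import Counter


def duplicate_keys(keys):
    """Return row key values that appear more than once, in first-seen order."""
    counts = Counter(keys)
    return [key for key, n in counts.items() if n > 1]


def write_checked(df, row_key_column, **write_args):
    """Refuse to write when two DataFrame rows share a row key.

    df.to_pydict() materializes the whole DataFrame, so this check only
    suits small inputs.
    """
    dupes = duplicate_keys(df.to_pydict()[row_key_column])
    if dupes:
        raise ValueError(f"duplicate row keys: {dupes}")
    return df.write_bigtable(row_key_column=row_key_column, **write_args)
```

The remaining arguments (project_id, instance_id, table_id, column_family_mappings) are passed through write_args unchanged.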
Column Families#
Bigtable organizes columns into column families. Use column_family_mappings to specify which family each DataFrame column belongs to.
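It is often easier to think in terms of "which columns live in each family" than the reverse. The sketch below inverts that grouping into the dict[str, str] shape listed for column_family_mappings. The helper name and the family and column names are illustrative only.

```python
def build_family_mappings(families):
    """Invert {family: [columns]} into {column: family}.

    Raises ValueError if a column is assigned to more than one family.
    """
    mappings = {}
    for family, columns in families.items():
        for column in columns:
            if column in mappings:
                raise ValueError(
                    f"column {column!r} is assigned to both "
                    f"{mappings[column]!r} and {family!r}"
                )
            mappings[column] = family
    return mappings


mappings = build_family_mappings({
    "profile": ["name", "age"],
    "contact": ["email"],
})
# mappings == {"name": "profile", "age": "profile", "email": "contact"}
```

The resulting dictionary can be passed as column_family_mappings to df.write_bigtable().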
The column families must already exist in the Bigtable table.
Parameters#
| Parameter | Type | Required | Description |
|---|---|---|---|
| project_id | str | Yes | Google Cloud project ID |
| instance_id | str | Yes | Bigtable instance ID |
| table_id | str | Yes | Bigtable table ID |
| row_key_column | str | Yes | Column name to use as the row key |
| column_family_mappings | dict[str, str] | Yes | Mapping of column names to column families |
| client_kwargs | dict | No | Additional arguments for the Bigtable Client |
| write_kwargs | dict | No | Additional arguments for MutationsBatcher |
| serialize_incompatible_types | bool | No | Auto-convert incompatible types to JSON (default: True) |
Data Type Handling#
Bigtable cells only accept data that can be converted to bytes. By default, Daft automatically serializes incompatible types to JSON.
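As a rough illustration of what "serialized to JSON" means for a cell value, the sketch below encodes a few Python values to bytes using only the standard library. It is not Daft's implementation, and Daft's actual rules for which types count as incompatible may differ; the function name is hypothetical.

```python
import json


def to_cell_bytes(value):
    """Encode a Python value as bytes, falling back to JSON for other types.

    Illustration only: bytes pass through, strings are UTF-8 encoded, and
    everything else (lists, dicts, numbers) is JSON-encoded first.
    """
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode("utf-8")
    return json.dumps(value).encode("utf-8")


cell = to_cell_bytes({"theme": "dark", "tags": ["a", "b"]})
# A reader of the table would decode it again with json.loads(cell).
```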
To disable automatic serialization, set serialize_incompatible_types to False. Writing a column with an incompatible type then raises an error.
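With automatic serialization off, nested columns have to be encoded before the write. A minimal sketch, assuming the data starts as a plain column dict and that string columns are accepted as-is; both helper names are hypothetical.

```python
import json


def encode_nested(columns, nested):
    """Return a copy of a column dict with the named columns JSON-encoded as strings."""
    return {
        name: [json.dumps(v) for v in values] if name in nested else list(values)
        for name, values in columns.items()
    }


def write_strict(df, **write_args):
    """Write with automatic serialization turned off.

    Any column Daft considers incompatible will then raise instead of
    being converted to JSON.
    """
    return df.write_bigtable(serialize_incompatible_types=False, **write_args)
```

The encoded dict can be turned into a DataFrame with daft.from_pydict() and then passed to write_strict together with the usual required arguments.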
Advanced Configuration#
Client Options#
Pass additional options to the Bigtable client through client_kwargs.
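One common client option is explicit credentials. The sketch below assumes Daft forwards client_kwargs unchanged to the google-cloud-bigtable Client constructor, which accepts a credentials keyword; the helper name is hypothetical and the credentials object is supplied by the caller.

```python
def write_with_credentials(df, credentials, **write_args):
    """Forward explicit credentials to the Bigtable client via client_kwargs.

    `credentials` is a google-auth credentials object created by the caller.
    """
    return df.write_bigtable(
        client_kwargs={"credentials": credentials},
        **write_args,
    )
```

A service account credentials object could come from google.oauth2.service_account.Credentials.from_service_account_file(), for example.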
Write Options#
Configure the MutationsBatcher used for write operations through write_kwargs.
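A sketch of tuning batch sizes, assuming Daft forwards write_kwargs unchanged to MutationsBatcher. flush_count and max_row_bytes are MutationsBatcher parameters in google-cloud-bigtable; the helper name and the specific values are illustrative, not recommendations.

```python
def write_in_batches(df, flush_count=500, max_row_bytes=5 * 1024 * 1024, **write_args):
    """Write with custom MutationsBatcher limits.

    flush_count:   number of buffered rows that triggers a flush
    max_row_bytes: buffered size in bytes that triggers a flush
    """
    return df.write_bigtable(
        write_kwargs={"flush_count": flush_count, "max_row_bytes": max_row_bytes},
        **write_args,
    )
```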
Use Cases#
IoT Data Storage#
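A sketch of one way to store sensor readings, assuming each reading is a (device_id, timestamp, temperature) tuple with a timezone-aware timestamp. Because Bigtable sorts rows lexicographically by key, a key of the form device#timestamp keeps each device's readings together in time order. The helper names, the "meta" and "metrics" families, and the key layout are all assumptions made for this example.

```python
from datetime import datetime, timezone


def reading_key(device_id, ts):
    """Build a row key such as 'sensor-1#20240102030405' (UTC, second precision)."""
    return f"{device_id}#{ts.astimezone(timezone.utc):%Y%m%d%H%M%S}"


def readings_to_columns(readings):
    """Turn (device_id, timestamp, temperature) tuples into a column dict."""
    return {
        "row_key": [reading_key(d, ts) for d, ts, _ in readings],
        "device_id": [d for d, _, _ in readings],
        "temperature": [t for _, _, t in readings],
    }


def write_readings(df, **target):
    """Write a readings DataFrame; `target` carries project, instance, and table IDs."""
    return df.write_bigtable(
        row_key_column="row_key",
        column_family_mappings={"device_id": "meta", "temperature": "metrics"},
        **target,
    )
```

The column dict from readings_to_columns can be passed to daft.from_pydict() to build the DataFrame. Second-level precision means two readings from one device within the same second would share a key, so a finer timestamp format may be needed for high-frequency sensors.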
User Profile Storage#
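A sketch of storing user profiles keyed by user ID, with a nested preferences field left to the default JSON serialization. The field names, the "profile", "contact", and "settings" families, and both helper names are assumptions for illustration.

```python
PROFILE_FIELDS = ["user_id", "name", "email", "preferences"]


def profiles_to_columns(profiles):
    """Turn a list of profile dicts into a column dict (one list per field).

    Raises KeyError if a profile is missing one of PROFILE_FIELDS.
    """
    return {field: [p[field] for p in profiles] for field in PROFILE_FIELDS}


def write_profiles(df, **target):
    """Write a profiles DataFrame; `target` carries project, instance, and table IDs.

    serialize_incompatible_types is left at its default (True), so the
    nested preferences column is expected to be stored as JSON.
    """
    return df.write_bigtable(
        row_key_column="user_id",
        column_family_mappings={
            "name": "profile",
            "email": "contact",
            "preferences": "settings",
        },
        **target,
    )
```

As with the other examples, the column dict can be turned into a DataFrame with daft.from_pydict() before calling write_profiles.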
Notes#
- The Bigtable table and column families must exist before writing
- Design row keys around your access patterns, following Bigtable's row key design best practices