Partitioning Destination Table in BigQuery for Time-Specific Queries

Summary

The user is seeking advice on optimizing the partitioning of the destination table in BigQuery for time-specific queries when syncing data from Klaviyo using Airbyte. They are exploring alternatives to the current partitioning by airbyte_extracted_at column to improve performance and cost efficiency.


Question

Question about partitioning the destination table in BigQuery:
I’m using Airbyte to sync data daily from Klaviyo to BigQuery. Currently, data is partitioned by the airbyte_extracted_at column, but for our use cases—primarily time-specific queries—this approach isn’t optimal in terms of performance and cost, as it requires reading the entire table.

I’ve considered manually creating a table with the desired partitioning on event timestamps but am concerned about potential impacts on sync efficiency and cost.

Does anyone have experience with this scenario or alternative approaches, such as post-sync data movement to an appropriately partitioned table?
Looking for best practices or insights from the community. Thanks!



This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. Click here if you want to access the original thread.

Join the conversation on Slack

["partitioning", "bigquery", "klaviyo", "airbyte", "sync", "time-specific-queries", "performance", "cost-efficiency"]