Skipping Historical Data during Initial Loading in Airbyte OSS

Summary

How to skip historical data during the initial load of a very large table when using Airbyte OSS with MySQL as the source and BigQuery as the destination, without creating views in the production database.


Question

Hello Folks,

I am using Airbyte OSS, which I deployed on GCP Compute Engine. Then I created a pipeline with MySQL as the data source and BigQuery as the destination.

I have one table with a size of 2TB, which is too large to load all the data at once. How can I skip historical data during the initial loading?

I have already asked Ask-AI, and they suggested creating a view, but we are not allowed to create views in the production database.

Has anyone experienced a similar case to mine? And how did you resolve it?

Thank you.



This topic has been created from a Slack thread to give it more visibility.

["airbyte-oss", "mysql", "bigquery", "initial-loading", "skip-historical-data", "large-data", "production-database"]

On the Settings tab of a connection, there is a “Connection state” section that you can expand. For example, you can first sync only the streams for the small tables, look at the state Airbyte saves for them, and work out what you need to change in the connection state to “mark” the 2TB table as already synchronized up to a specific point/position.
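As a rough illustration of that idea, here is a minimal Python sketch that clones the state entry of an already-synced small table and points it at the large table with a chosen cursor value. Everything specific here is an assumption: the key names (`streamDescriptor`, `streamState`, `cursor_field`, `cursor`), the table names, and the cursor value are only illustrative, and the helper `seed_stream_state` is hypothetical. The real state shape depends on your Airbyte version and sync mode (CDC state in particular looks different), so copy the actual JSON from your own connection and adapt the keys accordingly.

```python
import copy
import json


def seed_stream_state(state, template_stream, target_stream, cursor_value):
    """Return a new state list in which target_stream has an entry cloned
    from template_stream, with its cursor set to cursor_value.

    Assumes (not guaranteed) that state is a list of per-stream entries
    shaped like {"streamDescriptor": {...}, "streamState": {...}}.
    """
    template = next(
        (e for e in state if e["streamDescriptor"]["name"] == template_stream),
        None,
    )
    if template is None:
        raise ValueError(f"no state entry found for stream {template_stream!r}")

    entry = copy.deepcopy(template)
    entry["streamDescriptor"]["name"] = target_stream
    stream_state = entry["streamState"]
    # Some state shapes repeat the stream name inside streamState.
    if "stream_name" in stream_state:
        stream_state["stream_name"] = target_stream
    stream_state["cursor"] = cursor_value

    # Replace any existing entry for the target stream.
    kept = [e for e in state if e["streamDescriptor"]["name"] != target_stream]
    return kept + [entry]


# Hypothetical state, as it might be copied from the UI after syncing a small table.
example_state = [
    {
        "streamDescriptor": {"name": "small_table", "namespace": "mydb"},
        "streamState": {
            "stream_name": "small_table",
            "cursor_field": ["updated_at"],
            "cursor": "2024-05-01 12:00:00",
        },
    }
]

new_state = seed_stream_state(
    example_state, "small_table", "big_table", "2024-01-01 00:00:00"
)
print(json.dumps(new_state, indent=2))
```

The printed JSON would then be pasted back into the connection state editor, so that the next incremental sync of the large table starts from the chosen cursor rather than from the beginning. This only makes sense if the large table has the same cursor column as the template stream; otherwise `cursor_field` needs to be edited as well.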

Great, thank you!
It works :)