MySQL CDC to Kafka - capturing only recent changes

  • Is this your first time deploying Airbyte?: No
  • OS Version / Instance: Debian 10 on GCP VM
  • Memory / Disk: 4 GB
  • Deployment: Docker Compose
  • Airbyte Version: 0.39.20-alpha
  • Source name/version: MySQL
  • Destination name/version: Kafka
  • Step: Sync

I am using Airbyte (0.39.20-alpha) to run CDC replication from a MySQL table to Kafka. I am only interested in the latest updates to the table, but the MySQL CDC connector appears to start reading changes from when the database was created, about a year ago. Since I don't need to go that far back, is there a setting that makes it pull only the updates happening from now on?

Hi @Dom_Cimafranca,
We do not have an out-of-the-box solution to support this use case. We usually suggest that users filter the data upstream by creating a view in MySQL that contains only the data you wish to replicate.
It could also be possible to edit the state object stored in the Airbyte database for this connection with some SQL queries, but this is hacky and error-prone, especially with CDC, whose state is a bit more complex than that of the other connectors.
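
As a rough illustration of the view approach mentioned above, here is a minimal Python sketch that builds the `CREATE VIEW` statement. The table name `orders`, the view name `orders_recent`, the timestamp column `updated_at`, and the helper `build_recent_view_sql` are all hypothetical and not from this thread; it also assumes the table has a column recording when each row last changed. A view like this would most likely need to be synced with a standard (non-CDC) replication mode, since binlog-based CDC tracks changes to base tables; that is an assumption, not something confirmed here.

```python
from datetime import date


def build_recent_view_sql(table: str, view: str, ts_column: str, since: date) -> str:
    """Return a CREATE VIEW statement exposing only rows changed on or after `since`."""
    # Identifiers are backtick-quoted. This sketch assumes they come from
    # trusted configuration, so it only rejects embedded backticks.
    for name in (table, view, ts_column):
        if "`" in name:
            raise ValueError(f"unexpected backtick in identifier: {name!r}")
    return (
        f"CREATE OR REPLACE VIEW `{view}` AS "
        f"SELECT * FROM `{table}` "
        f"WHERE `{ts_column}` >= '{since.isoformat()}'"
    )


if __name__ == "__main__":
    # Hypothetical names; replace with your own table and timestamp column.
    print(build_recent_view_sql("orders", "orders_recent", "updated_at", date(2022, 6, 1)))
```

You would run the resulting statement once against MySQL and then point the Airbyte source at the view instead of the base table.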

FYI, an issue requesting this feature was opened here. Please subscribe to it to receive updates!
