Summary
The user reports that incremental extracts with the certified MySQL connector in Airbyte, when using a cursor, sort the data on the database side. On very large tables this can cause heavy disk spills during the initial sync and significant growth of the database's data disk. The user is concerned about the impact on large databases.
Question
Hello,
I am working with the certified MySQL connector and have noticed a behavior that could be problematic. Incremental extracts using a cursor sort the data in the database. When tables are in the terabyte range and an initial (first) pull is run, this can lead to a large amount of disk spill and growth of the DB's data disk. I understand the benefit of letting the DB do the sort rather than the client, but this also means that Airbyte cannot be used with large databases. I've seen several of my databases grow from 8 TB to ~30 TB because of the way Airbyte operates.
Is this a known issue?
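To make the pattern concrete, here is a minimal sketch of the kind of query being described, plus a helper for checking whether MySQL reports a sort pass for it. The exact SQL the connector emits is not shown in this thread, so the query shape, the function names, and the example table and column names are all assumptions for illustration only. What is standard MySQL behavior is that `EXPLAIN` shows `Using filesort` in its `Extra` column when the server has to sort rows itself instead of reading them in index order.

```python
from typing import Any, Iterable, Mapping, Optional, Tuple


def build_incremental_query(
    table: str, cursor_column: str, last_cursor: Optional[Any]
) -> Tuple[str, tuple]:
    """Build a cursor-based incremental SELECT.

    ASSUMPTION: this is an illustrative query shape, not the SQL that the
    Airbyte MySQL connector actually generates.
    """
    sql = f"SELECT * FROM `{table}`"
    params: tuple = ()
    if last_cursor is not None:
        # Later syncs only read rows past the saved cursor value.
        sql += f" WHERE `{cursor_column}` > %s"
        params = (last_cursor,)
    # The ORDER BY is what forces a server-side sort when no usable index exists.
    sql += f" ORDER BY `{cursor_column}` ASC"
    return sql, params


def uses_filesort(explain_rows: Iterable[Mapping[str, Any]]) -> bool:
    """Return True if any EXPLAIN row reports 'Using filesort' in Extra."""
    return any("Using filesort" in (row.get("Extra") or "") for row in explain_rows)


# First sync: no saved cursor, so the whole table is selected and ordered.
first_sql, first_params = build_incremental_query("orders", "updated_at", None)
print(first_sql)
```

To use this, run `EXPLAIN` on the generated statement with any MySQL client and pass the result rows (as dictionaries) to `uses_filesort`. If it returns `True` for a multi-terabyte table, the sort is likely to spill to disk; an index on the cursor column may let MySQL read rows in order instead, though whether the optimizer uses it depends on the table and query.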
This topic has been created from a Slack thread to give it more visibility.