Summary
A user reports a potential issue with the certified MySQL connector: incremental extracts that use a cursor sort the data at the database level, which can cause excessive disk usage on large tables.
Question
Hello,
I am working with the certified MySQL connector and noticed a behavior that could be problematic. Incremental extracts that use a cursor sort the data at the database. When tables are in the TB range and an initial (first) pull is run, this can lead to a large amount of disk spill and to growth of the DB's data disk. I understand the benefit of letting the DB do the sort rather than the client, but this also means that Airbyte cannot be used with large databases. I've seen several of my databases grow from 8TB to ~30TB because of the way Airbyte operates.
Is this a known issue?
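
To make the concern concrete, here is a minimal sketch of the kind of query a cursor-based incremental extract might issue. The query shape, the `events` table, and the `updated_at` cursor column are assumptions for illustration and are not taken from the connector's source. The sketch uses SQLite from the Python standard library as a stand-in for MySQL so that it runs anywhere. It only shows that an `ORDER BY` on an unindexed cursor column makes the engine build a temporary sort structure, while an index on that column lets it read rows in order. In MySQL, the comparable signal would be `Using filesort` in the `Extra` column of `EXPLAIN`.

```python
import sqlite3


def build_incremental_query(table: str, cursor_column: str) -> str:
    # Hypothetical query shape for a cursor-based incremental read:
    # fetch rows past the last saved cursor value, ordered by the cursor.
    return (
        f"SELECT * FROM {table} "
        f"WHERE {cursor_column} > ? ORDER BY {cursor_column}"
    )


def plan_details(conn: sqlite3.Connection, sql: str, params: tuple) -> list:
    # The fourth column of each EXPLAIN QUERY PLAN row is the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "id INTEGER PRIMARY KEY, updated_at INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (updated_at, payload) VALUES (?, ?)",
    [(i % 100, "x") for i in range(1000)],
)

query = build_incremental_query("events", "updated_at")

# No index on the cursor column: the engine has to sort the matching rows.
plan_without_index = plan_details(conn, query, (0,))

# With an index on the cursor column, rows can be read in cursor order.
conn.execute("CREATE INDEX idx_events_updated_at ON events (updated_at)")
plan_with_index = plan_details(conn, query, (0,))

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

If the real query has this shape, checking the plan on the source database for the cursor column in use would show whether a sort is being materialized during the first full pull.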
This topic has been created from a Slack thread to give it more visibility.