Summary
The user is experiencing slow performance during MSSQL to S3 CDC incremental syncs with Airbyte, along with a discrepancy between the data size Airbyte reports and the size of the Parquet files actually written.
Question
Is there anything I can do to improve the performance of MSSQL->S3 CDC incremental syncs?
Given just a few tables totalling ~87K records, the initial sync took 7 minutes. A subsequent refresh took 2 minutes. I would expect this to take a couple of seconds at most. Additionally, Airbyte reports that it wrote 153MB of data, while the Parquet files total only about 10MB. Where is all that extra data going?
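To put those numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the figures quoted above; the variable names are illustrative, the record count and Parquet size are approximate, and 1MB is assumed to be 10^6 bytes.

```python
# Figures from the question above; 87K records and 10MB are approximate.
RECORDS = 87_000
INITIAL_SYNC_SECONDS = 7 * 60
REPORTED_MB = 153   # size Airbyte reports having written
PARQUET_MB = 10     # total size of the Parquet files on S3
BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


def per_record_bytes(total_mb: float, records: int) -> float:
    """Average bytes per record for a given total size."""
    return total_mb * BYTES_PER_MB / records


# Effective throughput of the initial sync.
initial_rate = RECORDS / INITIAL_SYNC_SECONDS

# How much larger the reported size is than the files on disk.
size_ratio = REPORTED_MB / PARQUET_MB

reported_per_record = per_record_bytes(REPORTED_MB, RECORDS)
parquet_per_record = per_record_bytes(PARQUET_MB, RECORDS)

print(f"initial sync: {initial_rate:.0f} records/s")
print(f"reported vs Parquet size: {size_ratio:.1f}x")
print(f"reported: {reported_per_record:.0f} bytes/record")
print(f"Parquet:  {parquet_per_record:.0f} bytes/record")
```

No rate is computed for the 2-minute refresh, because the question does not say how many records that run processed.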
This topic has been created from a Slack thread to give it more visibility.