- Is this your first time deploying Airbyte?: Yes
- OS Version / Instance: Amazon linux 2
- Memory / Disk: 16GB / 100GB
- Deployment: Docker
- Airbyte Version: 0.44.5
- Source name/version: MySQL 8
- Destination name/version: Redshift
- Step: The issue is happening during sync
We are trying to sync 2 TB of data (full refresh + CDC) from MySQL to Redshift. We are running Airbyte on an EC2 m5.xlarge instance (4 vCPU, 16 GB).
Performance is relatively low: we are ingesting around 5 GB every 10 minutes (roughly 30 GB per hour), which by a rough calculation means around 68 hours for the full 2 TB.
Our goal is to cut the full refresh time by at least half (or even more).
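For reference, here is how we arrived at the rough numbers. This is only a back-of-the-envelope sketch: it assumes 1 TB = 1024 GB and constant throughput for the whole sync, and the function names are just ours for illustration.

```python
# Back-of-the-envelope sync time estimate.
# Assumptions: 1 TB = 1024 GB, throughput stays constant for the whole sync.

TOTAL_GB = 2 * 1024      # 2 TB to sync
OBSERVED_GB = 5          # data ingested per observation window
OBSERVED_MINUTES = 10    # length of the observation window


def estimate_hours(total_gb: float, gb_per_window: float, window_minutes: float) -> float:
    """Hours needed to move total_gb at the observed rate."""
    gb_per_hour = gb_per_window * 60 / window_minutes
    return total_gb / gb_per_hour


def required_gb_per_hour(total_gb: float, target_hours: float) -> float:
    """Throughput needed to finish total_gb within target_hours."""
    return total_gb / target_hours


current_hours = estimate_hours(TOTAL_GB, OBSERVED_GB, OBSERVED_MINUTES)
target_hours = current_hours / 2  # our goal: at least halve the sync time
needed_rate = required_gb_per_hour(TOTAL_GB, target_hours)

print(f"Current estimate: {current_hours:.1f} hours")
print(f"Target: {target_hours:.1f} hours, needs about {needed_rate:.0f} GB/hour")
```

In other words, halving the sync time means going from roughly 30 GB per hour to roughly 60 GB per hour.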
We can see that the MySQL connector is fetching 5000 rows per fetch (attached screenshot 1).
We are trying to figure out whether this is configurable or is derived from the memory allocated to the worker container.
We have tried increasing both the request and the limit for CPU and memory (attached screenshot 2), but the fetch size stays at 5000 rows.
We encountered the following:
Still, we couldn’t figure out whether this is a limitation of the source connector or of something else, such as configuration.
Any advice will be much appreciated.