Summary
The user is seeing slower performance with Incremental sync than with Full Refresh when replicating data from Postgres to ClickHouse using Airbyte. They have ~16 million records in a partitioned table and noticed that Incremental syncs take longer to process far fewer records than the initial Full Refresh did.
Question
Hi everyone!
I’m trying to use Airbyte to replicate data from Postgres (AWS RDS) to ClickHouse (self-hosted on k8s so far). I’m using the CDC method. For the first run I used Full Refresh | Overwrite and then changed to Incremental | Append. I noticed that almost all Incremental | Append runs are terribly slow compared to the initial Full Refresh. Why is that? I have ~16 million records in the table (a partitioned table). The initial Full Refresh took 14m23s. I then scheduled an Incremental sync every two hours, and in most cases it took even longer to process far fewer records.
Thanks for any help!
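To compare the runs on equal footing, it may help to convert each one to a records-per-second rate. Below is a minimal Python sketch. The Full Refresh figures come from the post above; the Incremental figures are hypothetical placeholders, since the actual job timings are not included here, and the helper name `records_per_second` is made up for illustration.

```python
def records_per_second(records: int, minutes: int, seconds: int) -> float:
    """Return the average throughput of a sync run in records per second."""
    elapsed = minutes * 60 + seconds
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return records / elapsed


# Figures from the post: ~16 million records in 14m23s.
full_refresh_rate = records_per_second(16_000_000, 14, 23)

# HYPOTHETICAL placeholder values: substitute the record count and
# duration reported for one of the Incremental runs.
incremental_rate = records_per_second(50_000, 20, 0)

print(f"Full Refresh: {full_refresh_rate:,.0f} records/s")
print(f"Incremental (hypothetical): {incremental_rate:,.0f} records/s")
```

A large gap between the two rates would suggest that the Incremental runs are dominated by a fixed per-sync cost rather than by the number of records moved, though that is only one possible reading.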
This topic has been created from a Slack thread to give it more visibility.