- Is this your first time deploying Airbyte?: Yes
- OS Version / Instance: Ubuntu
- Memory / Disk: 16 GB / 300 GB
- Deployment: Kubernetes
- Airbyte Version: 0.40.17
- Source name/version: Postgres 1.0.22
- Destination name/version: Postgres 0.3.26
- Step: sync
- Description:
I have a table in a database with 40465468 entries. I replicate it using incremental sync with deduplication. The synchronization itself goes well, but then normalization runs over all records, which takes more than 10 hours, plus another 10 hours for the _scd table. How can I make normalization incremental as well? Or how can it be sped up?
Hi @zebesh, here are our docs on scaling Airbyte: https://airbytehq.github.io/operator-guides/scaling-airbyte/
Currently we are focusing on stability as opposed to speed. For a table that big you could try allocating more space/memory as described in the docs linked above, but apart from that there are currently no workarounds, unfortunately.
So non-incremental normalization is the expected behavior? And which memory settings exactly should be increased to speed up normalization at least somewhat?
Yes, that is correct. I would advise tweaking the env variables described in this section, setting higher values to provide more memory:
https://airbytehq.github.io/operator-guides/scaling-airbyte/#memory
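For reference, here is a minimal sketch of what those settings might look like, written as a small Python helper that renders env-file lines. It assumes the job-container memory variables are named `JOB_MAIN_CONTAINER_MEMORY_REQUEST` and `JOB_MAIN_CONTAINER_MEMORY_LIMIT`; please confirm the names against the linked section for your Airbyte version. The `4Gi` / `8Gi` values and the helper names `memory_env` and `render_env` are placeholders for illustration, not recommendations.

```python
def memory_env(request: str, limit: str) -> dict[str, str]:
    """Build the job-container memory settings as environment variables.

    The variable names are assumed from the Airbyte scaling docs;
    verify them for your version before applying.
    """
    return {
        "JOB_MAIN_CONTAINER_MEMORY_REQUEST": request,
        "JOB_MAIN_CONTAINER_MEMORY_LIMIT": limit,
    }


def render_env(env: dict[str, str]) -> str:
    """Render the settings as KEY=VALUE lines, one per variable."""
    return "\n".join(f"{key}={value}" for key, value in env.items())


if __name__ == "__main__":
    # Placeholder values; size them to what the cluster nodes can actually provide.
    print(render_env(memory_env("4Gi", "8Gi")))
```

The resulting lines would go wherever your deployment defines Airbyte's environment. With 16 GB on the instance, the limit has to leave room for the other Airbyte pods.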