Postgres to Postgres: non-incremental normalization

  • Is this your first time deploying Airbyte?: Yes
  • OS Version / Instance: Ubuntu
  • Memory / Disk: 16 GB / 300 GB
  • Deployment: Kubernetes
  • Airbyte Version: 0.40.17
  • Source name/version: Postgres 1.0.22
  • Destination name/version: Postgres 0.3.26
  • Step: sync
  • Description:
    I have a table in a database with 40465468 entries. I replicate it using incremental sync with deduplication. The synchronization itself goes well, but afterwards normalization runs over all records, which takes more than 10 hours, plus another 10 hours for the _scd table. How can I make normalization incremental as well? Or how can it be sped up?

Hi @zebesh, here are our docs on scaling Airbyte: https://airbytehq.github.io/operator-guides/scaling-airbyte/

Currently we are focusing on stability as opposed to speed. For a table that big you could try allocating more space/memory as described in the table above, but apart from that there are currently no workarounds, unfortunately.

So non-incremental normalization is the expected behavior? And which memory settings exactly should be increased to speed up normalization at least somewhat?

Yes, that is correct. I would advise tweaking the env variables as stated in this section and setting higher values to provide more memory:
https://airbytehq.github.io/operator-guides/scaling-airbyte/#memory
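
As a rough sketch, assuming the `JOB_MAIN_CONTAINER_MEMORY_REQUEST` and `JOB_MAIN_CONTAINER_MEMORY_LIMIT` variables described in that section apply to your Airbyte version, the change in the env file or ConfigMap your deployment reads could look something like this. The values are purely illustrative, not a recommendation; size them to what your nodes can actually spare.

```shell
# Illustrative values only (assumption): adjust to the memory available on your nodes.
# Memory requested for each job container when it is scheduled.
JOB_MAIN_CONTAINER_MEMORY_REQUEST=4Gi
# Upper bound on memory a job container may use before it is killed.
JOB_MAIN_CONTAINER_MEMORY_LIMIT=8Gi
```

After changing these, the Airbyte pods would likely need to be redeployed for the new values to take effect.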