CDC not working as expected in Postgres source

Summary

CDC on a Postgres source is not syncing incrementally: every sync reloads the whole data set even though nothing has changed in the source.


Question

Hi everyone,
I have CDC enabled on the source side, which is Postgres.
I followed this doc:
https://docs.airbyte.com/integrations/sources/postgres
Still, every time a sync runs it loads the whole data set,
even though there is no change on the source side.
I am attaching logs. Please help, I have been struggling with this a lot. What is the underlying cause?

screen shots



This topic has been created from a Slack thread to give it more visibility.

["cdc", "postgres-source", "data-sync", "logs", "underlying-cause"]
```
2024-10-28 10:04:37  platform > SOURCE analytics [airbyte/source-postgres:3.6.22] | Type: db-sources-cdc-resync | Value: 1
2024-10-28 10:04:37  source > WARN main i.a.i.s.p.c.PostgresCdcCtidInitializer(cdcCtidIteratorsCombined):188 Saved offset is before Replication slot's confirmed_flush_lsn, Airbyte will trigger sync from scratch
```

It detected an invalid CDC position, so it re-synced the data.
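
For anyone reading along, here is a minimal sketch of the comparison that warning describes. It assumes only the standard Postgres LSN text format (two hexadecimal numbers separated by a slash, the high and low 32 bits of a WAL position). The function names are made up for illustration, and this is not how Airbyte implements the check internally.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a Postgres LSN such as '0/16B3748' to an integer WAL position."""
    high, low = lsn.split("/")
    # The part before the slash is the high 32 bits, the part after is the low 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def saved_offset_is_stale(saved_offset_lsn: str, confirmed_flush_lsn: str) -> bool:
    """Return True when the saved offset is behind the slot's confirmed_flush_lsn.

    Hypothetical helper: this is the condition under which the log above
    says a sync from scratch is triggered.
    """
    return lsn_to_int(saved_offset_lsn) < lsn_to_int(confirmed_flush_lsn)
```

Calling `saved_offset_is_stale` with the LSN from the connection's saved state and the slot's `confirmed_flush_lsn` tells you whether a given run would hit this warning.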

How often do you run your synchronizations?

Every hour; I set it that way to test the incremental sync.

<@U05JENRCF7C> how can I correct this?
cdc-cursor-invalid

You need to check, and probably raise, the values of max_slot_wal_keep_size, max_wal_size, and wal_keep_size, or run your synchronizations more frequently.
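
One way to confirm what the server is actually using for these settings is to read them back from `pg_settings`. The sketch below assumes a DB-API cursor that uses `%s` placeholders (as psycopg does) and Postgres 13 or later, where `wal_keep_size` and `max_slot_wal_keep_size` exist. The helper name is hypothetical.

```python
WAL_SETTINGS = ("max_slot_wal_keep_size", "max_wal_size", "wal_keep_size")


def read_wal_settings(cursor):
    """Return {name: (setting, unit)} for the WAL retention settings.

    `cursor` is assumed to be an open DB-API cursor on the source database.
    """
    placeholders = ", ".join(["%s"] * len(WAL_SETTINGS))
    cursor.execute(
        f"SELECT name, setting, unit FROM pg_settings WHERE name IN ({placeholders})",
        WAL_SETTINGS,
    )
    # Each row is (name, setting, unit); keep the unit because the raw
    # setting value is only meaningful together with it.
    return {name: (setting, unit) for name, setting, unit in cursor.fetchall()}
```

A value edited in postgresql.conf only shows up here after the server has reloaded or restarted, so this is a quick check that a change really took effect.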

I did set max_wal_size and wal_keep_size,
both to 1 GB.

<@U05JENRCF7C> is that not enough?

Also, I did not make any changes to the source table, so that I could check that the incremental sync results in 0 records inserted.

<@U05JENRCF7C> I have set all the fields you mentioned to 2 GB; still the same.

<@U05JENRCF7C> do you want me to create an issue?

Don’t ask me about creating an issue. I don’t work for Airbyte, I just contribute to the Slack community :slightly_smiling_face:
Nonetheless, it might be a good idea.

I am having this same issue and haven’t been able to find a solution. Here is a link to my thread: https://airbytehq.slack.com/archives/C01AHCD885S/p1729953140030949. I don’t believe this issue is due to wal_keep_size being too small (I currently have it set to 1 GB). I have a sample source and destination database with one table that is only 64K, which that wal_keep_size should easily handle. The status of the replication slot is inactive (I repeatedly queried it while my sync was running). It appears on my end that Airbyte might not be subscribing to the replication slot properly, but I am not sure. <@U07TB9MNW1Z> created a ticket for this issue and posted a link to it there. In this log, public.table_1 is set to incremental; there are others that are set to full.
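
For reference, the slot status described above can be polled with something like the sketch below. It assumes a DB-API cursor with `%s` placeholders (psycopg style) and Postgres 13 or later for the `wal_status` column. The helper name is hypothetical, and the slot name is whatever was configured for the connection.

```python
SLOT_QUERY = """
SELECT slot_name,
       active,
       restart_lsn::text,
       confirmed_flush_lsn::text,
       wal_status
FROM pg_replication_slots
WHERE slot_name = %s
"""

SLOT_KEYS = ("slot_name", "active", "restart_lsn", "confirmed_flush_lsn", "wal_status")


def fetch_slot_status(cursor, slot_name):
    """Return the slot's status as a dict, or None if no such slot exists.

    `cursor` is assumed to be an open DB-API cursor on the source database.
    """
    cursor.execute(SLOT_QUERY, (slot_name,))
    row = cursor.fetchone()
    if row is None:
        return None
    return dict(zip(SLOT_KEYS, row))
```

Running this before, during, and after a sync shows whether `active` flips to true while the sync runs and whether `restart_lsn` and `confirmed_flush_lsn` move forward between runs.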

Here is the link to the issue <@U07TB9MNW1Z> created to track this: https://github.com/airbytehq/airbyte/issues/47547

I rebuilt my Airbyte installation, source, destination, and connection. I dropped and recreated my PG publication and replication slot. I granted access to a single table, table_1, to the database role I am using for the connection.

table_1
- There are 6255 records in the table
- The table size is 2 MB

results
- The replication slot is now active when the sync runs.
- The restart_lsn is now advancing as expected.
- The sync is still doing a FULL OVERWRITE of the data instead of an incremental append as expected.

I attached the logs from my latest run. They should be cleaner than the earlier log I posted.