Understanding WAL growth despite clean logs and successful data replication in Airbyte

Summary

Explanation needed on why Write-Ahead Logging (WAL) continues to grow in Postgres despite successful data replication from Postgres to Snowflake via Airbyte with no errors in logs and all streams processed.


Question

Really hoping someone can help me here :slightly_smiling_face:

Can anyone explain how the WAL continues to grow despite Airbyte having a clean log and success?

Source: Postgres (running in AWS)
Destination: Snowflake
Tables: There are only 3 tables in the database publication/replication slot

There are no errors in the logs, all streams processed. I simply donโ€™t get it



This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. Click here if you want to access the original thread.

Join the conversation on Slack

["wal-growth", "postgres", "aws", "snowflake", "data-replication", "airbyte"]

<@U039V95N98V> is your problem got solved ?

Arrrggghhh - I just noticed this buried in the logs:

Property 'flush.lsn.source' is set to 'false', the LSN will not be flushed to the database source and WAL logs will not be cleared. User is expected to handle this outside Debezium.

I have this set in the source connector, but itโ€™s clearly not behaving as it should

Can anyone confirm if this is the correct approach

  1. Create a heartbeat table, and grant insert/update to cdc_user
    create table if not exists cdc_keepalive(id int primary key, ts timestamp, pos pg_lsn);
    grant insert,update to public.cdc_keepalive to cdc_user;

  2. Add that table to the publication/replication slot

  3. Add the following to Airbyte config per source

heartbeat_action_query: "insert into public.cdc_keepalive values (1,now(),pg_current_wal_lsn()) on conflict (id) do update set ts=excluded.ts,pos=excluded.pos returning id,ts,pos;"
initial_waiting_seconds: 1200
invalid_cdc_cursor_position_behavior: "Fail sync"```
I'm still not entirely sure if this addresses the error because it doesn't match the error of `Property 'flush.lsn.source' is set to 'false'` - I assume that is adjusting `lsn_commit_behaviour: "While reading Data"`?