Summary
The user is having trouble syncing data from Postgres to Snowflake with Airbyte: duplicate IDs appear in Snowflake even though a primary key is defined. The user seeks guidance on overriding the source-defined primary key for incremental loads and on maintaining data integrity without replicating the source’s partitioning scheme.
Question
I’m facing an issue with a Postgres to Snowflake sync via Airbyte.
Source: Postgres table with (ID, IS_LIVE) as primary key (used for partitioning)
Target: Want only ID as primary key in Snowflake and data loaded with distinct IDs
Current problems:
- Getting duplicate IDs in Snowflake despite PK constraint
- Can’t override primary key in Airbyte UI/config (error: “Primary key is already pre-defined”)
- Source enforces [ID, IS_LIVE] as PK
How can I:
- Override the source-defined primary key to use just ID in each incremental load? (Full refresh gives distinct IDs, but “incremental append + deduped” causes duplicate IDs.)
- Maintain data integrity without replicating Postgres’ partitioning scheme?
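To illustrate the symptom, here is a minimal Python sketch of key-based deduplication. It is not Airbyte's actual implementation: the `dedupe` helper, the sample rows, and the `UPDATED_AT` cursor column are all hypothetical. It only shows why deduplicating on (ID, IS_LIVE) can leave two rows for the same ID, while deduplicating on ID alone would not.

```python
from typing import Any

Row = dict[str, Any]


def dedupe(rows: list[Row], key: tuple[str, ...], cursor: str) -> list[Row]:
    """Keep, for each distinct key, the row with the highest cursor value."""
    latest: dict[tuple, Row] = {}
    for row in rows:
        k = tuple(row[col] for col in key)
        # Replace the stored row only if this one is at least as recent.
        if k not in latest or row[cursor] >= latest[k][cursor]:
            latest[k] = row
    return list(latest.values())


# Hypothetical sample data: ID 1 exists once per IS_LIVE value.
rows = [
    {"ID": 1, "IS_LIVE": False, "UPDATED_AT": 1},
    {"ID": 1, "IS_LIVE": True, "UPDATED_AT": 2},
    {"ID": 2, "IS_LIVE": True, "UPDATED_AT": 1},
]

# Composite key, as enforced by the source: both ID 1 rows survive.
by_composite = dedupe(rows, key=("ID", "IS_LIVE"), cursor="UPDATED_AT")

# ID only, the desired behaviour: one row per ID, the most recent one.
by_id = dedupe(rows, key=("ID",), cursor="UPDATED_AT")
```

With the composite key, ID 1 appears twice in the result, which matches the duplicates I see in Snowflake; with ID alone, only the most recent row per ID remains.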
This topic has been created from a Slack thread to give it more visibility.