Summary
Airbyte re-syncs the same rows on consecutive runs even though the saved state already holds the cursor value of the last retrieval. The issue appears to involve the cursor field ‘updated_at’ and the “Incremental | Append” sync mode. The user is asking what causes this and how to fix it.
Question
Hello everyone,
I’m using Airbyte with Snowflake as the source and Postgres as the destination.
I’m using “updated_at” as the cursor field, with the “Incremental | Append” sync mode.
For some reason Airbyte brings in the same rows again, even though it knows the timestamp it has already retrieved:
airbyte_emitted_at,updated_at
2024-02-26 00:52:50.615000 +00:00,2024-02-26T00:12:02.700
2024-02-26 00:52:50.615000 +00:00,2024-02-26T00:12:02.700
2024-02-26 00:52:50.615000 +00:00,2024-02-26T00:12:02.700
2024-02-26 00:20:07.308000 +00:00,2024-02-26T00:12:02.700
2024-02-26 00:20:07.308000 +00:00,2024-02-26T00:12:02.700
2024-02-26 00:20:07.308000 +00:00,2024-02-26T00:12:02.700
As you can see in this example, it brought in the same rows again, and the sync logs show that it already “knew” this was the timestamp of the last retrieval:
2024-02-26 00:52:50 source > INFO i.a.i.s.r.s.CursorManager(createCursorInfoForStream):169 Found matching cursor in state. Stream: MY_TABLE. Cursor Field: UPDATED_AT Value: 2024-02-26T00:12:02.700 Count: 2051
What can cause this issue and how can I fix this?
Thank you!
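The pattern above (identical updated_at values re-emitted on a later sync, with no newer rows) is what a cursor read would produce if it compared with >= rather than > at the saved boundary. The sketch below only illustrates that difference. CursorState and incremental_read are hypothetical names, not Airbyte classes or functions, and whether the Snowflake source actually issues an inclusive comparison in this case is an assumption that the log line does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CursorState:
    """Hypothetical stand-in for the saved stream state (not Airbyte's real class)."""
    value: str  # last cursor value seen, as in "Value: ..." in the log
    count: int  # rows already emitted with that value; kept for illustration, unused below


def incremental_read(rows, state, inclusive):
    """Return the rows a cursor-based read would emit for the given state.

    Timestamps are compared as strings, which is safe here only because they
    all share the same fixed-width ISO 8601 layout.
    """
    if inclusive:
        return [r for r in rows if r["updated_at"] >= state.value]
    return [r for r in rows if r["updated_at"] > state.value]


# Three source rows sharing the boundary timestamp, and nothing newer.
rows = [{"id": i, "updated_at": "2024-02-26T00:12:02.700"} for i in range(3)]
state = CursorState(value="2024-02-26T00:12:02.700", count=3)

print(len(incremental_read(rows, state, inclusive=True)))   # 3: boundary rows are re-emitted
print(len(incremental_read(rows, state, inclusive=False)))  # 0: nothing new to emit
```

If the inclusive case matches what the connector does, rows that share the boundary timestamp would repeat on every sync until a row with a newer updated_at moves the cursor forward.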
This topic has been created from a Slack thread to give it more visibility.