Airbyte is bringing the same rows again despite having the timestamp of the last retrieval. The issue seems to be related to the cursor field ‘updated_at’ and the incremental append sync. The user is looking for a solution to fix this problem.


Hello everyone,

I’m using Airbyte with Snowflake as the source and Postgres as the destination.
I’m using “updated_at” as the cursor field with the “Incremental | Append” sync mode.

For some reason, Airbyte brings in the same rows again even though it already has the timestamp of the last retrieval:

airbyte_emitted_at,updated_at
2024-02-26 00:52:50.615000 +00:00,2024-02-26T00:12:02.700
2024-02-26 00:52:50.615000 +00:00,2024-02-26T00:12:02.700
2024-02-26 00:52:50.615000 +00:00,2024-02-26T00:12:02.700
2024-02-26 00:20:07.308000 +00:00,2024-02-26T00:12:02.700
2024-02-26 00:20:07.308000 +00:00,2024-02-26T00:12:02.700
2024-02-26 00:20:07.308000 +00:00,2024-02-26T00:12:02.700

As you can see in this example, it brought the same rows again, and the sync logs show that it already “knew” this was the timestamp of the last retrieval:

2024-02-26 00:52:50 source > INFO i.a.i.s.r.s.CursorManager(createCursorInfoForStream):169 Found matching cursor in state. Stream: MY_TABLE. Cursor Field: UPDATED_AT Value: 2024-02-26T00:12:02.700 Count: 2051

What can cause this issue and how can I fix this?

Thank you!

If it helps, I also get this schema validation error (which shouldn’t affect anything):
Error messages: [$.DATE_: 2023-08-29T17:33:23 is an invalid date-time]

And when reading the data I get this count by cursor:
2024-02-26 00:55:18 source > INFO i.a.i.s.j.AbstractJdbcSource(lambda$queryTableIncremental$18):337 Table MY_TABLE cursor count: expected 2051, actual 2534

But when I look at the actual count of records it brought I get:
2024-02-26 00:55:24 destination > INFO i.a.i.d.r.InMemoryRecordBufferingStrategy(flushAllBuffers):85 Flushing MY_TABLE: 4432 records (24 MB)

What is going on?
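
One possible reading of the "cursor count: expected 2051, actual 2534" line, offered as an assumption and not as a confirmed description of the connector's internals: the saved state holds both the cursor value and how many rows had that value at the last sync. If the source now holds a different number of rows at that exact timestamp, the next query may use `>=` instead of `>` so that late-arriving rows are not missed, which would re-read the rows already synced at that timestamp. A minimal Python sketch of that idea follows. The names `choose_cursor_operator` and `rows_to_read` are hypothetical and exist only for illustration.

```python
from datetime import datetime


def choose_cursor_operator(expected_count: int, actual_count: int) -> str:
    """Hypothetical helper: pick the comparison for the next incremental query.

    Assumption: with no saved count, or a count that still matches the source,
    only rows strictly after the cursor are read. Otherwise the rows at the
    cursor value are read again.
    """
    if expected_count <= 0:
        return ">"
    return ">" if actual_count == expected_count else ">="


def rows_to_read(cursor_values, saved_cursor, operator):
    """Return the cursor values a query with the given operator would select."""
    if operator == ">=":
        return [v for v in cursor_values if v >= saved_cursor]
    return [v for v in cursor_values if v > saved_cursor]


# Cursor value taken from the log line in the post.
saved_cursor = datetime(2024, 2, 26, 0, 12, 2, 700000)
later = datetime(2024, 2, 26, 0, 30, 0)

# Illustrative source table: three rows at the cursor value, one newer row.
source = [saved_cursor, saved_cursor, saved_cursor, later]

# Counts from the log: expected 2051, actual 2534.
op = choose_cursor_operator(2051, 2534)
reread = rows_to_read(source, saved_cursor, op)
print(op, len(reread))
```

If this reading is right, every sync in which the count at the saved timestamp differs would append those rows again under "Incremental | Append", so it may be worth checking why so many rows share one `updated_at` value.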

<@U042JC23FHU> Did you also try using incremental_deduped?
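
For context on that suggestion: a deduplicating sync mode needs a primary key and keeps one row per key, so re-read rows replace earlier copies instead of piling up. The sketch below only illustrates that idea in plain Python and is not Airbyte's implementation. The `id` column and the function name `dedupe_latest` are assumptions, since the post does not show the table's primary key.

```python
def dedupe_latest(records, primary_key="id", cursor_field="updated_at",
                  emitted_field="airbyte_emitted_at"):
    """Keep one record per primary key, preferring the newest cursor value
    and, on ties, the most recently emitted copy.

    Timestamps are ISO-like strings in one consistent format, so plain string
    comparison orders them chronologically.
    """
    latest = {}
    for rec in records:
        key = rec[primary_key]
        rank = (rec[cursor_field], rec[emitted_field])
        if key not in latest:
            latest[key] = rec
            continue
        kept = latest[key]
        if rank >= (kept[cursor_field], kept[emitted_field]):
            latest[key] = rec
    return list(latest.values())


# Two syncs emitted the same three rows (timestamps from the post; the ids
# are hypothetical).
first_sync = "2024-02-26 00:20:07.308000 +00:00"
second_sync = "2024-02-26 00:52:50.615000 +00:00"
appended = [
    {"id": i, "updated_at": "2024-02-26T00:12:02.700", "airbyte_emitted_at": ts}
    for ts in (first_sync, second_sync)
    for i in (1, 2, 3)
]

deduped = dedupe_latest(appended)
print(len(appended), len(deduped))
```

Deduplication hides the repeated rows in the final table, but the source would still re-read them on each sync, so it addresses the symptom more than the cause.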