Manage multiple sources sync to the same tables

Hi Airbyte team!
A few conceptual questions :slightly_smiling_face: basically, I’m trying to send data from N sources to the same destination, in the same namespace and tables (all of my sources have the same format, it’s just the data inside that’s different, and I’m trying to gather everything in the same place).
This is what I understand, but I kinda need guidance…

  1. Overwrite can’t work, because if I use it, the last sync will erase all the previous ones, so instead of having data from my N sources I’ll only have one.
  2. Incremental, with a “created_at” cursor field might not work either, right? Because all of my sources will use the same cursor field in the destination, but if I have source A with data at t=1 and 3, and then source B with data at t=2, if A syncs before B I’ll lose B’s data?
  3. Incremental, with a “id” cursor field might work though?

Would gladly take any input you might have on this one :slightly_smiling_face: (otherwise I’ll do copies to N destinations and just use dbt to merge everything together, but I’d rather avoid it to keep my database as clean and simple as possible)

Thank you for this question @alexchouraki
Approach 2 can work because states are managed at the connection level, not at the destination level: two connections using the same destination will have a different state.

But solution 3 will probably be the cleanest, you could use a custom DBT transformation to inject a <name> as data_source, column when union-ing them together and track from which source a row comes from.
Note you can use the prefix on your tables so you dont need N schemas or N destinations if you go with solution 3

(Thank you Chris for the suggestion!)

1 Like