Determining CDC table in MSSQL for each stream in Airbyte


This user is inquiring about how each stream in Airbyte determines which CDC table in MSSQL to use, especially when there are schema changes and multiple capture instances for the same table.


Hello all :wave:

We are successfully using Airbyte to send data from MSSQL to Big Query. So firstly, thank you very much!

We are using CDC (Incremental | Append + Deduped), and have set up one connection per table.

The question I have is: How does each stream determine which CDC table in MSSQL to use?

The reason I ask is when the schema changes we “reset” the data on the Airbyte side, but we also “reset” our CDC on the MSSQL side.

This leads to potentially having multiple/varied capture_instances for each table. If I run SELECT * FROM cdc.change_tables against our DB I get (for example):

Because table3 has had a schema change (and so we deleted and replaced the old CDC table to include the new column).

This connection continues to work correctly after a reset & re-sync, but how does it know to use the new _V2 table?

What would happen if I didn't delete the old table and I had multiple capture instances for the same table (for example):
I have looked through the source code but can't find anything, the closest is the <|assertCdcSchemaQueryable>, but it doesn't actually answer my question.

Thanks in advance for any answers or direction you can offer.



This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. [Click here]( if you want to access the original thread.

[Join the conversation on Slack](

["cdc-table", "mssql", "airbyte", "schema-change", "capture-instances"]