Error with CDC in MS SQL Server to Snowflake data integration

Summary

Error related to saved offset not present on the server in CDC process. Need to understand how to mitigate this error and handle full refresh in case of database restart or failover. Question about forcing snapshot creation and clarification on the server mentioned in the error message.


Question

Hi, I am extracting data from MS SQL Server database using cdc and loading into snowflake. Every now and then, the job runs into this error. Post this, I need to reset the data and trigger the load from scratch.
io.airbyte.commons.exceptions.ConfigErrorException: Saved offset no longer present on the server. Please reset the connection, and then increase binlog retention and/or increase sync frequency.

I’m new to airbyte and trying to understand how cdc works. I could make out that this has to do with the last offset saved on sql server, which airbyte maintains as a checkpoint to start. However, isn’t it easy to lose that lsn, whenever the database is restarted or a failover happens? How can we mitigate this error/full refresh in that scenario? Is it possible to force snapshot creation after the initial snapshot, so that this lsn is updated to a recent value? I would also like to understand if the server referred to in the error message, is it the db server or airbyte server. Thanks!



This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. Click here if you want
to access the original thread.

Join the conversation on Slack

["cdc", "ms-sql-server", "snowflake", "saved-offset", "binlog-retention", "sync-frequency", "snapshot-creation", "failover"]

this is the offset info from the connection state :
"mssql_cdc_offset": {
"[\"ZoomSTG\",{\"server\":\"ZoomSTG\",\"database\":\"ZOOMSTG\"}]": "{\"transaction_id\":null,\"event_serial_no\":1,\"commit_lsn\":\"0017c0b6:0002465a:0013\",\"change_lsn\":\"0017c0b6:0002465a:0010\"}",
"[\"ZoomSTG\",{\"server\":\"ZoomSTG\",\"database\":\"ZoomSTG\"}]": "{\"commit_lsn\":\"0017c0b3:0009d2f1:0001\",\"snapshot\":true,\"snapshot_completed\":true}"
},

The snapshot commit_lsn 0017c0b3:0009d2f1:0001 is the value noted from a full refresh. This value doesn’t change until I hit the 10 day cdc retention mark in sql server and then it fails with the error that the saved offset is no longer present on the server. The other commit_lsn from incremental changes is updated and data is pulled accurately for 10 days. Is this correct behaviour?

I also noticed this error in the logs, even when the sync succeeds and pulls incremental data successfully
source > ERROR debezium-sqlserverconnector-change-event-source-coordinator i.d.p.ErrorHandler(setProducerThrowable):52 Producer failure io.debezium.DebeziumException: java.util.concurrent.ExecutionException: java.lang.InterruptedException: Interrupted while snapshotting table student The logs also show that the debezium engine was stopped for some reason before hitting this error.

The ``snapshot_completed":true}` is true in the offset file. Does it mean that the snapshot was completed successfully after failing initially. Or the snapshot offset not changing is a result of the above error that the snapshotting was interrupted?

Any pointers in this regard are appreciated. Thanks! <@U01MMSDJGC9> <@U01AB6V6NMQ>