Summary
A user reports that the MySQL CDC connector fails during initial snapshot creation after syncing 500 GB of data. They ask whether the job duration is normal, whether the sync can resume after a failure, and how to resolve a specific error about binary log files.
Question
Hi Team,
We are using Airbyte's new MySQL CDC connector to sync around 500 GB of data from the source to S3.
After the initial sync of 500 GB, it starts creating the initial snapshot (reading the CDC binlogs), but after some time it fails with this error:
```
2024-03-06 16:22:00 source > io.debezium.DebeziumException: Could not find first log file name in binary log index file Error code: 1236; SQLSTATE: HY000.
2024-03-06 16:22:00 source > at io.debezium.connector.mysql.MySqlStreamingChangeEventSource.wrap(MySqlStreamingChangeEventSource.java:1254) ~[debezium-connector-mysql-2.4.0.Final.jar:2.4.0.Final]
2024-03-06 16:22:00 source > at io.debezium.connector.mysql.MySqlStreamingChangeEventSource$ReaderThreadLifecycleListener.onCommunicationFailure(MySqlStreamingChangeEventSource.java:1299) ~[debezium-connector-mysql-2.4.0.Final.jar:2.4.0.Final]
2024-03-06 16:22:00 source > at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:1079) ~[mysql-binlog-connector-java-0.28.1.jar:0.28.1]
2024-03-06 16:22:00 source > at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:631) ~[mysql-binlog-connector-java-0.28.1.jar:0.28.1]
2024-03-06 16:22:00 source > at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:932) ~[mysql-binlog-connector-java-0.28.1.jar:0.28.1]
2024-03-06 16:22:00 source > at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
2024-03-06 16:22:00 source > Caused by: com.github.shyiko.mysql.binlog.network.ServerException: Could not find first log file name in binary log index file
2024-03-06 16:22:00 source > at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:1043) ~[mysql-binlog-connector-java-0.28.1.jar:0.28.1]
2024-03-06 16:22:00 source > ... 3 more
```
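As far as we understand, error 1236 with this message means the server could not find, in its binary log index, the binlog file the client asked to start reading from (for example because that file was purged). A minimal sketch of that check follows, assuming the list of files would come from `SHOW BINARY LOGS` on the source. The function name and the file names are made up for illustration and are not taken from our setup.

```python
def binlog_file_available(requested_file: str, server_files: list[str]) -> bool:
    """Return True if the binlog file the client wants to resume from
    is still listed in the server's binary log index."""
    return requested_file in set(server_files)


# Hypothetical file names; in practice the list would be the Log_name
# column returned by `SHOW BINARY LOGS` on the MySQL source.
server_files = ["mysql-bin.000120", "mysql-bin.000121", "mysql-bin.000122"]

# A saved offset pointing at a file that is no longer listed is the
# situation we believe leads to error 1236.
print(binlog_file_available("mysql-bin.000117", server_files))
print(binlog_file_available("mysql-bin.000121", server_files))
```

If the first call returns `False` for the file recorded in the saved offset, that would be consistent with the error above.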
We have a couple of questions:
1. Is it normal for this kind of job to run for 24 hours? (Note that we are fetching only a single large table.)
If yes, is there any way we can speed it up?
2. After the failure, the job went to attempt 2, which started again from the initial sync. I assumed it would resume from the offset; is that not the case?
3. What about the error above? I was also facing it on smaller tables on a daily basis. How can we resolve it?
Some info
```
log_bin | ON
sql_log_bin | ON
binlog_format | ROW
binlog_row_image | FULL
binlog_expire_logs_seconds | 2592000
```
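For reference, the `binlog_expire_logs_seconds` value above works out to 30 days, so time-based expiry alone would not be expected to remove binlogs during a 24-hour sync. A small sketch of the arithmetic (the helper name is just for illustration):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def retention_days(expire_seconds: int) -> float:
    """Convert a binlog_expire_logs_seconds value into days."""
    return expire_seconds / SECONDS_PER_DAY


# Value taken from the settings listed above.
print(retention_days(2592000))  # 30.0
```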
Any help would be much appreciated.
Thanks.
<br>
---
This topic has been created from a Slack thread to give it more visibility.
It will be in read-only mode here. [Click here](https://airbytehq.slack.com/archives/C021JANJ6TY/p1709797718905729) if you want to access the original thread.
[Join the conversation on Slack](https://slack.airbyte.com)
<sub>
['mysql-cdc-connector', 'data-sync', 'binary-log', 'error-handling', 'debezium']
</sub>