Here is the log for the successful initial sync: success-logs.txt (7.0 MB)
Errors/warnings in the log that may be of interest:
JSON schema validation failed
Signalling close because record’s binlog file : mysql-bin-changelog.002503 , position : 75169383 is after target file : mysql-bin-changelog.002503 , target position : 75125447
The main thread is exiting while children non-daemon threads from a connector are still active. Ideally, this should not happen
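For context on the "Signalling close" warning above: the connector stops reading once a record's (binlog file, position) pair passes the target offset captured when the snapshot started. A minimal sketch of that comparison (a hypothetical helper, not Airbyte's or Debezium's actual code):

```python
def is_after(record_file: str, record_pos: int,
             target_file: str, target_pos: int) -> bool:
    """Return True if the record's binlog offset is past the target offset.

    Binlog file names like 'mysql-bin-changelog.002503' have zero-padded
    numeric suffixes, so they order correctly under string comparison and
    a plain tuple comparison works.
    """
    return (record_file, record_pos) > (target_file, target_pos)

# The offsets from the warning above: same file, record position 75169383
# is past target position 75125447, so the connector signals close.
print(is_after("mysql-bin-changelog.002503", 75169383,
               "mysql-bin-changelog.002503", 75125447))  # True
```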
Here is the log for the first failed sync: failure-logs.txt (5.3 MB)
Errors/warnings in the log that may be of interest:
A repeat of the errors/warnings found in success-logs.txt
Even with the warnings, the initial sync is successful. However, subsequent syncs are failing. I believe there is enough CPU and RAM for this sync. The error:
@natalyjazzviolin Thank you. I just wonder why it succeeds in the initial run.
Not sure if this helps, but I've also run another test with 9 tables that are a few MB in size. The incremental syncs take much longer than the initial, historical load, which I find odd.
My cursor field should be the timemodified column. It is a timestamp, and if you mean trailing zeroes (for example, 1647509600), then yes, fields may have trailing zeroes.
Correct. I’ve selected the “Normalized tabular data”, not the “Raw data (JSON)”
I am not 100% sure, though, whether the cursor field is timemodified, as I am using CDC and don't have the option of selecting a cursor field.
Interestingly, this link explains that the source does not necessarily need a suitable cursor field:
“On the other hand, CDC incremental replication reads a log of the changes that have been made to the source database and transmits these changes to the destination. Because changes are read from a transaction log when using CDC, it is not necessary for the source data to have a suitable cursor field.”
I’ve also tested the same connection on a slightly older Airbyte 0.39.19-alpha instance, with MySQL source 0.5.11 and Snowflake destination 0.4.28, and the incremental loading works.
The incremental syncs are still failing after upgrading to Airbyte 0.40.10. Connectors are the latest versions.
The initial historical sync also succeeded after failing once.
Here is the historical sync log that failed: (/tmp/workspace/46/0) historical-0-fail.log (4.1 MB)
Here is the historical sync log that succeeded: (/tmp/workspace/46/1) historical-1-success.log (7.6 MB)
Note: I’ve removed some “Records read” logs to reduce size.
Here is the incremental sync log that fails: failure-logs.log (5.0 MB)
This looks like it is failing because of the StandardSyncInput, which contains the full catalog and the state. It is hard to estimate what the size will be for this specific connection. The state is only added on the second sync, which is why the initial sync works.
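To illustrate why the second sync can trip a size limit when the first one doesn't: the payload grows once state is attached to the catalog. Everything below (the catalog shape, the ~2 MiB limit) is an illustrative assumption, not Airbyte's actual schema:

```python
import json

# Hypothetical stand-ins for what Airbyte packs into StandardSyncInput:
# the full configured catalog plus the saved CDC state. The real objects
# carry a full JSON schema per stream, so they are far bigger than this.
catalog = {"streams": [{"name": f"table_{i}", "json_schema": {"type": "object"}}
                       for i in range(65)]}
state = {"cdc": {"file": "mysql-bin-changelog.002503", "pos": 75169383}}

payload = json.dumps({"catalog": catalog, "state": state}).encode("utf-8")

# Temporal rejects payloads above its blob size limit (commonly ~2 MiB
# by default), which a wide catalog plus state can exceed on sync two.
BLOB_LIMIT = 2 * 1024 * 1024
print(f"{len(payload)} bytes "
      f"({'over' if len(payload) > BLOB_LIMIT else 'within'} limit)")
```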
Could you try a few things? These could be fixes in the short term, but we will work on a long-term fix.
Split the tables into 2 or more connections so as not to hit the maximum message size.
Tweak the Temporal dynamicconfig and rebuild Temporal with a bigger blob size limit
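For the second suggestion, a sketch of what the dynamicconfig override might look like. The key names and values here are assumptions based on Temporal's dynamic config conventions; check them against your Temporal version before applying:

```yaml
# dynamicconfig override (illustrative values; Temporal's default error
# limit is commonly ~2 MiB). Raising these lets larger StandardSyncInput
# payloads through, at the cost of more load on Temporal's persistence.
limit.blobSize.error:
  - value: 8388608   # 8 MiB
    constraints: {}
limit.blobSize.warn:
  - value: 4194304   # 4 MiB
    constraints: {}
```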
I have about 65 tables of interest that are about 21 GB in size.
To split the tables into 2 or more connections, I selected only a few tables that add up to 1.74 GB, which I believe is small, yet the incremental sync keeps failing.
I have a critical issue. The connections are passing, but there is a huge data integrity problem.
The record counts are significantly off. Could you and the team please look into this urgently?
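One quick way to quantify a mismatch like this is to compare per-table row counts between source and destination. A hypothetical helper (gather the counts however you query MySQL and Snowflake):

```python
def diff_counts(source: dict, dest: dict) -> dict:
    """Return {table: (source_count, dest_count)} for tables whose counts differ."""
    return {
        table: (source.get(table, 0), dest.get(table, 0))
        for table in sorted(set(source) | set(dest))
        if source.get(table, 0) != dest.get(table, 0)
    }

# Illustrative counts only, not real data from this connection.
source_counts = {"users": 10432, "orders": 88210, "events": 501}
dest_counts = {"users": 10432, "orders": 87950, "events": 0}
print(diff_counts(source_counts, dest_counts))
# {'events': (501, 0), 'orders': (88210, 87950)}
```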
Thanks @alexnikitchuk! I’ve updated to the latest version, and I no longer get that error.
@natalyjazzviolin I can’t seem to replicate the previous issue, but after updating to the latest version the incremental syncs still aren’t fully reliable
I’ve found an issue for the latest error you’re encountering. It looks like it happens sporadically, just as in your case, and unfortunately the only current fix is to reset the data and start a new sync: https://github.com/airbytehq/airbyte/issues/17372
My suggestion would be to update the MySQL source to version 1 or higher, as the user in the issue is also on a 6.x version.
We stick to one issue per forum post for documentation purposes. If you’d like to discuss this further, please start a new thread!