Has the normalization process been updated? Unless I’ve done something wrong - which I can’t imagine what as this is a brand new source connected and I’m using the most recent versions - there really needs to be much better regression testing going on.
From the logs it appears to be failing before normalisation. Is it possible that this is caused by a timeout of 120 seconds in the source configuration instead of the default 300 seconds?
Hello there! You are receiving this message because none of your fellow community members has stepped in to respond to your topic post. (If you are a community member and you are reading this response, feel free to jump in if you have the answer!) As a result, the Community Assistance Team has been made aware of this topic and will be investigating and responding as quickly as possible.
Some important considerations that will help your to get your issue solved faster:
- It is best to use our topic creation template; if you haven’t yet, we recommend posting a followup with the requested information. With that information the team will be able to more quickly search for similar issues with connectors and the platform and troubleshoot more quickly your specific question or problem.
- Make sure to upload the complete log file; a common investigation roadblock is that sometimes the error for the issue happens well before the problem is surfaced to the user, and so having the tail of the log is less useful than having the whole log to scan through.
- Be as descriptive and specific as possible; when investigating it is extremely valuable to know what steps were taken to encounter the issue, what version of connector / platform / Java / Python / docker / k8s was used, etc. The more context supplied, the quicker the investigation can start on your topic and the faster we can drive towards an answer.
- We in the Community Assistance Team are glad you’ve made yourself part of our community, and we’ll do our best to answer your questions and resolve the problems as quickly as possible. Expect to hear from a specific team member as soon as possible.
Thank you for your time and attention.
The Community Assistance Team
I was looking through past threads and see that you’ve had a similar issue before:
Is this also a CDC sync?
I also found this article in the Snowflake knowledge base:
I’m looking more into this and getting input from the team, hope to have some ideas for you soon! Thanks for you patience
@natalyjazzviolin thank you so much for your prompt reply and for sharing that previous issue - I had completely forgotten about that one.
Re-reviewing the logs it appears there’s something wrong with the
source. And yes it is CDC.
I’m not sure how your snowflake community post relates to this issue as - from the logs - it appears to be an issue with the source.
Updates: I just tried the connection with 1 table instead of 4 and it succeeded. So that is 3M records (2.8gb) instead of 85M records (50gb).
It appears that MySQL CDC is unstable for large syncs.
@natalyjazzviolin actually, honestly thank you again for sharing that previous issue. This does seem to be a timeout issue with the source.
I broke the connector into separate connectors (one for each stream) and one still failed. It seems to hang until the timeout limit specified in the source configuration.
look at the time signatures
2022-10-25 10:48:36 source > Stopping the embedded engine 2022-10-25 10:48:36 source > Waiting for PT5M for connector to stop 2022-10-25 10:50:35 source > Oct 25, 2022 10:50:35 AM com.github.shyiko.mysql.binlog.BinaryLogClient$5 run 2022-10-25 10:50:35 source > INFO: Keepalive: Trying to restore lost connection to ... 2022-10-25 10:53:36 source > Stopping the task and engine 2022-10-25 10:53:36 source > Stopping down connector 2022-10-25 10:55:06 source > Coordinator didn't stop in the expected time, shutting down executor now 2022-10-25 10:56:36 source > Connection gracefully closed
I created one connection with 4 tables and it fails.
I created 4 separate connections and 3 out of 4 pass and one fails.
6.29 GB20,834,086 emitted recordsno records21m 50s
3.81 GB10,427,393 emitted records10,427,393 committed records12m 29s
2.8 GB3,890,211 emitted records3,890,211 committed records6m 11s
38.47 GB50,160,338 emitted records50,160,338 committed records1h 6m 36s
It is weird that the middle one failed i.e. not the largest
I have tried extending the timeout on the source configuration but no luck (300, 600 and 900s)
So glad the past thread was helpful! I see the GitHub comment and all your follow ups here are super helpful. I’m writing to the engineering team and hope to hear from them soon!
The destination config (snowflake credentials) were updated and the sync started working. but after a few incremental + dedupe syncs its started failing again. There are no failed queries in snowflake. It appears to be an issue with the source (or debezium).
Can someone please investigate.
To be clear: I have one MySQL source with 4 tables. Using that one connector I can sync all tables except for one. I have tried syncing one table, it successfully syncs, then i unselected it and select the other one and it fails. The connections are all the same, there are no existing SCD tables or anything.
The schema does not seem problematic.
The various tables are roughly the same size.
I can’t seem to understand what is going wrong from the logs.
Can someone pleas assist
I think it is related to debezium
Coordinator didn't stop in the expected time, shutting down executor now
that line doesn’t appear in successful syncs
After investigating the mysql RDS logs - it appears to be an issue with connecting to RDS in the VPC
IP address 'X.X.X.X' could not be resolved
Following stackoverflow MySQL warning "IP address could not be resolved" - Server Fault
Updates: that seems to be unrelated. After a few successful syncs it’s now randomly failing every subsequent incremental sync
I wonder if this issue is related
full-refresh overwrite works fine. its the incremental+dedupe that fails
@danieldiamond were you able to isolate this any further? We’ve been hitting issues on a few postgres sources where debezium fails to shut down correctly on syncs. Sometimes, it’ll work after a reset. Other times, no luck.
cdc: fix debezium shutdown interruption bug by subodh1810 · Pull Request #24166 · airbytehq/airbyte · GitHub will likely resolve the original issue