MySQL 5.7 big table ingestion fails

Hello everyone,

We are facing an issue when we try to ingest a big MySQL 5.7 table with CDC and Incremental | Append.

Here are versions we are using:

  • Airbyte Version: 0.40.3
  • Source: MySQL (0.6.9)
  • Destination: Google Cloud Storage (GCS) (0.2.11)
  • Airbyte deployed with helm chart version: 0.45.1

Our MySQL 5.7 instance is provided by GCP Cloud SQL.

Whatever big table / or destination we use (tried GCS / S3, with multiple output formats (Parquet, Avro, JSON)), it always fails at around ~20 minutes.

We tried giving way more hardware to Airbyte workers, and it always fails at the same timing (20 minutes).

Here is the stack trace we always have:

2022-09-07 09:51:49 [32mINFO[m i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):339 - Records read: 6406000 (2 GB)
2022-09-07 09:51:49 [32mINFO[m i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):339 - Records read: 6407000 (2 GB)
2022-09-07 09:51:49 [32mINFO[m i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):339 - Records read: 6408000 (2 GB)
2022-09-07 09:51:50 [32mINFO[m i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):339 - Records read: 6409000 (2 GB)
2022-09-07 09:51:50 [32mINFO[m i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):339 - Records read: 6410000 (2 GB)
2022-09-07 09:51:50 [32mINFO[m i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):339 - Records read: 6411000 (2 GB)
2022-09-07 09:51:50 [32mINFO[m i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):339 - Records read: 6412000 (2 GB)
2022-09-07 09:51:52 [36mDEBUG[m i.a.c.h.LogClientSingleton(setJobMdc):139 - Setting kube job mdc
2022-09-07 09:52:02 [36mDEBUG[m i.a.c.h.LogClientSingleton(setJobMdc):139 - Setting kube job mdc
2022-09-07 09:51:29 [44msource[0m > 2022-09-07 09:51:28 [32mINFO[m i.d.r.RelationalSnapshotChangeEventSource(createDataEventsForTable):370 - 	 Exported 6318084 of 5194022 records for table 'service_order_measures.dimensions' after 00:19:50.082
2022-09-07 09:51:40 [44msource[0m > 2022-09-07 09:51:39 [32mINFO[m i.d.r.RelationalSnapshotChangeEventSource(createDataEventsForTable):370 - 	 Exported 6373380 of 5194022 records for table 'service_order_measures.dimensions' after 00:20:00.354
2022-09-07 09:51:47 [44msource[0m > 2022-09-07 09:51:46 [32mINFO[m i.d.p.s.AbstractSnapshotChangeEventSource(execute):88 - Snapshot - Final stage
2022-09-07 09:51:47 [44msource[0m > 2022-09-07 09:51:46 [1;31mERROR[m i.d.p.ErrorHandler(setProducerThrowable):35 - Producer failure
2022-09-07 09:51:47 [44msource[0m > io.debezium.DebeziumException: java.lang.RuntimeException: java.sql.SQLException: Streaming result set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@29c051d9 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
2022-09-07 09:51:47 [44msource[0m > 	at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:85) ~[debezium-core-1.9.2.Final.jar:1.9.2.Final]
2022-09-07 09:51:47 [44msource[0m > 	at io.debezium.pipeline.ChangeEventSourceCoordinator.doSnapshot(ChangeEventSourceCoordinator.java:155) ~[debezium-core-1.9.2.Final.jar:1.9.2.Final]
2022-09-07 09:51:47 [44msource[0m > 	at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:137) ~[debezium-core-1.9.2.Final.jar:1.9.2.Final]
2022-09-07 09:51:47 [44msource[0m > 	at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:109) ~[debezium-core-1.9.2.Final.jar:1.9.2.Final]
2022-09-07 09:51:47 [44msource[0m > 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
2022-09-07 09:51:47 [44msource[0m > 	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
2022-09-07 09:51:47 [44msource[0m > 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
2022-09-07 09:51:47 [44msource[0m > 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
2022-09-07 09:51:47 [44msource[0m > 	at java.lang.Thread.run(Thread.java:833) [?:?]
2022-09-07 09:51:47 [44msource[0m > Caused by: java.lang.RuntimeException: java.sql.SQLException: Streaming result set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@29c051d9 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
2022-09-07 09:51:47 [44msource[0m > 	at io.debezium.relational.RelationalSnapshotChangeEventSource.rollbackTransaction(RelationalSnapshotChangeEventSource.java:525) ~[debezium-core-1.9.2.Final.jar:1.9.2.Final]
2022-09-07 09:51:47 [44msource[0m > 	at io.debezium.relational.RelationalSnapshotChangeEventSource.doExecute(RelationalSnapshotChangeEventSource.java:143) ~[debezium-core-1.9.2.Final.jar:1.9.2.Final]
2022-09-07 09:51:47 [44msource[0m > 	at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:76) ~[debezium-core-1.9.2.Final.jar:1.9.2.Final]
2022-09-07 09:51:47 [44msource[0m > 	... 8 more
2022-09-07 09:51:47 [44msource[0m > Caused by: java.sql.SQLException: Streaming result set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@29c051d9 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
2022-09-07 09:51:47 [44msource[0m > 	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129) ~[mysql-connector-java-8.0.28.jar:8.0.28]
2022-09-07 09:51:47 [44msource[0m > 	at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122) ~[mysql-connector-java-8.0.28.jar:8.0.28]
2022-09-07 09:51:47 [44msource[0m > 	at com.mysql.cj.jdbc.ConnectionImpl.rollbackNoChecks(ConnectionImpl.java:1933) ~[mysql-connector-java-8.0.28.jar:8.0.28]
2022-09-07 09:51:47 [44msource[0m > 	at com.mysql.cj.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:1827) ~[mysql-connector-java-8.0.28.jar:8.0.28]
2022-09-07 09:51:47 [44msource[0m > 	at io.debezium.relational.RelationalSnapshotChangeEventSource.rollbackTransaction(RelationalSnapshotChangeEventSource.java:522) ~[debezium-core-1.9.2.Final.jar:1.9.2.Final]
2022-09-07 09:51:47 [44msource[0m > 	at io.debezium.relational.RelationalSnapshotChangeEventSource.doExecute(RelationalSnapshotChangeEventSource.java:143) ~[debezium-core-1.9.2.Final.jar:1.9.2.Final]
2022-09-07 09:51:47 [44msource[0m > 	at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:76) ~[debezium-core-1.9.2.Final.jar:1.9.2.Final]
2022-09-07 09:51:47 [44msource[0m > 	... 8 more
2022-09-07 09:51:48 [44msource[0m > 2022-09-07 09:51:47 [32mINFO[m i.d.e.EmbeddedEngine(run):846 - Stopping the task and engine
2022-09-07 09:51:48 [44msource[0m > 2022-09-07 09:51:47 [32mINFO[m i.d.c.c.BaseSourceTask(stop):238 - Stopping down connector
2022-09-07 09:51:48 [44msource[0m > 2022-09-07 09:51:47 [1;31mERROR[m i.d.c.m.MySqlConnectorTask(doStop):230 - Exception while closing JDBC connection
2022-09-07 09:51:48 [44msource[0m > java.sql.SQLException: Streaming result set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@29c051d9 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
2022-09-07 09:51:48 [44msource[0m > 	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129) ~[mysql-connector-java-8.0.28.jar:8.0.28]
2022-09-07 09:51:48 [44msource[0m > 	at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122) ~[mysql-connector-java-8.0.28.jar:8.0.28]
2022-09-07 09:51:48 [44msource[0m > 	at com.mysql.cj.jdbc.ConnectionImpl.rollbackNoChecks(ConnectionImpl.java:1933) ~[mysql-connector-java-8.0.28.jar:8.0.28]
2022-09-07 09:51:48 [44msource[0m > 	at com.mysql.cj.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:1827) ~[mysql-connector-java-8.0.28.jar:8.0.28]
2022-09-07 09:51:48 [44msource[0m > 	at com.mysql.cj.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:1695) ~[mysql-connector-java-8.0.28.jar:8.0.28]
2022-09-07 09:51:48 [44msource[0m > 	at com.mysql.cj.jdbc.ConnectionImpl.close(ConnectionImpl.java:714) ~[mysql-connector-java-8.0.28.jar:8.0.28]
2022-09-07 09:51:48 [44msource[0m > 	at io.debezium.jdbc.JdbcConnection.lambda$doClose$3(JdbcConnection.java:955) ~[debezium-core-1.9.2.Final.jar:1.9.2.Final]
2022-09-07 09:51:48 [44msource[0m > 	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
2022-09-07 09:51:48 [44msource[0m > 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
2022-09-07 09:51:48 [44msource[0m > 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
2022-09-07 09:51:48 [44msource[0m > 	at java.lang.Thread.run(Thread.java:833) [?:?]

We also checked that MySQL variables are well configured and they’re set to ~40 hours (interactive_timeout and wait_timeout ).

Do you have any idea what could be wrong with our set up or how could investigate furthermore please ?

Have a good day everyone !

Hello there! You are receiving this message because none of your fellow community members has stepped in to respond to your topic post. (If you are a community member and you are reading this response, feel free to jump in if you have the answer!) As a result, the Community Assistance Team has been made aware of this topic and will be investigating and responding as quickly as possible.
Some important considerations that will help your to get your issue solved faster:

  • It is best to use our topic creation template; if you haven’t yet, we recommend posting a followup with the requested information. With that information the team will be able to more quickly search for similar issues with connectors and the platform and troubleshoot more quickly your specific question or problem.
  • Make sure to upload the complete log file; a common investigation roadblock is that sometimes the error for the issue happens well before the problem is surfaced to the user, and so having the tail of the log is less useful than having the whole log to scan through.
  • Be as descriptive and specific as possible; when investigating it is extremely valuable to know what steps were taken to encounter the issue, what version of connector / platform / Java / Python / docker / k8s was used, etc. The more context supplied, the quicker the investigation can start on your topic and the faster we can drive towards an answer.
  • We in the Community Assistance Team are glad you’ve made yourself part of our community, and we’ll do our best to answer your questions and resolve the problems as quickly as possible. Expect to hear from a specific team member as soon as possible.

Thank you for your time and attention.
Best,
The Community Assistance Team

Hey could you share the complete log file of the sync here

This sounds like github issue #9784. I confirmed the issue occurs only when tables with millions of records are included.

May be we can continue the conversation there

Hi Team, Any update on this?