Failing to ingest a "big" MySQL table (38 GB)

  • Is this your first time deploying Airbyte?: No
  • Memory / Disk / CPU: 16 GB RAM / 25 GB disk / 6 vCPU
  • Deployment: Docker on GCP VM
  • Airbyte Version: 0.38.4-alpha
  • Source name/version: MySQL 0.5.11
  • Destination name/version: BigQuery 1.1.6
  • Step: The issue is happening during sync
  • Description: Sync of a single large MySQL table (150 million rows, ~38 GB) to BigQuery fails partway through.

Hello everyone,

So I’m trying to ingest just one table of a database, the biggest one, with 150 million rows and a size of 38 GB.

The sync goes well for about an hour and ingests around 25 million records, but then it fails with the following error:

2022-05-22 15:53:35 ERROR i.a.w.DefaultReplicationWorker(run):174 - Sync worker failed.
java.util.concurrent.ExecutionException: io.airbyte.workers.DefaultReplicationWorker$SourceException: Source cannot be stopped!

Do you have any advice for handling this amount of data?

This is not the first time I’ve struggled to ingest multi-GB Postgres and MySQL tables, and I wonder whether Airbyte is really able to load ‘big’ tables (and I’m not talking about Big Data) or whether my setup is simply not suitable for this.

Thanks a lot for your help!

logs-4987.txt (2.2 MB)

Sorry to hear that, Mehdi. I saw you’re using CDC to sync data. As a test, would it be possible to disable CDC and use the standard method with a full refresh? This would help us understand whether the problem is with CDC itself or with the table size.

Hi Marco, thanks for answering.

I’ve switched to the standard method, erased the target dataset, and launched a new sync.

But the job only lasted 43 seconds before failing with the “Source cannot be stopped” error…
logs-5139.txt (74.1 KB)

Update: I’ve created a new connection. So far it’s still running after 40 million rows ingested. Fingers crossed.

The job failed again after 3h49, with almost 68 million rows ingested.

Here are the logs (only the last 20k lines) for investigation:
tail_20k_logs-5146.txt (2.9 MB)

Thanks

From your latest logs:

2022-05-25 13:00:38 INFO i.a.w.DefaultReplicationWorker(lambda$getReplicationRunnable$6):331 - Records read: 67996000 (14 GB)
2022-05-25 13:00:38 source > Exception in thread "main" java.lang.RuntimeException: java.sql.SQLException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.
2022-05-25 13:00:38 source > at io.airbyte.db.jdbc.StreamingJdbcDatabase$1.tryAdvance(StreamingJdbcDatabase.java:100)
2022-05-25 13:00:38 source > at java.base/java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
2022-05-25 13:00:38 source > at io.airbyte.commons.util.DefaultAutoCloseableIterator.computeNext(DefaultAutoCloseableIterator.java:38)
2022-05-25 13:00:38 source > at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:146)
2022-05-25 13:00:38 source > at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:141)
2022-05-25 13:00:38 source > at io.airbyte.commons.util.LazyAutoCloseableIterator.computeNext(LazyAutoCloseableIterator.java:42)
2022-05-25 13:00:38 source > at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:146)

Looks like the connection timed out. Are you using MySQL on AWS Aurora? Can you check the timeout settings in your database configuration?
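
If it helps, here is a minimal sketch of how you could check the usual server-side timeouts over JDBC (connection details are placeholders, and it assumes the MySQL Connector/J driver is on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CheckMySqlTimeouts {
  public static void main(String[] args) throws Exception {
    // Placeholders: point this at the instance Airbyte reads from.
    try (Connection conn = DriverManager.getConnection(
            "jdbc:mysql://<host>:3306/<database>", "<user>", "<password>");
         Statement stmt = conn.createStatement();
         // wait_timeout and net_write_timeout are the usual suspects when a
         // long-running streaming read drops mid-sync.
         ResultSet rs = stmt.executeQuery(
             "SHOW VARIABLES WHERE Variable_name IN "
                 + "('wait_timeout', 'interactive_timeout', 'net_read_timeout', 'net_write_timeout')")) {
      while (rs.next()) {
        System.out.println(rs.getString("Variable_name") + " = " + rs.getString("Value"));
      }
    }
  }
}

You can also run the same SHOW VARIABLES query directly from any MySQL client if that is easier.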

It is a MySQL instance on Google Cloud SQL, but it could indeed be a timeout issue. I’ll check and let you know.

I still don’t have news about the timeout. I’ve found out that the Airbyte job might have been blocked by the cloud team, as they were worried about the high volume being requested. I’ll let you know as soon as I get more information.

Ok Mehdi, let me know when you have any updates here.

We’re experiencing the same issue with a large table from MySQL to Snowflake. Any updates?

@marcosmarxm this is most likely a CDC issue. I don’t think any Airbyte user has ever been able to successfully sync a large table from MySQL using CDC. I know the issue of hanging workers was resolved previously, but now that is leading to the error above.

Airbyte Version: 0.39.10-alpha
Source: MySQL (0.5.11)
Destination: Snowflake (0.4.28)

57.77 GB | 127,417,153 emitted records | no records | 5h 0m 42s

Added my logs here: Failing to ingest a "big" MySQL table

Hello everyone. After talking with the team in charge of the databases, it seems there is no timeout set.

I tried a new ingestion a few days ago. It went further than the last run but failed again after 102 million rows (21 GB).

This time the error seems different:

2022-06-04 13:34:32 ERROR i.a.w.DefaultReplicationWorker(run):174 - Sync worker failed.

java.util.concurrent.ExecutionException: io.airbyte.workers.DefaultReplicationWorker$DestinationException: Destination process message delivery failed

Here are the last 1,000 lines of logs:

tail_1k_logs-5553.txt (136.1 KB)

Mehdi: your logs contain the error Suppressed: io.airbyte.workers.WorkerException: Source process exit with code 137. This warning is normal if the job was cancelled.
Exit code 137 means the source process was killed because your instance is hitting OOM; please give more memory/CPU to your Airbyte instance.
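
If you are on the Docker deployment, I believe (going from memory of the Airbyte scaling docs, so please double-check the exact variable names and whether they apply to your version) the job containers’ resources can also be raised via the .env file rather than only resizing the VM, for example:

JOB_MAIN_CONTAINER_MEMORY_REQUEST=2g
JOB_MAIN_CONTAINER_MEMORY_LIMIT=8g

Otherwise, scaling the VM’s RAM and watching dmesg for OOM kills during the sync is a reasonable first step.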

Hello Marcos.

My VM currently has 16 GB of RAM; should I scale it up to 32 GB?

And do you have any documentation or advice on correctly sizing the VM hosting Airbyte?

Could some of these initial snapshot issues be down to the values in use for the following Debezium config variables:

offset.flush.timeout.ms
offset.flush.interval.ms
max.request.size

max.queue.size
max.batch.size

I’m not sure how or where these are being set in Airbyte’s use of Debezium, but if default values are being used that might be the issue, as the defaults aren’t tuned for large snapshots in my understanding.

Ref: apache kafka - Debezium is failing to snapshot big table size - Stack Overflow

Yemi, here are the variables for Debezium:
https://github.com/airbytehq/airbyte/blob/bb469418679e1605689d994a57cf59b1095c677d/airbyte-integrations/bases/debezium/src/main/java/io/airbyte/integrations/debezium/internals/DebeziumRecordPublisher.java#L122-L158

Any comments or information to help us are welcome.

Caveat: I’m no Debezium expert. :slight_smile:

The Stack Overflow answer suggests:

offset.flush.timeout.ms=60000
offset.flush.interval.ms=10000
max.request.size=10485760

“Decreasing the offset flush interval allows Kafka connect to flush the offsets more frequently, and setting a large timeout gives it more time to get an acknowledgement for the commit.”

max.queue.size = 81290
max.batch.size = 20480

“The default size for queue is 8192, which is quite low for a larger database. Bumping these configurations up helped a lot.”

Another response suggests setting offset.flush.interval.ms greater than offset.flush.timeout.ms to prevent an “Invalid call to OffsetStorageWriter flush() while already flushing…” error.

I don’t have time to test this with Airbyte at the moment, but I would suggest that anyone suffering issues with large snapshots rebuild Airbyte with the above changes and test; a rough sketch of the change follows.
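
For illustration only, here is a rough, untested sketch of the overrides discussed above, written as a plain java.util.Properties helper (where and how these get wired into Airbyte’s DebeziumRecordPublisher is an assumption on my part, not something I’ve verified):

import java.util.Properties;

public class DebeziumSnapshotTuning {

  // Applies the Debezium/Kafka Connect overrides suggested in the Stack Overflow answer.
  // The keys are standard Kafka Connect / Debezium properties; the values are the ones
  // quoted above, not values I have benchmarked myself.
  public static Properties withLargeSnapshotOverrides(Properties props) {
    props.setProperty("offset.flush.timeout.ms", "60000");
    props.setProperty("offset.flush.interval.ms", "10000");
    props.setProperty("max.request.size", "10485760");
    // Default queue size (8192) is low for a 150M-row snapshot.
    props.setProperty("max.queue.size", "81290");
    props.setProperty("max.batch.size", "20480");
    // Note: one reply above suggests keeping offset.flush.interval.ms above
    // offset.flush.timeout.ms to avoid the "already flushing" error, which
    // conflicts with the first two values; worth testing both ways.
    return props;
  }
}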

Debezium also offers an incremental snapshot mode that lets you specify how many rows go into each chunk, which might be worth investigating: Debezium connector for MySQL :: Debezium Documentation.
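
For reference, if I’m reading the Debezium docs correctly, the chunk size for incremental snapshots is controlled by the connector property below (default 1024 rows), though incremental snapshots also need a signalling mechanism to be triggered, so it wouldn’t be a drop-in change for Airbyte:

incremental.snapshot.chunk.size = 10240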

(schedule meeting with Daniel next week to test changing values in Debezium parameters)

I published a custom MySQL source connector with the updated Debezium configuration.
It failed due to a normalization issue. Posting the issue here: MySQL to Snowflake CDC Sync Fails: Race condition, raw table not found but still being written to snowflake

Some interesting things to note:

  1. This normalisation issue occurred last week when I was trying to get CDC full refresh to work, so I don’t believe it’s related to the updated Debezium spec (but it could be).
  2. I’m seeing almost an hour’s delay between the source completing and the next step:
2022-06-14 03:12:10 source > 2022-06-14 03:12:10 INFO i.a.i.s.r.AbstractDbSource(lambda$read$2):132 - Closed database connection pool.
2022-06-14 03:12:10 source > 2022-06-14 03:12:10 INFO i.a.i.b.IntegrationRunner(runInternal):171 - Completed integration: io.airbyte.integrations.base.ssh.SshWrappedSource
2022-06-14 03:12:10 source > 2022-06-14 03:12:10 INFO i.a.i.s.m.MySqlSource(main):213 - completed source: class io.airbyte.integrations.source.mysql.MySqlSource
2022-06-14 04:22:10 source > 2022-06-14 04:22:10 ERROR i.a.i.b.IntegrationRunner(lambda$watchForOrphanThreads$8):266 - Failed to interrupt children non-daemon threads, forcefully exiting NOW...
  3. That active non-daemon thread error (which I know isn’t an issue) doesn’t show up with my other connectors.