Optimizing initial data load performance in Airbyte OSS from MySQL to Postgres


Hi all,
I’m trying to determine whether the performance we are seeing is typical or whether I need to investigate.
We have set up Airbyte OSS on an EC2 instance with 16 vCPU / 128 GB, with the intention of syncing data from MySQL to Postgres.
We have set up some initial tests using incremental append + dedupe. We are not using CDC.
The issue we are seeing is that after the initial load of data from MySQL into the raw tables in the airbyte_internal schema, Airbyte then tries to insert-select the data, expanding the _airbyte_data jsonb into the target schema table, in one single enormous transaction. On first load, for a table with 8m rows, that insert has been running for over 2h30m.
I have several dozen tables to sync, some with 80m+ rows.
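
For a sense of scale, here is a rough back-of-envelope sketch of what those numbers imply. It assumes throughput scales linearly with row count, which is only an assumption, and since the 8m-row insert had not finished at 2h30m, the rate below is an upper bound rather than a measurement. The function names are just for illustration.

```python
def rows_per_second(rows: int, elapsed_seconds: float) -> float:
    """Average throughput if `rows` rows had completed in `elapsed_seconds`."""
    return rows / elapsed_seconds


def projected_hours(rows: int, rate: float) -> float:
    """Hours needed to process `rows` rows at `rate` rows per second."""
    return rows / rate / 3600


# 8m rows, insert still running after 2h30m, so this is a best-case rate.
elapsed_seconds = 2.5 * 3600
best_case_rate = rows_per_second(8_000_000, elapsed_seconds)

# Assumption: an 80m-row table runs at the same rate (linear scaling).
hours_for_80m = projected_hours(80_000_000, best_case_rate)

print(f"best-case rate: {best_case_rate:.0f} rows/s")
print(f"80m-row table at that rate: at least {hours_for_80m:.1f} h")
```

So even in the best case, a single 80m-row table would take about a day at this rate, which is why I think something is off.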
How can I optimize the initial load for speed?

