Improving Airbyte Replication Performance

Summary

A user is investigating how to speed up Airbyte replications by changing the thread cap in the Airbyte CDK for Java-based connectors. Raising the limit from 2 to 5 made replications faster, and the user is asking what the implications of this change are and why the default is set to 2.


Question

Hi Airbyte community! I have been investigating how the performance of Airbyte replications can be improved, and I stumbled across the following override in the Airbyte CDK for Java-based connectors: https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/java/airbyte-cdk/db-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/jdbc/JdbcBufferedConsumerFactory.kt#L102

Essentially, the number of threads created to write into the destination is capped at 2. I tested increasing this value to 5 and saw the number of worker threads increase and, as a consequence, the replications run faster.

I was wondering, though, whether this change has any implications, how it was decided to set the cap to 2 rather than something higher, and why it is not a configurable parameter.
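
To make the "configurable parameter" idea concrete, here is a minimal sketch of what I have in mind. It is not the CDK's actual code: the class name WriterPoolSketch, the environment variable DESTINATION_WRITER_THREADS, and the upper bound of 8 are all hypothetical and chosen only for illustration. The default of 2 mirrors the current cap.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not Airbyte CDK code.
class WriterPoolSketch {
    // Mirrors the cap of 2 discussed above.
    static final int DEFAULT_THREADS = 2;
    // Assumed safety bound so a bad setting cannot overload the destination.
    static final int MAX_THREADS = 8;

    // Parses the raw setting and clamps it to [1, MAX_THREADS].
    // Falls back to the default when the value is missing or not a number.
    static int resolveThreadCount(String rawValue) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            return DEFAULT_THREADS;
        }
        try {
            int requested = Integer.parseInt(rawValue.trim());
            return Math.max(1, Math.min(requested, MAX_THREADS));
        } catch (NumberFormatException e) {
            return DEFAULT_THREADS;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // DESTINATION_WRITER_THREADS is a made-up variable name.
        int threads = resolveThreadCount(System.getenv("DESTINATION_WRITER_THREADS"));
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        // Stand-in for flushing buffered batches to the destination.
        AtomicInteger flushed = new AtomicInteger();
        for (int i = 0; i < 10; i++) {
            pool.submit(() -> { flushed.incrementAndGet(); });
        }

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("threads=" + threads + " batchesFlushed=" + flushed.get());
    }
}
```

My assumption is that each extra worker means more concurrent writes against the destination, and probably more buffered data held in memory, which is why the sketch clamps the value instead of accepting any number.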

Does anyone have an idea or any insights on this?



This topic has been created from a Slack thread to give it more visibility.

['airbyte', 'replication-performance', 'java-cdk', 'thread-cap', 'jdbc']