Improving Airbyte replication performance by adjusting thread count in CDK for Java-based connectors

Summary

The user is inquiring about the implications and decision-making process behind the default thread count limit of 2 in the Airbyte CDK for Java-based connectors, and whether it can be made a configurable parameter.


Question

Hi Airbyte community! I have been looking into how the performance of Airbyte replications can be improved, and I stumbled across the following override in the Airbyte CDK for Java-based connectors: https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/java/airbyte-cdk/db-destinations/src/main/kotlin/io/airbyte/cdk/integrations/destination/jdbc/JdbcBufferedConsumerFactory.kt#L102

Essentially, the number of threads created to write into the destination is capped at 2. I tested increasing this value to 5 and saw the number of worker threads increase and, as a consequence, the replications run faster.

I was wondering, though, whether raising it has any implications, how it was decided to set it to 2 rather than something higher, and why it is not a configurable parameter.
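To make the "configurable parameter" idea concrete, here is a minimal sketch of what that could look like, assuming the cap is simply the size of a fixed thread pool. It is not taken from the Airbyte CDK: the class name, the `DESTINATION_WRITER_THREADS` environment variable, and the upper bound of 16 are all hypothetical and chosen only for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

// Hypothetical sketch; none of these names exist in the Airbyte CDK.
final class WriterPoolSketch {
    // Mirrors the cap of 2 described in the question.
    static final int DEFAULT_THREADS = 2;
    // Assumed safety bound so a typo cannot open hundreds of connections.
    static final int MAX_THREADS = 16;

    // Parses the requested thread count, falling back to the default
    // when the value is missing or not a number, and clamping it to
    // the range [1, MAX_THREADS].
    static int resolveThreadCount(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_THREADS;
        }
        try {
            int requested = Integer.parseInt(raw.trim());
            return Math.max(1, Math.min(requested, MAX_THREADS));
        } catch (NumberFormatException e) {
            return DEFAULT_THREADS;
        }
    }

    // Builds the worker pool with the resolved size instead of a literal 2.
    static ExecutorService newWriterPool(String raw) {
        return Executors.newFixedThreadPool(resolveThreadCount(raw));
    }

    public static void main(String[] args) {
        // Hypothetical environment variable name.
        String raw = System.getenv("DESTINATION_WRITER_THREADS");
        ExecutorService pool = newWriterPool(raw);
        int size = ((ThreadPoolExecutor) pool).getCorePoolSize();
        System.out.println("writer threads: " + size);
        pool.shutdown();
    }
}
```

In a setup like this, each extra worker would typically mean one more concurrent write against the destination and more buffered data in flight, so connection limits and memory are the first things to watch when raising the value.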

Does anyone have an idea or any insights on this? :slightly_smiling_face:



This topic has been created from a Slack thread to give it more visibility.

["airbyte", "replication", "performance", "cdk", "java-based-connectors", "thread-count", "jdbc-buffered-consumer-factory"]