Optimizing data sync performance in Airbyte for API-based transformers like Stripe

Summary

When syncing data with API-based transformers like Stripe in Airbyte, the process slows down as more data and streams are involved. Is there a setting to speed up data turnover or prevent an endless sync? The poster is concerned that accumulating logs may be affecting performance and that workers keep polling streams that have already been synced.


Question

I’m hoping someone on the Airbyte or Community side can answer this. When we hook up an API-based transformer like Stripe and start syncing data, and there are a lot of data and streams to sync, the first sync run tries to pull all data for all streams, but doesn’t turn any of the streams over until all of the streams’ data has been received. As the process runs longer and longer, it seems to keep slowing down. Is there a setting to force Airbyte to turn data over faster, or to keep the stream sync from running endlessly? I don’t know if it’s the logs from the long sync accumulating and slowing things down or what, but it definitely appears slower on a per-stream and per-record basis. The other thing the polling workers do is keep checking each stream for work, even after a stream has been synced and is waiting in the queue.



This topic has been created from a Slack thread to give it more visibility.

["optimization", "data-sync", "API-transformer", "Stripe", "performance", "logs", "polling", "sync"]

Example of a sync job that has been running for more than 13 hours with no end in sight.

<@U07R329BBH6> Have you seen anything like this with your connectors?