Clearing storage area in destination increasingly slow on long-lived Airbyte instances

  • Is this your first time deploying Airbyte?: No
  • OS Version / Instance: Oracle Linux 8 Ec2 Instance
  • Memory / Disk: 32GB / 100GB
  • Deployment: Docker
  • Airbyte Version: 0.40.22
  • Source name/version: Postgres 1.0.33
  • Destination name/version: S3 0.3.17
  • Step: The issue is happening upon re-adding a connection and performing the reset of streams for a connection
  • Description:

I occasionally need to delete and recreate a number of connections, which triggers a new initial sync of data as desired. However, when I set up a connection on an Airbyte EC2 instance that has been running for some time, the ‘Clearing storage area in destination’ step takes longer and longer to complete during the stream reset. I’ve seen it take over 4 hours for one database with around 666 tables, yet after I rebuilt the Airbyte environment by destroying the EC2 instance and reconfiguring it via Ansible, the same step took less than a minute.

I’ve tried manually ensuring the destination bucket is empty so that this step does not have to clean up any data in the destination. However, this made no difference, and I believe this step relates to the local data/preparation.

Here is an excerpt from the run that took over 4 hours. Note how it attempts to clear the destination many times over for each stream, and the time between attempts:

2023-01-03 09:47:40 destination > class io.airbyte.integrations.destination.buffered_stream_consumer.BufferedStreamConsumer started.
2023-01-03 09:47:40 destination > Preparing bucket in destination started for 668 streams
2023-01-03 09:47:40 destination > Clearing storage area in destination started for namespace db4-chandos stream daysofweek bucketObject / pathFormat //${NAMESPACE}/${STREAM_NAME}/${YEAR}_${MONTH}_${DAY}_${EPOCH}_
2023-01-03 09:47:42 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:47:44 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:47:45 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:47:47 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:47:48 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:47:50 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:47:52 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:47:53 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:47:55 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:47:56 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:47:58 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:47:59 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:01 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:02 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:04 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:06 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:07 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:09 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:09 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:09 destination > Clearing storage area in destination completed for namespace db4-chandos stream daysofweek bucketObject /
2023-01-03 09:48:09 destination > Clearing storage area in destination started for namespace db4-chandos stream jacs bucketObject / pathFormat //${NAMESPACE}/${STREAM_NAME}/${YEAR}_${MONTH}_${DAY}_${EPOCH}_
2023-01-03 09:48:11 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:12 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:14 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:16 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:17 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:19 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:20 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:22 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:23 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:25 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:26 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:28 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:30 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:31 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:33 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:34 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:36 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:37 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
2023-01-03 09:48:38 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/jacs/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...
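
For anyone wanting to quantify this from a full log, here is a minimal Python sketch that measures how long the clearing step takes per stream and counts the repeated "cleaned-up" messages. The function names (`summarize`, `estimate_total_hours`) and the regular expressions are my own, written against the line format shown in the excerpt; they are not part of Airbyte. The sample lines are abbreviated from the excerpt (only two of the repeated cleanup messages are kept).

```python
import re
from datetime import datetime

# Patterns written against the log format in the excerpt (an assumption:
# other Airbyte versions may word these messages differently).
STARTED = re.compile(
    r"^(\S+ \S+) destination > Clearing storage area in destination "
    r"started for namespace (\S+) stream (\S+)"
)
COMPLETED = re.compile(
    r"^(\S+ \S+) destination > Clearing storage area in destination "
    r"completed for namespace (\S+) stream (\S+)"
)
CLEANED = re.compile(r"^\S+ \S+ destination > Storage bucket .* has been cleaned-up")


def parse_ts(text):
    """Parse the 'YYYY-MM-DD HH:MM:SS' timestamp at the start of a line."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


def summarize(lines):
    """Return {stream: (seconds_elapsed, cleanup_message_count)}."""
    results = {}
    current = None  # (stream name, start time) of the stream being cleared
    count = 0
    for line in lines:
        m = STARTED.match(line)
        if m:
            current = (m.group(3), parse_ts(m.group(1)))
            count = 0
            continue
        if current and CLEANED.match(line):
            count += 1
            continue
        m = COMPLETED.match(line)
        if m and current and m.group(3) == current[0]:
            elapsed = (parse_ts(m.group(1)) - current[1]).total_seconds()
            results[current[0]] = (elapsed, count)
            current = None
    return results


def estimate_total_hours(seconds_per_stream, stream_count):
    """Rough extrapolation, assuming every stream takes the same time."""
    return seconds_per_stream * stream_count / 3600


# Abbreviated sample: start line, two cleanup lines, completion line.
SAMPLE = [
    "2023-01-03 09:47:40 destination > Clearing storage area in destination started for namespace db4-chandos stream daysofweek bucketObject / pathFormat //${NAMESPACE}/${STREAM_NAME}/${YEAR}_${MONTH}_${DAY}_${EPOCH}_",
    "2023-01-03 09:47:42 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...",
    "2023-01-03 09:47:44 destination > Storage bucket / has been cleaned-up (0 objects matching /db4-chandos/daysofweek/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]+_.* were deleted)...",
    "2023-01-03 09:48:09 destination > Clearing storage area in destination completed for namespace db4-chandos stream daysofweek bucketObject /",
]

if __name__ == "__main__":
    per_stream = summarize(SAMPLE)
    print(per_stream)
    seconds, _ = per_stream["daysofweek"]
    print(round(estimate_total_hours(seconds, 668), 2), "hours (rough estimate)")
```

In the excerpt, the daysofweek stream takes 29 seconds (09:47:40 to 09:48:09) even though zero objects are deleted. If every one of the 668 streams took about that long, which the excerpt alone does not confirm, the whole step would take roughly 5.4 hours, in line with the "over 4 hours" observed.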

Hi @Billy_Hudson,

Thanks for bringing this to our attention! I know we’re trying to speed up all our processes right now, though I haven’t heard of problems with this particular one. Could you please open a GitHub issue for this with your findings? I believe that would be the best way to prioritize a fix for this.

I’ve created a GitHub issue for this here: