Deploy EC2: Slow runtime on t3.micro

slunia · August 22, 2022, 1:25pm

Hi, we tried installing Airbyte on an EC2 instance (t3.micro), following the documentation. The installation goes smoothly, and for the first 5 minutes we are able to add sources and destinations. After that, however, everything becomes extremely slow and freezes. We can't even SSH into the instance and need to restart it, after which the same issue repeats after 5 minutes. Any suggestions or advice on where we are going wrong? The same setup worked perfectly on Docker Desktop on my Mac.

When I monitor CPU usage, it stays below 50%.

Is there a minimum instance type requirement, without which this won't work?

sajarin · August 24, 2022, 6:48am

Hi @slunia, a t3.micro is not recommended. According to our documentation, we recommend a t2.medium for testing or a t2.large for production.

https://docs.airbyte.com/deploying-airbyte/on-aws-ec2/#:~:text=For%20testing%20out%20Airbyte%2C%20a%20t2.medium%20instance%20is%20likely%20sufficient.
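For anyone debugging a similar freeze: CPU staying below 50% while SSH stops responding is consistent with the instance running out of memory rather than CPU. A t3.micro has 1 GiB of RAM, compared with 4 GiB on a t2.medium and 8 GiB on a t2.large. Below is a minimal Python sketch for checking available memory on a Linux host. It assumes the standard `/proc/meminfo` format, and the function names are made up for illustration. It is not part of Airbyte.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (KiB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_report(text):
    """Summarize total and available memory from /proc/meminfo text."""
    info = parse_meminfo(text)
    total = info["MemTotal"]
    available = info["MemAvailable"]
    return {
        "total_mib": total // 1024,
        "available_mib": available // 1024,
        "available_pct": round(100 * available / total, 1),
    }
```

On the instance itself, something like `memory_report(open("/proc/meminfo").read())` run shortly after startup and again a few minutes later would show whether available memory is collapsing before the freeze.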

lucaswiley · August 24, 2022, 11:19pm

I am on a t2.xlarge at 50% CPU and having the same issues.

I am running an incremental+deduped backfill for 24 hours on Postgres > Snowflake as well. I tried tweaking the workers in .env, but that hasn’t seemed to help.

Any ideas on what else I can do to speed up that runtime?

lucaswiley · August 25, 2022, 3:32am

I read the issue about FetchSize ("Investigate the performance bottleneck of source database connectors", Issue #12532 in airbytehq/airbyte on GitHub). It seems that if FetchSize defaulted to being dynamic but could be overridden with a set value, it might help.

I don’t know the tradeoffs; however, it looks like these long-running (now failed) streams are spending most of their time doing:
2022-08-25 03:22:24 INFO i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):329 - Records read: 6909000 (2 GB)
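To put a number on how slow the read side is, two of these progress lines can be compared. This is a rough Python sketch that assumes every progress line has the same shape as the one above. The `EARLIER` line is invented for illustration (only the `LATER` line comes from my logs), and the function names are hypothetical.

```python
import re
from datetime import datetime

# Matches the leading timestamp and the running record count.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*Records read: (\d+)"
)

# Hypothetical earlier line, made up for this example.
EARLIER = (
    "2022-08-25 03:21:24 INFO i.a.w.g.DefaultReplicationWorker"
    "(lambda$getReplicationRunnable$6):329 - Records read: 6903000 (2 GB)"
)
# Actual line from the sync log quoted above.
LATER = (
    "2022-08-25 03:22:24 INFO i.a.w.g.DefaultReplicationWorker"
    "(lambda$getReplicationRunnable$6):329 - Records read: 6909000 (2 GB)"
)


def parse_progress(line):
    """Return (timestamp, records_read), or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return timestamp, int(match.group(2))


def records_per_second(earlier_line, later_line):
    """Average read rate between two progress lines."""
    earlier = parse_progress(earlier_line)
    later = parse_progress(later_line)
    if earlier is None or later is None:
        raise ValueError("both lines must be 'Records read' progress lines")
    elapsed = (later[0] - earlier[0]).total_seconds()
    if elapsed <= 0:
        raise ValueError("later_line must have a later timestamp")
    return (later[1] - earlier[1]) / elapsed
```

Calling `records_per_second` on lines taken before and after a config change (for example a different worker setting) gives a like-for-like comparison instead of eyeballing the log.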

lucaswiley · September 7, 2022, 5:08pm

As a follow-up I have noticed that per Airbyte’s recommendation, parallelizing your connections to one connection per source table helps with overall performance (even fetching), though it does utilize more CPU.
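As a sketch of what "one connection per source table" means in practice, here is a small Python helper that turns a table list into one plan entry per table. The dictionary layout, the function name, and the table names in the usage note are all hypothetical. This is only a planning aid and does not call any Airbyte API; each connection would still need to be created in Airbyte separately.

```python
def split_into_connections(source_name, destination_name, tables):
    """Return one plan entry per table, dropping duplicates but keeping order."""
    seen = set()
    plans = []
    for table in tables:
        if table in seen:
            continue
        seen.add(table)
        plans.append(
            {
                # Human-readable label for the connection (hypothetical format).
                "name": f"{source_name} -> {destination_name} [{table}]",
                # Exactly one stream per connection.
                "streams": [table],
                # Free-text label, not an Airbyte API field.
                "mode": "incremental+deduped",
            }
        )
    return plans
```

For example, `split_into_connections("Postgres", "Snowflake", ["orders", "customers"])` yields two entries, one per table, which can then be scheduled independently. The tradeoff is the one noted above: more connections running at once means more CPU in use.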

sajarin · September 22, 2022, 7:08pm

Hey @lucaswiley,

Apologies for the delay, and thanks for following up with what worked for you. We’re definitely still trying to improve our database connectors. Our general recommendation is to parallelize large syncs, as different tables can impact performance disproportionately. Feel free to post more questions in our forums here!

josephbrownskilljar · January 20, 2023, 10:20pm

We are testing on an EC2 t2.medium, and it freezes every time after about 10 minutes when running a simple import of three tables from Salesforce into Redshift.

Are there better guidelines available at this time? Does anyone have advice on what capacity and what kind of EC2 instance would work with a simple import?

It’s not exactly straightforward to increase the capacity of an EC2 instance. This is a promising tool but if we cannot ascertain its minimum requirements it’s going to be difficult to convince the team to continue putting resources into this. Thanks for any help!
