Crashing, freezing, corruption / lost data

  • Is this your first time deploying Airbyte?: Yes

  • OS Version / Instance: Amazon Linux t2.xlarge

  • Memory / Disk: 16 GB / 8 GB

  • Deployment: Are you using Docker or Kubernetes deployment? No

  • Airbyte Version: What version are you using now? latest (v0.40.28?)

  • Source name/version: Jira, Salesforce

  • Destination name/version: Redshift

  • Step: The issue is happening during sync, creating the connection or a new source? During or after sync

  • Description:

I’m doing a feasibility spike on Airbyte’s ability to schedule ETL from Jira and Salesforce to Redshift. I’m an Airbyte beginner and am reporting this live; the EC2 instance is currently stopped.

On a t2.xlarge, Airbyte is able to create the tables in Redshift and seems to finish syncing them.

However, the Airbyte UI is unstable, as is the container. On t2.medium it freezes the entire OS, requiring a reboot, so I switched to t2.xlarge. On t2.xlarge it can finish a sync, but the installation eventually becomes corrupted and the configurations are lost.

I left my instance running overnight after a successful Salesforce and Jira ETL, but this morning the HTTP UI only responded with “Oops, something went wrong”. I restarted the Docker service, and when it came back up, all previous configurations in the container had disappeared. It welcomed me and invited me to sign up for marketing emails and set up my first connection.

I now have my tables in Redshift but Airbyte has lost all my sources and destinations. Based on my experience so far, I can’t imagine Airbyte running stably for more than a few hours in this environment.

I’m not even sure what I’m trying to ask, other than is there a better way to run this in a stable environment? Airbyte is a marvel and in many ways this is everything we’re looking for, but something this unstable is a non-starter for production.

Thanks for any help!

How much disk space are you using for those machines? Airbyte has an internal database and stores logs on disk too. I strongly recommend something like 100 GB to avoid any issues.
Also, sharing the server logs can give us some idea of what happened. You can get them by running docker logs airbyte-server > server.log, which creates a file called server.log in the current folder.

Also, check the resource consumption in the AWS EC2 console. Maybe you can see there what is causing the problem.
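
If it helps, the disk check suggested above can be sketched as a small shell helper. This is only a rough sketch: the function names disk_used_pct and check_disk and the default 90% threshold are assumptions for illustration, not anything Airbyte ships.

```shell
#!/bin/sh
# Hypothetical helper: print the percentage of disk space used on the
# filesystem that holds the given path (defaults to /).
disk_used_pct() {
  # df -P prints one POSIX-format line per filesystem; column 5 is the
  # capacity, e.g. "42%". Strip the percent sign and print the number.
  df -P "${1:-/}" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Hypothetical helper: warn when usage reaches a threshold.
# The default of 90 is an arbitrary assumption; pick what suits you.
check_disk() {
  path=${1:-/}
  limit=${2:-90}
  used=$(disk_used_pct "$path")
  if [ "$used" -ge "$limit" ]; then
    echo "WARNING: $path is ${used}% full"
    return 1
  fi
  echo "OK: $path is ${used}% full"
}
```

Running check_disk / 90 on the instance before and after a sync shows whether the drive is filling up. Running docker system df alongside it shows how much of that space Docker itself is holding.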

Thanks for the tips! I will look into your suggestions. Re. the above, this should be in the docs, which only recommend a t2.medium for Airbyte. We’re only importing three tables in testing and already exceed the “minimum” recommendations!

Yep, I think the minimum recommendation needs to be updated! Thanks, and let me know if you have solved your issues.

Thank you for your help! Simply upgrading the attached drive to 100 GB solved the issues we were having with stability. It’s working quite well now.
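
For anyone landing here later, growing the drive as described above might look roughly like the sketch below. It only prints the commands so they can be reviewed first. The function name print_resize_plan is hypothetical, and the device /dev/xvda, partition 1, and an XFS root filesystem are assumptions that may not match your instance (check with lsblk and df -T).

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the commands to grow a root EBS volume.
# Assumptions: the root device is /dev/xvda, the root partition is 1, and
# the root filesystem is XFS. Adjust all three for your own instance.
print_resize_plan() {
  volume_id=$1        # the EBS volume ID shown in the EC2 console
  size_gb=${2:-100}   # 100 GB is the size recommended in this thread
  # Step 1: enlarge the EBS volume itself.
  echo "aws ec2 modify-volume --volume-id ${volume_id} --size ${size_gb}"
  # Step 2: grow the partition to fill the larger volume.
  echo "sudo growpart /dev/xvda 1"
  # Step 3: grow the XFS filesystem to fill the partition.
  echo "sudo xfs_growfs -d /"
}
```

Calling print_resize_plan with your own volume ID prints the three steps in order. On an ext4 root filesystem the last step would use resize2fs instead of xfs_growfs.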