Airbyte bootloader error when deploying with Helm in EKS

Summary

Bootloader error encountered when deploying Airbyte with Helm in EKS.


Question

Hello Everyone,
I am getting an Airbyte bootloader error when I try to deploy Airbyte with Helm in EKS. Can you please help with this?



This topic has been created from a Slack thread to give it more visibility.
It will be in read-only mode here.


["airbyte", "bootloader-error", "helm", "eks"]

Hi <@U05JENRCF7C>, so without creating RDS and S3 in AWS, can’t we deploy it as it is, since it is open source?

<@U07KAKAH70Q> that’s not what I wrote. You need to figure out how to solve the problem with the PersistentVolumeClaim I pointed out. I don’t know how to do that. I suggested some sources that might be useful in solving it, or a workaround (RDS, S3) for this problem.

Got it <@U05JENRCF7C>. The other thing I want to know is: how many nodes, at minimum, are actually required to deploy Airbyte, for a minimal setup and for a production setup?

It depends™ :wink:

Technically, 1 node with 16GB RAM and 8 vCPUs should work for a minimal deployment, but I wouldn’t recommend it for production

You can check other threads touching similar topics
https://airbytehq.slack.com/archives/C021JANJ6TY/p1723038177877849?thread_ts=1723031106.067149&cid=C021JANJ6TY
https://airbytehq.slack.com/archives/C021JANJ6TY/p1723147302748119?thread_ts=1723145754.025749&cid=C021JANJ6TY

Back to answering your question,
when it comes to production, please don’t think about it as a static configuration. It may and probably will change over time.
In my current project we did a few things:
• we separated the Airbyte core components (server, webapp and so on) from the jobs that do synchronizations, so a heavy synchronization won’t affect Airbyte server stability
• we configured limits/requests for most pods, so the pods won’t affect each other when they hit the hard resource limits of a node
I’d recommend starting with 2 nodes and monitoring how the system behaves. In EKS you can configure node groups, and Kubernetes supports affinity/anti-affinity, so you can experiment with how pods are scheduled to get a stable environment and minimize costs. Monitor resource (CPU/memory) usage so you can adjust the size of the instances. Also, autoscaling and Fargate are useful for handling synchronizations – spawn nodes for a synchronization and terminate them when the job is done.
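To make that split concrete, a rough values.yaml sketch is below. The key names (per-component nodeSelector/resources and global.jobs.kube.nodeSelector) are assumptions based on the OSS Helm chart and vary between chart versions, so verify them against your chart’s values before using anything like this.

```
# Illustrative values.yaml fragment -- key names are assumptions, check your chart version.
# Core components get requests/limits and are pinned to a "core" node group.
server:
  nodeSelector:
    airbyte-role: core          # hypothetical node label
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: "1"
      memory: 2Gi
worker:
  nodeSelector:
    airbyte-role: core
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: "2"
      memory: 4Gi
# Sync job pods go to a separate (autoscaled) node group so heavy syncs
# can't starve the core components.
global:
  jobs:
    kube:
      nodeSelector:
        airbyte-role: jobs      # hypothetical node label
    resources:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        cpu: "2"
        memory: 4Gi
```
With nodeSelector in place, you can later swap in affinity/anti-affinity rules or taints/tolerations without touching the rest of the values.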

<@U05JENRCF7C> So we can deploy on an EC2 instance with 16GB and 8 vCPUs rather than a Kubernetes cluster. Is there any documentation to follow for deploying on an EC2 instance that sets up the whole Airbyte application and its components?

Yes, you can do that with abctl. I don’t recommend it for a production setup, but consider yourself warned.
https://docs.airbyte.com/using-airbyte/getting-started/oss-quickstart

Also, what output do you get for
kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found
?

https://docs.google.com/document/d/1aORgfvri2CTGRa3YRUDDx3ypASpLFWVkqY73VLW-LLw/edit?usp=sharing

Please find the document <@U05JENRCF7C>

How many nodes do we need to deploy the Airbyte application?

Is this deployment doc the correct way to deploy Airbyte in AWS EKS?

The problem is here: Warning FailedScheduling 2m49s (x3 over 8m3s) default-scheduler 0/2 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
It is definitely an issue with a PersistentVolume and PersistentVolumeClaim.

I searched Airbyte’s Slack for pod has unbound immediate PersistentVolumeClaims but I haven’t found any solutions.
One option I mentioned is to configure Airbyte to use S3 and an external PostgreSQL in RDS:
https://docs.airbyte.com/deploying-airbyte/integrations/storage
https://docs.airbyte.com/deploying-airbyte/integrations/database

You can also check this Stack Overflow thread: https://stackoverflow.com/questions/52668938/pod-has-unbound-persistentvolumeclaims
I don’t remember issues with MinIO and the db, but my team switched to S3 and PostgreSQL on RDS quite quickly.
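If you go the S3/RDS route, the values end up looking roughly like the sketch below. This is illustrative only: the exact key names, the secret handling, and how the bundled MinIO gets disabled depend on your chart version, so follow the two docs pages linked above.

```
# Illustrative values.yaml fragment for external storage and database -- follow the linked docs for exact keys.
global:
  storage:
    type: "S3"
    bucket:
      log: my-airbyte-bucket              # hypothetical bucket name
      state: my-airbyte-bucket
      workloadOutput: my-airbyte-bucket
    s3:
      region: "us-east-1"
      authenticationType: instanceProfile # or credentials referenced from a Kubernetes secret
  database:
    host: my-airbyte-db.xxxxxx.us-east-1.rds.amazonaws.com   # hypothetical RDS endpoint
    port: 5432
    database: airbyte
    user: airbyte
    # the password normally comes from a Kubernetes secret rather than being set inline
# Stop deploying the bundled PostgreSQL pod that is failing to schedule:
postgresql:
  enabled: false
# With S3 configured as storage, the bundled MinIO pod should no longer be needed;
# how it is disabled depends on the chart version.
```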

I don’t want to mislead you, but you might also check whether EKS add-ons like aws-ebs-csi-driver or aws-efs-csi-driver might be useful in this situation
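One direction (a sketch, not a verified fix): a common cause of the unbound-PVC error on EKS is that the cluster has no working default StorageClass, so the db/minio claims have nothing to provision from. With the aws-ebs-csi-driver add-on installed, a default StorageClass along these lines would let those claims bind:

```
# Hypothetical default StorageClass backed by the EBS CSI driver.
# Marking it default lets PVCs without an explicit storageClassName (like the db/minio ones) bind to it.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
```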

Sure, thank you <@U05JENRCF7C>

More interesting is the state of airbyte-db-0 and airbyte-minio-0

What do you get for
kubectl describe pod airbyte-db-0
kubectl describe pod airbyte-minio-0 ?

if there are any logs for those pods, you can attach them as well

Hey <@U05JENRCF7C>, thanks for responding.
I am not getting any logs for either of them!!
kubectl describe pod airbyte-db-0
kubectl describe pod airbyte-minio-0

Here are the logs for the Airbyte bootloader pod.

What output do you get for
kubectl describe pod airbyte-db-0
kubectl describe pod airbyte-minio-0 ?

Sidenote: you can also configure S3 and an external PostgreSQL in RDS
https://docs.airbyte.com/deploying-airbyte/integrations/storage
https://docs.airbyte.com/deploying-airbyte/integrations/database
instead of those two pods

<@U05JENRCF7C> are you saying that I should configure my EKS cluster to use (EC2) nodes for my Airbyte services/pods (anything that is not a synchronization container) and Fargate for the synchronization (source/destination) containers?

<@U04P0677B4Z> nope, you shouldn’t do what some random guy from the internet says :wink:
This setup works for me and it’s quite cost effective, but your needs may vary. The most beneficial part is separating the Airbyte core components from the synchronization pods and defining requests/limits for resources. It improved the stability of my setup.
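Purely as an illustration of the EC2-for-core / Fargate-for-syncs split, an EKS Fargate profile could be declared roughly like this. Everything below is hypothetical: the cluster name, the namespace, and especially the airbyte-role: sync-job label, which you would have to attach to job pods yourself (if your chart version lets you set job pod labels) for the selector to match.

```
# Hypothetical eksctl ClusterConfig fragment: schedule labeled sync pods on Fargate,
# while everything else stays on the EC2 node group(s).
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-eks-cluster        # hypothetical cluster name
  region: us-east-1           # hypothetical region
fargateProfiles:
  - name: airbyte-sync-jobs
    selectors:
      - namespace: airbyte              # namespace where the sync pods run
        labels:
          airbyte-role: sync-job        # assumed custom label on sync job pods
```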

If you want to try Fargate for synchronization pods, please set these values for pod-sweeper so it cleans up the Fargate containers as soon as possible once a job is done (or something failed):

```
pod-sweeper:
  enabled: true
  timeToDeletePods:
    succeeded: 1
    unsuccessful: 1
```