Reliable K8s production deployment

Hello there :wave:

I’m having a difficult time working with my platform team on the deployment of Airbyte in our EKS infrastructure.

They have raised the following open questions:

  • Are shared volumes something you aim to stop using? As I understand it, you use them mainly to share configs and some events, but is one of your goals to move this to a stateful deployment? Our concern is how to handle upgrades and the additional component in our infra.
  • All services are shown with replicas set to 1. Can they scale horizontally?
  • Network policies are in separate files and seem hard to maintain if future deployments require more open ports.
  • Logging doesn’t seem to be directed anywhere other than S3 or GCP. Is outputting the logs to plain stdout something we can configure?
  • Do you have an ETA for K8s to come out of beta?

Thanks for your help

Hello @BenoitHugonnard,
I reached out to our platform team to give you some insights about these questions.

Are shared volumes something you aim to stop using? As I understand it, you use them mainly to share configs and some events, but is one of your goals to move this to a stateful deployment? Our concern is how to handle upgrades and the additional component in our infra.

Could you please be more specific about which shared volume you are referring to?

All services are shown with replicas set to 1. Can they scale horizontally?

Everything can scale horizontally except for the scheduler and the server. Horizontally scaling the server works except for the import/export function. Temporal does not need to scale horizontally.
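As a rough illustration of that rule, here is a minimal Python sketch that turns a desired replica count into a per-component plan while keeping the scheduler and the server at one replica. The component names and the `plan_replicas` helper are hypothetical and not part of Airbyte or its Helm chart; adapt them to the names in your own manifests.

```python
# Components that should stay at a single replica, per the answer above.
# The names are hypothetical placeholders, not actual Airbyte resource names.
SINGLE_REPLICA = {"scheduler", "server"}


def plan_replicas(components, desired):
    """Return a replica count per component, pinning non-scalable ones to 1."""
    if desired < 1:
        raise ValueError("desired must be >= 1")
    return {
        name: 1 if name in SINGLE_REPLICA else desired
        for name in components
    }


if __name__ == "__main__":
    plan = plan_replicas(["server", "scheduler", "worker", "webapp"], desired=3)
    for name, replicas in plan.items():
        print(f"{name}: {replicas}")
```

The server is pinned conservatively here because import/export does not work when it is scaled; if you do not use that function you could remove it from `SINGLE_REPLICA`.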

Network policies are in separate files and seem hard to maintain if future deployments require more open ports

This is related to the current status of our community Helm chart. We will be looking at improving our Helm deployment in the next quarter.
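In the meantime, one possible way to keep ports maintainable is to generate the NetworkPolicy manifests from a single table instead of editing separate files. The sketch below is only an assumption about how you might do this on your side: the service names, labels, and port numbers in `SERVICES` are illustrative placeholders, and none of this is part of the Airbyte chart. It emits JSON, which `kubectl apply -f` accepts.

```python
import json

# Hypothetical table of services and the TCP ports they should accept.
# Replace the names, labels and ports with the ones from your deployment.
SERVICES = {
    "example-server": [8001],
    "example-webapp": [80],
}


def network_policy(name, ports):
    """Build a NetworkPolicy manifest allowing ingress on the given TCP ports."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"{name}-ingress"},
        "spec": {
            # Assumes pods carry an "app" label equal to the service name.
            "podSelector": {"matchLabels": {"app": name}},
            "policyTypes": ["Ingress"],
            # No "from" clause: any source may reach the listed ports.
            "ingress": [
                {"ports": [{"protocol": "TCP", "port": p} for p in sorted(set(ports))]}
            ],
        },
    }


def all_policies(services):
    """Return one manifest per service, in a stable order."""
    return [network_policy(name, ports) for name, ports in sorted(services.items())]


if __name__ == "__main__":
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": all_policies(SERVICES)}, indent=2))
```

Opening a new port then means adding one number to the table and regenerating the manifests.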

Logging doesn’t seem to be directed anywhere other than S3 or GCP. Is outputting the logs to plain stdout something we can configure?

Airbyte needs a source of truth of logs, so we write to immutable object storage. The logs are simultaneously output on stdout, so you can install other logging collectors on the nodes and pipe them to other logging solutions. Curious, what workflow is missing with this current logging setup?
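To make the "write to storage and to stdout at the same time" pattern concrete, here is a minimal Python sketch of a logger with two handlers. It is not Airbyte's implementation: the `build_logger` helper is hypothetical, and a local file stands in for the object storage bucket. A node-level collector would read the stdout stream, while the file copy plays the role of the durable source of truth.

```python
import logging
import os
import sys
import tempfile


def build_logger(name, archive_path):
    """Return a logger that writes every record to stdout and to archive_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines through the root logger
    logger.handlers.clear()   # make repeated calls idempotent
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    handlers = (
        logging.StreamHandler(sys.stdout),   # picked up by log collectors
        logging.FileHandler(archive_path),   # stand-in for object storage
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "job.log")
    log = build_logger("sync-job", path)
    log.info("sync started")
    print(f"archived copy written to {path}")
```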

Do you have an ETA for K8s to come out of beta?

We can’t share an accurate ETA at the moment. It’ll probably be once we fix up the community Helm chart and make it as stable as our internal Helm chart for Airbyte Cloud.

Thank you!

Shared Volumes

I’m referring to the two shared volumes you use in docker-compose (workspace and data); in the Helm chart I only see one. I’d like to know whether shared volumes are something you want to stick with or something you would gladly remove.

Scale + Network policy + ETA

Thanks for the answers, looking forward to that!

Logging

Our team uses Datadog for all logging and collects logs from standard output. I’m guessing you need the logs in S3 so that every webserver can show the workers’ logs, even after the application has been restarted, correct? In that case that’s fine, and I understand why it’s necessary (as long as the logs also go to standard output, which you seem to confirm).