Configuring Airbyte for Production on GKE with Terraform

Summary

User seeks best practices for configuring Airbyte on GKE for production, specifically transitioning from MinIO to GCS and from default Postgres to Cloud SQL, using Terraform.


Question

Hello, guys! I need some help.
Recently, I installed Airbyte on GKE (GCP Cloud) using Helm to test the application in my company, and it worked! However, over time, I added several connections, and the default configurations stopped working properly. For example, MinIO was filling up every day, and I had to clean it daily.
But that’s okay; for a POC, it worked. Now, I need to set up Airbyte on GKE for production, using Terraform. I want to avoid using the default MinIO and instead use GCS. I also need to change the default Postgres to Cloud SQL.
What are the best practices I should follow to configure Airbyte for production, beyond switching from MinIO to GCS and from the default Postgres to Cloud SQL? And how can I do this? I searched the documentation but only found instructions for installation with Helm. I’m not sure how to configure external tools like GCS and Cloud SQL.




['airbyte', 'gke', 'terraform', 'gcs', 'cloud-sql', 'minio', 'postgres', 'production']

Have you checked the docs?
https://docs.airbyte.com/deploying-airbyte/infrastructure/gcp
https://docs.airbyte.com/deploying-airbyte/integrations/storage
https://docs.airbyte.com/deploying-airbyte/integrations/database
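
For the Terraform side of the question, here is a minimal sketch of the two GCP resources being asked about: a GCS bucket to replace MinIO and a Cloud SQL Postgres instance to replace the bundled database. Every name, the region, the Postgres version, and the machine tier are assumptions for illustration, not values from the Airbyte docs. Pointing Airbyte at these resources is still done through the Helm values described in the storage and database pages linked above.

```hcl
# Sketch only: all names, the region, and the sizing are hypothetical.
variable "project_id" {
  type = string
}

variable "region" {
  type    = string
  default = "us-central1" # assumed region
}

variable "airbyte_db_password" {
  type      = string
  sensitive = true
}

# GCS bucket for Airbyte logs and state, replacing the bundled MinIO.
resource "google_storage_bucket" "airbyte" {
  name                        = "${var.project_id}-airbyte-storage" # assumed naming scheme
  project                     = var.project_id
  location                    = var.region
  uniform_bucket_level_access = true
}

# Cloud SQL Postgres instance, replacing the default in-cluster Postgres.
resource "google_sql_database_instance" "airbyte" {
  name                = "airbyte-postgres" # hypothetical name
  project             = var.project_id
  region              = var.region
  database_version    = "POSTGRES_15" # assumed version; check what your Airbyte release supports
  deletion_protection = true

  settings {
    tier              = "db-custom-2-7680" # assumed size: 2 vCPU, 7.5 GB RAM
    availability_type = "REGIONAL"

    backup_configuration {
      enabled = true
    }
  }
}

resource "google_sql_database" "airbyte" {
  name     = "airbyte" # hypothetical database name
  project  = var.project_id
  instance = google_sql_database_instance.airbyte.name
}

resource "google_sql_user" "airbyte" {
  name     = "airbyte" # hypothetical user name
  project  = var.project_id
  instance = google_sql_database_instance.airbyte.name
  password = var.airbyte_db_password
}
```

Network access from the GKE cluster to the instance (private IP or the Cloud SQL Auth Proxy) is left out of this sketch and depends on your VPC setup.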

You may also want to consider using Secret Manager from the start, as it’s a bit of a pain to switch later on (by default the secrets are stored in the Airbyte DB).
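
If you go the Secret Manager route, the GCP side can be prepared in Terraform roughly as follows. This is a sketch under assumptions: the service account name is made up, and `roles/secretmanager.admin` is a guess at a role broad enough for Airbyte to create and read secrets. A narrower role may be sufficient, so check the Airbyte docs for the permissions it actually needs.

```hcl
# Sketch only: the account name and the chosen role are assumptions.
# Declared here so the fragment stands alone; drop it if your module
# already declares project_id.
variable "project_id" {
  type = string
}

# Enable the Secret Manager API in the project.
resource "google_project_service" "secretmanager" {
  project            = var.project_id
  service            = "secretmanager.googleapis.com"
  disable_on_destroy = false
}

# Dedicated service account for Airbyte (hypothetical name).
resource "google_service_account" "airbyte" {
  project      = var.project_id
  account_id   = "airbyte-prod"
  display_name = "Airbyte"
}

# Let that account manage secrets in the project.
# Assumed role; tighten it if Airbyte works with less.
resource "google_project_iam_member" "airbyte_secrets" {
  project = var.project_id
  role    = "roles/secretmanager.admin"
  member  = "serviceAccount:${google_service_account.airbyte.email}"

  depends_on = [google_project_service.secretmanager]
}
```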

We’re deployed on GKE Autopilot and use the full Google stack like you’re describing, and things have worked quite well. The biggest pain point is deciding how to configure and maintain your ingress. I’d recommend not fiddling with the ingress annotations in the Helm chart and instead managing the ingress directly in Terraform, separate from Airbyte’s deployment process. Otherwise it can be a bit fragile when combined with other features (e.g. Cloud IAP, Google-managed certs, etc.)

However you configure your ingress, be sure to set the backend timeout quite high (e.g. 600 or 1200 seconds). Otherwise, long-running requests such as connection checks while creating a new source can fail with a gateway timeout.
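
The thread doesn't say how the timeout was set. With GKE Ingress, one common option is a `BackendConfig` object, which can be managed from Terraform with the `kubernetes_manifest` resource. The sketch below assumes a configured `kubernetes` provider, an `airbyte` namespace, and a made-up object name.

```hcl
# Sketch only: the namespace and object name are assumptions.
# Requires the hashicorp/kubernetes provider pointed at the GKE cluster.
resource "kubernetes_manifest" "airbyte_backend_config" {
  manifest = {
    apiVersion = "cloud.google.com/v1"
    kind       = "BackendConfig"

    metadata = {
      name      = "airbyte-backend-config" # hypothetical name
      namespace = "airbyte"                # assumed namespace
    }

    spec = {
      # Load balancer backend timeout in seconds; 600 is the lower of
      # the two values suggested in the thread.
      timeoutSec = 600
    }
  }
}
```

For this to take effect, the Kubernetes Service behind the ingress needs the annotation `cloud.google.com/backend-config: '{"default": "airbyte-backend-config"}'`. If you build the load balancer entirely in Terraform instead, the equivalent setting is `timeout_sec` on `google_compute_backend_service`.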

Happy to help with config reference and such if you get stuck.
