Airbyte kubernetes/helm deployment unable to configure external postgres

  • Is this your first time deploying Airbyte?: Yes
  • OS Version / Instance: Debian GNU/Linux 11 (bullseye)
  • Deployment: Kubernetes in AWS
  • Airbyte Version: 0.40.3
  • Step: Deployment to cluster
  • Description: Our team uses automated Kubernetes deployments via Flux. This is our first deployment of Airbyte using the Helm charts. Services are being deployed to our cluster; however, both the worker and server pods fail after connecting to a brand-new external database (an AWS RDS Postgres instance). Both log that the minimum Flyway version is not met, then repeat the same Flyway message once a minute until the connection timeout is reached. From my understanding, Airbyte is supposed to configure brand-new databases, but I’m not sure what issue it is running into at this point.

From the server logs:

kubectl --namespace=data-prod logs -f airbyte-server-5c46fff987-cx9qg
2022-09-14 21:03:46 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable CONFIG_DATABASE_USER: 'postgres'
2022-09-14 21:03:46 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable CONFIG_DATABASE_PASSWORD: '*****'
2022-09-14 21:03:46 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable CONFIG_DATABASE_URL: 'jdbc:postgresql://<redacted-aws-hostname>:5432/airbyte'
2022-09-14 21:03:46 INFO c.z.h.HikariDataSource(<init>):80 - HikariPool-1 - Starting...
2022-09-14 21:03:46 INFO c.z.h.HikariDataSource(<init>):82 - HikariPool-1 - Start completed.
2022-09-14 21:03:46 INFO c.z.h.HikariDataSource(<init>):80 - HikariPool-2 - Starting...
2022-09-14 21:03:46 INFO c.z.h.HikariDataSource(<init>):82 - HikariPool-2 - Start completed.
2022-09-14 21:03:47 WARN c.n.s.JsonMetaSchema(newValidator):338 - Unknown keyword existingJavaType - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2022-09-14 21:03:48 INFO i.a.s.ServerApp(getServer):178 - Checking databases..
2022-09-14 21:03:48 INFO i.a.s.ServerApp(assertDatabasesReady):154 - Checking configs database flyway migration version..
2022-09-14 21:03:48 WARN i.a.d.c.DatabaseAvailabilityCheck(check):38 - Waiting for database to become available...
2022-09-14 21:03:48 INFO i.a.d.c.DatabaseAvailabilityCheck(lambda$isDatabaseConnected$1):75 - Testing airbyte configs database connection...
2022-09-14 21:03:49 INFO i.a.d.c.DatabaseAvailabilityCheck(check):57 - Database available.
2022-09-14 21:03:49 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:03:49 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Database: jdbc:postgresql://<redacted-aws-hostname>:5432/airbyte (PostgreSQL 13.7)
2022-09-14 21:03:49 INFO i.a.d.c.DatabaseMigrationCheck(check):46 - Current database migration version 0.
2022-09-14 21:03:49 INFO i.a.d.c.DatabaseMigrationCheck(check):47 - Minimum Flyway version required 0.35.15.001.
2022-09-14 21:04:49 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:05:49 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:06:49 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:07:49 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:08:49 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:09:49 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:10:49 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:11:49 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:12:49 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:13:49 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:13:49 ERROR i.a.s.ServerApp(main):319 - Server failed
io.airbyte.db.check.DatabaseCheckException: Timeout while waiting for database to fulfill minimum flyway migration version..
	at io.airbyte.db.check.DatabaseMigrationCheck.check(DatabaseMigrationCheck.java:51) ~[io.airbyte.airbyte-db-db-lib-0.40.3.jar:?]
	at io.airbyte.server.ServerApp.assertDatabasesReady(ServerApp.java:158) ~[io.airbyte-airbyte-server-0.40.3.jar:?]
	at io.airbyte.server.ServerApp.getServer(ServerApp.java:179) ~[io.airbyte-airbyte-server-0.40.3.jar:?]
	at io.airbyte.server.ServerApp.main(ServerApp.java:316) [io.airbyte-airbyte-server-0.40.3.jar:?]
2022-09-14 21:13:49 INFO c.z.h.HikariDataSource(close):350 - HikariPool-1 - Shutdown initiated...
2022-09-14 21:13:49 INFO c.z.h.HikariDataSource(close):352 - HikariPool-1 - Shutdown completed.
2022-09-14 21:13:49 INFO c.z.h.HikariDataSource(close):350 - HikariPool-2 - Shutdown initiated...
2022-09-14 21:13:49 INFO c.z.h.HikariDataSource(close):352 - HikariPool-2 - Shutdown completed.

From the worker logs:

kubectl --namespace=data-prod logs -f airbyte-worker-6456df694c-ktd5n
2022-09-14 21:03:46 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:03:46 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable CONFIG_DATABASE_USER: 'postgres'
2022-09-14 21:03:46 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable CONFIG_DATABASE_PASSWORD: '*****'
2022-09-14 21:03:46 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable CONFIG_DATABASE_URL: 'jdbc:postgresql://<redacted-aws-hostname>:5432/airbyte'
2022-09-14 21:03:46 INFO c.z.h.HikariDataSource(<init>):71 - HikariPool-1 - Starting...
2022-09-14 21:03:46 INFO c.z.h.HikariDataSource(<init>):73 - HikariPool-1 - Start completed.
2022-09-14 21:03:46 INFO c.z.h.HikariDataSource(<init>):71 - HikariPool-2 - Starting...
2022-09-14 21:03:46 INFO c.z.h.HikariDataSource(<init>):73 - HikariPool-2 - Start completed.
2022-09-14 21:03:46 WARN i.a.d.c.DatabaseAvailabilityCheck(check):38 - Waiting for database to become available...
2022-09-14 21:03:46 INFO i.a.d.c.DatabaseAvailabilityCheck(lambda$isDatabaseConnected$1):75 - Testing airbyte configs database connection...
2022-09-14 21:03:47 INFO i.a.d.c.DatabaseAvailabilityCheck(check):57 - Database available.
2022-09-14 21:03:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:03:47 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:03:47 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Database: jdbc:postgresql://<redacted-aws-hostname>:5432/airbyte (PostgreSQL 13.7)
2022-09-14 21:03:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:03:47 INFO i.a.d.c.DatabaseMigrationCheck(check):46 - Current database migration version 0.
2022-09-14 21:03:47 INFO i.a.d.c.DatabaseMigrationCheck(check):47 - Minimum Flyway version required 0.35.15.001.
2022-09-14 21:04:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:04:47 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:04:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:05:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:05:47 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:05:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:06:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:06:47 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:06:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:07:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:07:47 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:07:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:08:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:08:47 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:08:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:09:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:09:47 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:09:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:10:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:10:47 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:10:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:11:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:11:47 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:11:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:12:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:12:47 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:12:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:13:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:13:47 INFO o.f.c.i.l.s.Slf4jLog(info):49 - Flyway Community Edition 7.14.0 by Redgate
2022-09-14 21:13:47 INFO i.a.c.EnvConfigs(getEnvOrDefault):1002 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-09-14 21:13:47 ERROR i.a.w.WorkerApp(main):561 - Worker app failed
io.airbyte.db.check.DatabaseCheckException: Timeout while waiting for database to fulfill minimum flyway migration version..
	at io.airbyte.db.check.DatabaseMigrationCheck.check(DatabaseMigrationCheck.java:51) ~[io.airbyte.airbyte-db-db-lib-0.40.3.jar:?]
	at io.airbyte.workers.WorkerApp.main(WorkerApp.java:549) [io.airbyte-airbyte-workers-0.40.3.jar:?]
2022-09-14 21:13:47 INFO c.z.h.HikariDataSource(close):347 - HikariPool-1 - Shutdown initiated...
2022-09-14 21:13:47 INFO c.z.h.p.HikariPool(shutdown):204 - HikariPool-1 - Close initiated...
2022-09-14 21:13:47 INFO c.z.h.p.HikariPool(shutdown):238 - HikariPool-1 - Closed.
2022-09-14 21:13:47 INFO c.z.h.HikariDataSource(close):349 - HikariPool-1 - Shutdown completed.
2022-09-14 21:13:47 INFO c.z.h.HikariDataSource(close):347 - HikariPool-2 - Shutdown initiated...
2022-09-14 21:13:47 INFO c.z.h.p.HikariPool(shutdown):204 - HikariPool-2 - Close initiated...
2022-09-14 21:13:47 INFO c.z.h.p.HikariPool(shutdown):238 - HikariPool-2 - Closed.
2022-09-14 21:13:47 INFO c.z.h.HikariDataSource(close):349 - HikariPool-2 - Shutdown completed.
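
For anyone comparing their own logs against the ones above, here is a minimal Python sketch that pulls the two DatabaseMigrationCheck values out of a log dump and reports whether the minimum is met. The function names are made up for illustration, and comparing versions as tuples of integers is an assumption that only approximates how Flyway orders versions.

```python
import re

# Patterns copied from the DatabaseMigrationCheck lines in the logs above.
CURRENT_RE = re.compile(r"Current database migration version ([\d.]*\d)")
REQUIRED_RE = re.compile(r"Minimum Flyway version required ([\d.]*\d)")


def parse_version(text):
    """Turn '0.35.15.001' into (0, 35, 15, 1). Assumption: numeric parts only."""
    return tuple(int(part) for part in text.split("."))


def migration_status(log_text):
    """Return the last reported current/required versions, or None if absent."""
    current = required = None
    for line in log_text.splitlines():
        match = CURRENT_RE.search(line)
        if match:
            current = parse_version(match.group(1))
        match = REQUIRED_RE.search(line)
        if match:
            required = parse_version(match.group(1))
    if current is None or required is None:
        return None
    return {
        "current": current,
        "required": required,
        "satisfied": current >= required,
    }
```

Run against the logs in this post, this would report a current version of 0 against a required 0.35.15.001, i.e. the migrations were never applied to the database the pods were checking.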

Hello there! You are receiving this message because none of your fellow community members has stepped in to respond to your topic post. (If you are a community member and you are reading this response, feel free to jump in if you have the answer!) As a result, the Community Assistance Team has been made aware of this topic and will be investigating and responding as quickly as possible.
Some important considerations that will help you get your issue solved faster:

  • It is best to use our topic creation template; if you haven’t yet, we recommend posting a followup with the requested information. With that information the team will be able to search for similar issues with connectors and the platform, and troubleshoot your specific question or problem more quickly.
  • Make sure to upload the complete log file; a common investigation roadblock is that sometimes the error for the issue happens well before the problem is surfaced to the user, and so having the tail of the log is less useful than having the whole log to scan through.
  • Be as descriptive and specific as possible; when investigating it is extremely valuable to know what steps were taken to encounter the issue, what version of connector / platform / Java / Python / docker / k8s was used, etc. The more context supplied, the quicker the investigation can start on your topic and the faster we can drive towards an answer.
  • We in the Community Assistance Team are glad you’ve made yourself part of our community, and we’ll do our best to answer your questions and resolve the problems as quickly as possible. Expect to hear from a specific team member as soon as possible.

Thank you for your time and attention.
Best,
The Community Assistance Team

I haven’t received any feedback or response. Is there more that I can include to help investigate the query? This is my first post to the forum, so I wouldn’t be surprised if I left out some expected resources that would prove useful. Happy to provide whatever I can to help debug.

I’m trying to understand whether Airbyte is actually expected to initialize the database and migrate it to the right version, or whether there is something I need to do on the DB side to get it into a state that Airbyte expects. The documentation seems to suggest the former, but I can’t find much specifically related to Flyway or DB migration.

I can confirm that the services can connect to the DB (the logs also suggest this). I set the DB connection timeout to 10m in the hope of giving the server and worker enough time to do whatever they need before failing, but that doesn’t seem to help.

I recently deployed version 0.40.3 to my own kubernetes cluster with a new external database and encountered similar issues. In my case I needed to manually create two databases in postgres: temporal and temporal_visibility. Once I did that, everything worked.
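
In case it helps someone hitting the same thing, the manual step described above could be sketched like this in Python. It only renders the SQL; the function name is hypothetical, and you would still need to run the output against your instance (for example through psql) as a user allowed to create databases. Note that PostgreSQL has no `CREATE DATABASE IF NOT EXISTS`, so a statement will fail if that database is already there.

```python
def create_database_statements(names=("temporal", "temporal_visibility")):
    """Render one CREATE DATABASE statement per name, with quoted identifiers."""
    statements = []
    for name in names:
        # Double any embedded quotes so the identifier stays valid SQL.
        quoted = '"' + name.replace('"', '""') + '"'
        statements.append(f"CREATE DATABASE {quoted};")
    return statements


if __name__ == "__main__":
    # Print the statements so they can be reviewed or piped into a SQL client.
    for statement in create_database_statements():
        print(statement)
```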

Thank you for the response! I have been able to connect with a SQL client to see that those two DBs were automatically created, so I suppose that means I’m running into a different issue. I appreciate you sharing.

I changed the log level to debug for the server and worker pods; the expanded logs are attached. I’ve included everything from the beginning to the point where the logs seem to just repeat over and over.
airbyte-server-log-example.txt (80.3 KB)
airbyte-worker-log-example.txt (112.0 KB)

It does seem like something is going wrong with either the deployment or the database. It is very odd that the DB version is not being picked up. Airbyte should initialize the tables in the airbyte database by itself; there is some more info here: https://docs.airbyte.com/operator-guides/configuring-airbyte-db/#initializing-the-database

Could you check whether there is a networking issue between the DB and Airbyte? Also, here is a list of the DB config options in case there is something you might need: https://docs.airbyte.com/operator-guides/configuring-airbyte/#database
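
A quick way to rule out basic networking is a plain TCP check run from a pod in the same namespace. This is a minimal sketch using only the Python standard library; `can_reach` is a made-up helper name, and the host you pass in would be your own RDS endpoint. It only shows that the port is reachable, not that the credentials or database name are correct.

```python
import socket


def can_reach(host, port=5432, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_reach("<your-rds-hostname>", 5432)` returning False would point at DNS, security groups, or routing rather than at Airbyte itself.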

Otherwise, I would recommend clearing out the airbyte db and re-deploying a fresh Airbyte to see if it is able to initialize.

Also, can you try upgrading Airbyte to see if that helps?

Thanks for the follow-up @sh4sh, and sorry for the late reply; I must’ve missed the notifications for this. It turns out it was likely some issue with our secrets configuration in the deployment itself, as some recent infra changes appear to have fixed the issue somehow. As of now, this problem is resolved!
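
Since the cause here was probably configuration, one thing that can help when reviewing similar logs is listing which environment variables the pods reported as falling back to a default. Below is a minimal Python sketch based on the EnvConfigs line format in the logs earlier in this thread; the function name is hypothetical. A default value is not proof of a misconfiguration on its own, so treat the output as a list of settings to double-check against your Helm values and secrets.

```python
import re

# Matches the EnvConfigs lines shown in the server and worker logs above.
DEFAULT_RE = re.compile(
    r"Using default value for environment variable (\w+): '(.*)'"
)


def defaulted_variables(log_text):
    """Map each variable reported as defaulted to the value that was logged."""
    found = {}
    for line in log_text.splitlines():
        match = DEFAULT_RE.search(line)
        if match:
            found[match.group(1)] = match.group(2)
    return found
```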

Glad to hear you found the solution!