Issues with synchronization and taints in Airbyte deployment using Helm Chart

Summary

Syncs fail in an Airbyte Helm deployment because the orchestrator and replication job pods do not tolerate the node taints and cannot be scheduled. The nightly chart version is also unusable because airbyte-bootloader rejects the nightly version string.


Question

Hello everyone!
In the last few weeks, I’ve been facing issues when trying to start a synchronization. I’ve tested various versions from 0.62 to 1.1.0 (current).
I have some nodes with specific nodeSelectors and tolerations that I pass to the Deployments, and those all work very well. However, when we start a synchronization and the orchestrator-pod-repl and replication-job-268-attempt-0 pods are created, they fail to schedule because of untolerated taints.
With the following messages:

```
0/14 nodes are available: 1 Insufficient memory, 1 node(s) had untolerated taint {people: true}, 2 node(s) had untolerated taint {initial_nodes: true}, 3 node(s) had untolerated taint {addons: true}, 8 Insufficient cpu. preemption: 0/14 nodes are available: 6 Preemption is not helpful for scheduling, 8 No preemption victims found for incoming pod.
Failed to schedule pod, incompatible with nodepool "MYNODE", daemonset overhead={"cpu":"265m","memory":"376Mi","pods":"5"}, did not tolerate people=true:NoSchedule; incompatible with nodepool "addons-spot", daemonset overhead={"cpu":"265m","memory":"376Mi","pods":"5"}, did not tolerate addons=true:NoSchedule
```
What can I do?

*I'm deploying using Helm Chart, version 1.1.0, and I wasn’t able to use the 1.1.0 nightly version because the airbyte-bootloader doesn’t recognize the nightly version string.*


---

This topic has been created from a Slack thread to give it more visibility.
It will be in read-only mode here. [Click here](https://airbytehq.slack.com/archives/C021JANJ6TY/p1728587996339629) if you want to access the original thread.

[Join the conversation on Slack](https://slack.airbyte.com)


Sounds like you’re possibly hitting a known bug with tolerations not being passed through to job pods. There are a couple open issues:
https://github.com/airbytehq/airbyte/issues/45903
https://github.com/airbytehq/airbyte/issues/28389
I hope this was fixed today by https://github.com/airbytehq/airbyte-platform/commit/2ca3c4192793b15a1ccc2bfd644dd725c3a2903c
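
For readers unfamiliar with the scheduler message above, here is a minimal sketch of the standard Kubernetes rule for matching tolerations against taints. It is an illustration only, not Airbyte or Kubernetes source code; the names `Taint`, `Toleration`, `tolerates`, and `untolerated_taints` are made up for this example. It shows why a Deployment that carries the toleration schedules fine, while a job pod created without it is rejected with `did not tolerate people=true:NoSchedule`.

```python
from dataclasses import dataclass
from typing import List

# Effects that keep a pod off a node. PreferNoSchedule is only a soft preference.
BLOCKING_EFFECTS = {"NoSchedule", "NoExecute"}


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # an empty key is only valid with operator "Exists"
    operator: str = "Equal"  # "Equal" (the default) or "Exists"
    value: str = ""
    effect: str = ""         # an empty effect matches every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if this single toleration covers this single taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if not tol.key:
        # An empty key with "Exists" tolerates every taint.
        return tol.operator == "Exists"
    if tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    return tol.operator == "Equal" and tol.value == taint.value


def untolerated_taints(taints: List[Taint], tolerations: List[Toleration]) -> List[Taint]:
    """Return the blocking taints that none of the tolerations cover."""
    return [
        taint
        for taint in taints
        if taint.effect in BLOCKING_EFFECTS
        and not any(tolerates(tol, taint) for tol in tolerations)
    ]


if __name__ == "__main__":
    node_taints = [Taint("people", "true", "NoSchedule")]
    deployment_tolerations = [Toleration("people", "Equal", "true", "NoSchedule")]
    job_pod_tolerations: List[Toleration] = []  # tolerations were not passed through

    print("deployment blocked by:", untolerated_taints(node_taints, deployment_tolerations))
    print("job pod blocked by:", untolerated_taints(node_taints, job_pod_tolerations))
```

If the job pods are created with an empty toleration list, every tainted node pool is excluded, which matches the scheduler output in the question.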

> airbyte-bootloader doesn’t recognize the nightly version string

Can you clarify?

Thanks for the quick response. I looked at tryangul’s code from two weeks ago and expected that it was already fixed.

About the airbyte-bootloader, maybe it is a human error on my part, but when I pass the nightly version to Helm like this:
--version 1.1.0-nightly-1728515229-ac8ee10310 \

the airbyte-bootloader pod fails on startup with this error:
Caused by: java.lang.IllegalArgumentException: Invalid version string: nightly-1728515229-ac8ee10310

Hm, interesting. We’ve recently introduced nightly versions, so we’ll look into that.

Any chance you could give me a full stacktrace from the logs? That might help track this bug down.

Yeah, I’ll look at it now.

```
helm upgrade \
    --install \
    --version 1.1.0-nightly-1728515229-ac8ee10310 \
    -n ppl-app-airbyte-dev \
    -f ./manifests/charts/dev/airbyte.1.1.0-nightly-1728515229-ac8ee10310.yml \
    dep-airbyte airbyte/airbyte
```
*Result:*
```
Error: UPGRADE FAILED: pre-upgrade hooks failed: 1 error occurred:
        * pod dep-airbyte-airbyte-bootloader failed
```

*Pod airbyte-bootloader-container.log:*
```
: airbyte-bootloader :

2024-10-11 19:43:17,118 [main] INFO i.m.c.e.DefaultEnvironment(<init>):168 - Established active environments: [k8s, cloud]
2024-10-11 19:43:17,930 [main] INFO c.z.h.HikariDataSource(<init>):79 - HikariPool-1 - Starting...
2024-10-11 19:43:17,968 [main] INFO c.z.h.HikariDataSource(<init>):81 - HikariPool-1 - Start completed.
2024-10-11 19:43:18,715 [main] WARN i.m.c.i.MeterRegistry$Config(logWarningAboutLateFilter):851 - A MeterFilter is being configured after a Meter has been registered to this registry. All MeterFilters should be configured before any Meters are registered. If that is not possible or you have a use case where it should be allowed, let the Micrometer maintainers know at https://github.com/micrometer-metrics/micrometer/issues/4920. Enable DEBUG level logging on this logger to see a stack trace of the call configuring this MeterFilter.
2024-10-11 19:43:18,736 [main] INFO c.z.h.HikariDataSource(<init>):79 - HikariPool-2 - Starting...
2024-10-11 19:43:18,737 [main] INFO c.z.h.HikariDataSource(<init>):81 - HikariPool-2 - Start completed.
2024-10-11 19:43:18,797 [main] INFO i.m.l.PropertiesLoggingLevelsConfigurer(configureLogLevelForPrefix):113 - Setting log level 'INFO' for logger: 'io.netty'
2024-10-11 19:43:19,342 [main] INFO i.m.l.PropertiesLoggingLevelsConfigurer(configureLogLevelForPrefix):113 - Setting log level 'ERROR' for logger: 'com.zaxxer.hikari'
2024-10-11 19:43:19,343 [main] INFO i.m.l.PropertiesLoggingLevelsConfigurer(configureLogLevelForPrefix):113 - Setting log level 'INFO' for logger: 'io.grpc'
2024-10-11 19:43:19,343 [main] INFO i.m.l.PropertiesLoggingLevelsConfigurer(configureLogLevelForPrefix):113 - Setting log level 'INFO' for logger: 'io.temporal'
2024-10-11 19:43:19,344 [main] INFO i.m.l.PropertiesLoggingLevelsConfigurer(configureLogLevelForPrefix):113 - Setting log level 'ERROR' for logger: 'com.zaxxer.hikari.pool'
2024-10-11 19:43:19,345 [main] INFO i.m.l.PropertiesLoggingLevelsConfigurer(configureLogLevelForPrefix):113 - Setting log level 'INFO' for logger: 'io.fabric8.kubernetes.client'
2024-10-11 19:43:21,541 [main] INFO i.m.r.Micronaut(start):101 - Startup completed in 6500ms. Server Running: http://dep-airbyte-airbyte-bootloader:9002
2024-10-11 19:43:22,353 [main] INFO i.a.f.ConfigFileClient(<init>):113 - path /flags does not exist, will return default flag values
2024-10-11 19:43:22,396 [main] WARN i.a.m.l.MetricClientFactory(initialize):72 - MetricClient was not recognized or not provided. Accepted values are datadog or otel.
2024-10-11 19:43:22,784 [main] ERROR i.a.b.Application(main):25 - Unable to bootstrap Airbyte environment.
io.micronaut.context.exceptions.BeanInstantiationException: Error instantiating bean of type [io.airbyte.bootloader.Bootloader]

Message: Invalid version string: nightly-1728515229-ac8ee10310
Path Taken: new Bootloader(boolean autoUpgradeConnectors,ConfigRepository configRepository,DatabaseInitializer configsDatabaseInitializer,DatabaseMigrator configsDatabaseMigrator,AirbyteVersion currentAirbyteVersion,DatabaseInitializer jobsDatabaseInitializer,DatabaseMigrator jobsDatabaseMigrator,JobPersistence jobPersistence,OrganizationPersistence organizationPersistence,ProtocolVersionChecker protocolVersionChecker,boolean runMigrationOnStartup,String defaultRealm,PostLoadExecutor postLoadExecution) --> new Bootloader(boolean autoUpgradeConnectors,ConfigRepository configRepository,DatabaseInitializer configsDatabaseInitializer,DatabaseMigrator configsDatabaseMigrator,[AirbyteVersion currentAirbyteVersion],DatabaseInitializer jobsDatabaseInitializer,DatabaseMigrator jobsDatabaseMigrator,JobPersistence jobPersistence,OrganizationPersistence organizationPersistence,ProtocolVersionChecker protocolVersionChecker,boolean runMigrationOnStartup,String defaultRealm,PostLoadExecutor postLoadExecution)
    at io.micronaut.context.DefaultBeanContext.resolveByBeanFactory(DefaultBeanContext.java:2345)
    at io.micronaut.context.DefaultBeanContext.doCreateBean(DefaultBeanContext.java:2300)
    at io.micronaut.context.DefaultBeanContext.doCreateBean(DefaultBeanContext.java:2312)
    at io.micronaut.context.DefaultBeanContext.createRegistration(DefaultBeanContext.java:3123)
    at io.micronaut.context.SingletonScope.getOrCreate(SingletonScope.java:80)
    at io.micronaut.context.DefaultBeanContext.findOrCreateSingletonBeanRegistration(DefaultBeanContext.java:3025)
    at io.micronaut.context.DefaultBeanContext.resolveBeanRegistration(DefaultBeanContext.java:2986)
    at io.micronaut.context.DefaultBeanContext.resolveBeanRegistration(DefaultBeanContext.java:2752)
    at io.micronaut.context.DefaultBeanContext.getBean(DefaultBeanContext.java:1745)
    at io.micronaut.context.AbstractBeanResolutionContext.getBean(AbstractBeanResolutionContext.java:89)
    at io.micronaut.context.AbstractInitializableBeanDefinition.resolveBean(AbstractInitializableBeanDefinition.java:2188)
    at io.micronaut.context.AbstractInitializableBeanDefinition.getBeanForConstructorArgument(AbstractInitializableBeanDefinition.java:1350)
    at io.airbyte.bootloader.$Bootloader$Definition.instantiate(Unknown Source)
    at io.micronaut.context.DefaultBeanContext.resolveByBeanFactory(DefaultBeanContext.java:2330)
    at io.micronaut.context.DefaultBeanContext.doCreateBean(DefaultBeanContext.java:2300)
    at io.micronaut.context.DefaultBeanContext.doCreateBean(DefaultBeanContext.java:2312)
    at io.micronaut.context.DefaultBeanContext.createRegistration(DefaultBeanContext.java:3123)
    at io.micronaut.context.SingletonScope.getOrCreate(SingletonScope.java:80)
    at io.micronaut.context.DefaultBeanContext.findOrCreateSingletonBeanRegistration(DefaultBeanContext.java:3025)
    at io.micronaut.context.DefaultBeanContext.resolveBeanRegistration(DefaultBeanContext.java:2986)
    at io.micronaut.context.DefaultBeanContext.resolveBeanRegistration(DefaultBeanContext.java:2752)
    at io.micronaut.context.DefaultBeanContext.getBean(DefaultBeanContext.java:1745)
    at io.micronaut.context.DefaultBeanContext.getBean(DefaultBeanContext.java:842)
    at io.micronaut.context.DefaultBeanContext.getBean(DefaultBeanContext.java:834)
    at io.airbyte.bootloader.Application.main(Application.java:21)
Caused by: java.lang.IllegalArgumentException: Invalid version string: nightly-1728515229-ac8ee10310
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
    at io.airbyte.commons.version.Version.<init>(Version.java:37)
    at io.airbyte.commons.version.AirbyteVersion.<init>(AirbyteVersion.java:15)
    at io.airbyte.micronaut.config.AirbyteConfigurationBeanFactory.airbyteVersion(AirbyteConfigurationBeanFactory.java:32)
    at io.airbyte.micronaut.config.$AirbyteConfigurationBeanFactory$AirbyteVersion0$Definition.instantiate(Unknown Source)
    at io.micronaut.context.DefaultBeanContext.resolveByBeanFactory(DefaultBeanContext.java:2330)
    ... 24 common frames omitted

2024-10-11 19:43:22,802 [Thread-6] INFO i.m.r.Micronaut(lambda$start$0):118 - Embedded Application shutting down
```
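
One detail in the trace is worth noting: the string the bootloader rejects is `nightly-1728515229-ac8ee10310`, without the `1.1.0-` prefix that was passed to `--version`. The sketch below shows how a strict validator would accept the full chart version but reject the bare nightly string. It assumes a `major.minor.patch` rule with an optional suffix; the regex and the `parse_version` helper are hypothetical and are not Airbyte's actual `Version` implementation.

```python
import re
from typing import Optional, Tuple

# Assumed rule for illustration: three numeric parts, then an optional "-suffix".
VERSION_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$")


def parse_version(version: str) -> Tuple[int, int, int, Optional[str]]:
    """Split a version string into (major, minor, patch, suffix) or raise ValueError."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        raise ValueError(f"Invalid version string: {version}")
    major, minor, patch, suffix = match.groups()
    return int(major), int(minor), int(patch), suffix


if __name__ == "__main__":
    candidates = [
        "1.1.0",
        "1.1.0-nightly-1728515229-ac8ee10310",
        "nightly-1728515229-ac8ee10310",
    ]
    for candidate in candidates:
        try:
            print(candidate, "->", parse_version(candidate))
        except ValueError as err:
            print(candidate, "->", err)
```

Under that assumption, only the third string fails, so it may be worth checking which version value the bootloader container actually receives from the chart.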

Hey Alex!
Is there a roadmap or an ongoing discussion about this airbyte-bootloader problem?
For now I’ll switch to an alternative to Airbyte, but I’d like to use Airbyte again in the future.

We have someone looking at it, yes.