Summary
After upgrading an Airbyte deployment on GCP, sync jobs are timing out without starting. Looking for assistance in resolving the issue.
Question
I recently upgraded an OSS deployment running with Helm/k8s on GCP from ~0.5x to 1.20, ensuring that outdated configuration was removed from the Helm values and new requirements were satisfied. All of the services seem to be running correctly, and I can access the webapp, but every sync job times out without ever starting. I’ll post a few potentially related errors in the thread in the hope that someone has seen something similar. Any ideas?
This topic has been created from a Slack thread to give it more visibility.
["sync-jobs", "upgrade", "GCP", "timeout", "helm", "kubernetes", "configuration"]
This warning seems to occur on every sync job trigger: WARN i.a.c.s.c.JobConverter(getWorkspaceId):403 - Unable to retrieve workspace ID for job null
```
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container java.lang.NullPointerException: null
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:904)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at com.google.common.cache.LocalCache.get(LocalCache.java:4016)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4040)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4989)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.airbyte.persistence.job.WorkspaceHelper.lambda$getWorkspaceForJobId$4(WorkspaceHelper.java:162)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.airbyte.persistence.job.WorkspaceHelper.handleCacheExceptions(WorkspaceHelper.java:231)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.airbyte.persistence.job.WorkspaceHelper.getWorkspaceForJobId(WorkspaceHelper.java:162)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.airbyte.commons.server.converters.JobConverter.getWorkspaceId(JobConverter.java:401)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.airbyte.commons.server.converters.JobConverter.getAttemptLogs(JobConverter.java:320)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.airbyte.commons.server.converters.JobConverter.getSynchronousJobRead(JobConverter.java:349)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.airbyte.commons.server.handlers.ConnectorDefinitionSpecificationHandler.getSourceSpecificationRead(ConnectorDefinitionSpecificationHandler.java:145)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.airbyte.commons.server.handlers.ConnectorDefinitionSpecificationHandler.getSpecificationForSourceId(ConnectorDefinitionSpecificationHandler.java:76)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.airbyte.server.apis.SourceDefinitionSpecificationApiController.lambda$getSpecificationForSourceId$1(SourceDefinitionSpecificationApiController.java:46)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.airbyte.server.apis.ApiHelper.execute(ApiHelper.kt:29)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.airbyte.server.apis.SourceDefinitionSpecificationApiController.getSpecificationForSourceId(SourceDefinitionSpecificationApiController.java:46)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.airbyte.server.apis.$SourceDefinitionSpecificationApiController$Definition$Exec.dispatch(Unknown Source)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.micronaut.context.AbstractExecutableMethodsDefinition$DispatchedExecutableMethod.invokeUnsafe(AbstractExecutableMethodsDefinition.java:461)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.micronaut.context.DefaultBeanContext$BeanContextUnsafeExecutionHandle.invokeUnsafe(DefaultBeanContext.java:4350)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.micronaut.web.router.AbstractRouteMatch.execute(AbstractRouteMatch.java:272)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.micronaut.web.router.DefaultUriRouteMatch.execute(DefaultUriRouteMatch.java:38)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.micronaut.http.server.RouteExecutor.executeRouteAndConvertBody(RouteExecutor.java:498)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.micronaut.http.server.RouteExecutor.lambda$callRoute$5(RouteExecutor.java:475)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.micronaut.core.execution.ExecutionFlow.lambda$async$1(ExecutionFlow.java:87)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at io.micronaut.core.propagation.PropagatedContext.lambda$wrap$3(PropagatedContext.java:211)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
airbyte-server-bbd99c6b7-5dmn5:airbyte-server-container at java.base/java.lang.Thread.run(Thread.java:1583)
```
Do you use taints and tolerations in kubernetes for Airbyte?
nope — it’s a pretty basic setup
Is this on your own cluster or GKE?
I’m also seeing errors like 405 - Failure during reporting of activity result to the server. ActivityId = ee3d7b55-87d5-3c8a-9b1c-48b2bed6bdeb, ActivityType = RunWithWorkload, WorkflowId=check_6291_source, WorkflowType=CheckConnectionWorkflow, RunId=a9dbb574-5c3c-4def-ae20-da392ca73715
and warnings for WARN i.a.c.s.h.h.StatsAggregationHelper(hydrateWithStats):150 - Missing stats for job 6293 attempt 0
oh, good thought, but sadly no. The role and rolebinding appear to be in place.
hm. I’d also look if maybe there are any errors being thrown by the bootloader, especially related to SQL (including Temporal)
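For anyone following the bootloader suggestion above, a rough sketch of how the logs could be narrowed down. It assumes the bootloader logs were saved to text first (for example with kubectl logs on the bootloader pod); the keyword list and the function name suspicious_lines are illustrative guesses, not anything Airbyte defines.

```python
import re

# Assumed keywords; adjust to whatever the bootloader actually logs.
SQL_HINTS = re.compile(r"sql|jdbc|postgres|flyway|migration|temporal", re.IGNORECASE)
# Markers that suggest a line reports a failure rather than normal progress.
ERROR_MARKERS = re.compile(r"\b(ERROR|FATAL|Exception)\b")


def suspicious_lines(log_text: str) -> list[str]:
    """Return log lines that look like errors and mention a database-related term."""
    return [
        line
        for line in log_text.splitlines()
        if ERROR_MARKERS.search(line) and SQL_HINTS.search(line)
    ]
```

Calling suspicious_lines on the saved log text returns only the lines worth reading first; an empty result does not prove the bootloader is healthy, only that none of the assumed keywords appeared on an error line.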
okay, it looks like the newer versions of the helm chart default to storing secrets in a k8s secret called {deployment-name}-airbyte-secrets, but sync job pods are still being created with references to airbyte-config-secrets, which no longer exists
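A minimal sketch of how this kind of mismatch could be confirmed. It assumes a job pod manifest has been exported as JSON (for example with kubectl get pod -o json) and loaded into a dict, and that the names of the secrets that exist in the namespace have been collected separately. The function names and the sample data are hypothetical; only the manifest fields (volumes[].secret.secretName, envFrom[].secretRef.name, env[].valueFrom.secretKeyRef.name) come from the Kubernetes pod spec.

```python
def referenced_secret_names(pod: dict) -> set[str]:
    """Collect every Secret name that a pod manifest refers to."""
    spec = pod.get("spec") or {}
    names: set[str] = set()

    # Secrets mounted as volumes.
    for volume in spec.get("volumes") or []:
        secret = volume.get("secret") or {}
        if secret.get("secretName"):
            names.add(secret["secretName"])

    # Secrets used for environment variables, in regular and init containers.
    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        for source in container.get("envFrom") or []:
            ref = source.get("secretRef") or {}
            if ref.get("name"):
                names.add(ref["name"])
        for env in container.get("env") or []:
            ref = (env.get("valueFrom") or {}).get("secretKeyRef") or {}
            if ref.get("name"):
                names.add(ref["name"])
    return names


def missing_secrets(pod: dict, existing: set[str]) -> set[str]:
    """Return the Secret names the pod references that are not in `existing`."""
    return referenced_secret_names(pod) - existing


# Hypothetical data that mirrors the situation described in the thread.
example_pod = {
    "spec": {
        "containers": [
            {
                "name": "main",
                "env": [
                    {
                        "name": "EXAMPLE_PASSWORD",
                        "valueFrom": {
                            "secretKeyRef": {"name": "airbyte-config-secrets", "key": "example"}
                        },
                    }
                ],
            }
        ]
    }
}
example_existing = {"my-release-airbyte-secrets"}
print(sorted(missing_secrets(example_pod, example_existing)))  # ['airbyte-config-secrets']
```

A non-empty result means the pod cannot start its containers until the referenced secret exists or the reference is updated, which would be consistent with jobs timing out before they begin.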
here’s the issue that resulted from this investigation https://github.com/airbytehq/airbyte/issues/48502