Handling Severed Nodes/Pods in Airbyte on Kubernetes Cluster

Summary

A question about modifying the heartbeat and timeout values in the Airbyte Helm chart so that severed nodes/pods on a Kubernetes cluster are handled more gracefully.


Question

Hey all, I’m running Airbyte on a (high availability, on-prem) Kubernetes cluster. We’re having occasional issues where a node goes down unexpectedly, and we can’t cancel any sync jobs that were using pods on that node. I looked over the docs on the heartbeat (3 hr for source pods) and timeout (24 hr for destination pods) values. I’m wondering if anyone’s found a way to modify those values in the Helm chart, or if there’s any way to configure Airbyte to handle severed nodes/pods more gracefully?



This topic has been created from a Slack thread to give it more visibility. It is in read-only mode here; the original thread remains available in the Airbyte Slack.


Tags: airbyte, kubernetes-cluster, helm-chart, heartbeat, timeout, severed-nodes, pods

What version are you using? We switched architectures for controlling job pods a while back, such that we now have a controller service (the workload-api-server) that is the source of truth for job pods. Assuming you are using the new architecture, you should be able to cancel through the UI/API even while the nodes themselves are down.
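
For context on "cancel through the UI/API", here is a rough sketch of cancelling a stuck job by id against the internal Configuration API. The service name, port-forward target, and the exact endpoint path and payload are assumptions to verify against your deployment and Airbyte version (newer versions may also require auth headers):

```bash
# Port-forward the Airbyte server API (service name depends on your Helm release name)
kubectl port-forward svc/airbyte-airbyte-server-svc 8001:8001 &

# Cancel a specific job by id via the Configuration API
# (endpoint path and payload shape assumed; verify against your Airbyte version's API reference)
curl -X POST http://localhost:8001/api/v1/jobs/cancel \
  -H "Content-Type: application/json" \
  -d '{"id": 12345}'
```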

Setting the following env vars will enable this architecture:

WORKLOAD_LAUNCHER_ENABLED: true
WORKLOAD_API_SERVER_ENABLED: true
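
As a rough sketch of how those env vars could be wired in through the Helm chart, the snippet below uses per-component extraEnv entries in values.yaml. The key names, the components that need the vars, and the chart/release names are assumptions rather than confirmed chart schema, so check the values.yaml of the chart version you're running:

```yaml
# values.yaml (illustrative; which components need these vars varies by chart version)
server:
  extraEnv:
    - name: WORKLOAD_LAUNCHER_ENABLED
      value: "true"
    - name: WORKLOAD_API_SERVER_ENABLED
      value: "true"
worker:
  extraEnv:
    - name: WORKLOAD_LAUNCHER_ENABLED
      value: "true"
    - name: WORKLOAD_API_SERVER_ENABLED
      value: "true"
```

This would then be applied with something like helm upgrade airbyte airbyte/airbyte -f values.yaml (release and chart names assumed here).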