Troubleshooting custom connector deployment on Airbyte running on a GKE cluster

Summary

User is facing an HTTP Internal Server Error when trying to use a custom connector on Airbyte running on a GKE cluster. The user has ensured that the cluster has access to Google Artifact Registry.


Question

Hello everyone - my Airbyte is running on a GKE cluster and I’m trying to use a custom connector. For this I put the Docker image on Google Artifact Registry and tried to add the connector in the Airbyte UI, passing my repo and image tag information; however, Airbyte returns an http.internalservererror. I already ensured that the cluster has access to Artifact Registry, but the error continues.
Has anyone already done this or tried it?



This topic has been created from a Slack thread to give it more visibility.
It is in read-only mode here.


["airbyte", "gke-cluster", "custom-connector", "docker-image", "google-artifact-registry", "http-internalservererror"]

Have you checked the logs from the pods in GKE?

How are you authenticating to Artifact Registry? And is the image in the same project?

Also, make sure that whatever service account GKE is running as (either the default or a custom one you supplied when the cluster was created) has the following IAM Role (or its component permissions):
roles/artifactregistry.reader

Generally that’s all it should take, but there are of course exceptions.
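For reference, a minimal sketch of granting that role (the project ID and service account email below are placeholders, not values from this thread):

```
# Grant Artifact Registry read access to the service account the GKE nodes run as
gcloud projects add-iam-policy-binding MY_PROJECT_ID \
  --member="serviceAccount:my-gke-sa@MY_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/artifactregistry.reader"
```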

<@U05JENRCF7C> Yes, but can’t find any information there yet.

<@U035912NS77> About the authentication: I created a service account, gave it the Artifact Registry permissions, and also applied it on the cluster. The artifact is in the same project; actually, I already use this setup on Airflow, with the same authentication.
This is the error that I’m receiving:

Which Airbyte version are you using? In some older versions there was an issue where the pod couldn’t start because of length limitations in labels or annotations (I don’t remember exactly).
Have you checked kubectl get events?

Still, I’d give the logs another try.
Sometimes I connect to the cluster, ensure that kubectl works, and use the stern tool:
https://github.com/stern/stern
I run `stern --tail 0 .`, then click in the user interface, and Ctrl+C to stop capturing more logs.
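A minimal sketch of that workflow (the `airbyte` namespace is an assumption; adjust it to your install):

```
# Look for scheduling or image-pull failures surfaced as cluster events
kubectl get events -n airbyte --sort-by=.lastTimestamp | tail -n 20

# Stream only new log lines from every pod in the namespace, then trigger
# the connector in the Airbyte UI and press Ctrl+C to stop capturing
stern --namespace airbyte --tail 0 .
```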

<@U05JENRCF7C> I’m using version 0.63.4 right now. I haven’t checked the events, but I will do that now and also use this tool that you shared. Thanks a lot.

You may also want to check whether the service account set up on the GKE cluster (under the Security section) is the same as the one set in the serviceAccountName field of your values.yaml (if not, you may not be granting permissions on the right account)
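A quick way to see which account the worker pods actually run as (the `airbyte-worker` deployment name and `airbyte` namespace are assumptions):

```
# Print the service account attached to the worker pod template
kubectl -n airbyte get deployment airbyte-worker \
  -o jsonpath='{.spec.template.spec.serviceAccountName}{"\n"}'
```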

If you don’t see log entries related to the container image not being found or such, I would look specifically for an auth error in Cloud Logging—it’s possible that it either isn’t using the service account you’re expecting it to, or that there are additional grants that are missing.

<@U035912NS77> I already have a credential that is used to access the Artifact Registry and I’m trying to use the same one, because my image is in the same repo.
I attached the log from the worker pod from when I try to add the image in the UI.

if you look in Cloud Logging (https://console.cloud.google.com/logs) around that time, do you see any auth/permission errors listed?

I checked the logs and can’t find any log with an authentication or permission error related to this, even looking at the worker pod. But I don’t know if I did it correctly. Do you have a query example for this?
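For anyone reading along, a filter like the sketch below can surface permission errors; the exact payload strings and time window are assumptions:

```
# Search recent GKE container logs for auth/permission failures
gcloud logging read \
  'resource.type="k8s_container" AND severity>=ERROR AND (textPayload:"PERMISSION_DENIED" OR textPayload:"403" OR textPayload:"denied")' \
  --freshness=1h --limit=50
```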

This is the message shown on the worker pod when running from the UI:

```
Using existing AIRBYTE_ENTRYPOINT: python /airbyte/integration_code/main.py
2024-07-08T22:29:14.203911847Z Waiting on CHILD_PID 7
2024-07-08T22:29:14.204103586Z PARENT_PID: 1
2024-07-08T22:29:16.117158405Z EXIT_STATUS: 139
```

We’ve set the var JOB_KUBE_MAIN_CONTAINER_IMAGE_PULL_SECRET as suggested here https://docs.airbyte.com/operator-guides/using-custom-connectors/.
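For context, the value of that variable names a Kubernetes image pull secret in the Airbyte namespace. A sketch of creating one for Artifact Registry from a service-account key, where the registry host, key file, namespace, and secret name are all assumptions:

```
# Create a docker-registry secret the job pods can use to pull from Artifact Registry
kubectl -n airbyte create secret docker-registry gcp-service-account \
  --docker-server=us-central1-docker.pkg.dev \
  --docker-username=_json_key \
  --docker-password="$(cat sa-key.json)"
```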

However, after setting it we got this error:

```
[map[name:JOB_KUBE_MAIN_CONTAINER_IMAGE_PULL_SECRET value:gcp-service-account] map[name:JOB_KUBE_MAIN_CONTAINER_IMAGE_PULL_SECRET valueFrom:map[configMapKeyRef:map[key:JOB_KUBE_MAIN_CONTAINER_IMAGE_PULL_SECRET name:airbyte-0-1720015847-airbyte-env]]] map[name:SECRET_PERSISTENCE value:<nil>]]
 doesn't match $setElementOrder list:
```

<@U06SV3WK399> helm has issues updating resources sometimes. The fastest way for me was to delete the worker deployment (kubectl delete deployment ...) and repeat the helm install/upgrade. I suggest not having active synchronizations when doing that.
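A sketch of that sequence (release name, namespace, and deployment name are assumptions):

```
# Remove the stuck deployment, then let helm recreate it
# (assumes the chart repo was added with: helm repo add airbyte https://airbytehq.github.io/helm-charts)
kubectl -n airbyte delete deployment airbyte-worker
helm upgrade --install airbyte airbyte/airbyte -n airbyte -f values.yaml
```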

<@U05JENRCF7C> in which part of the helm chart did you add this var, JOB_KUBE_MAIN_CONTAINER_IMAGE_PULL_SECRET? We are trying to add it under the worker part, in extraEnv.

You may also want to check the environment variables listed on the deployment; most of the time that I see the `The order in patch list . . . doesn't match $setElementOrder list` error, it’s actually because the value is duplicated (i.e. already being merged in the templates, and doesn’t need to be passed in extraEnv). Not sure if that’s the case on this one, but worth checking
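One way to spot such a duplicate (deployment name and namespace assumed, as above):

```
# A count greater than 1 means the variable is defined twice on the deployment
kubectl -n airbyte get deployment airbyte-worker -o yaml \
  | grep -c "JOB_KUBE_MAIN_CONTAINER_IMAGE_PULL_SECRET"
```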

Cool, we were able to set the var; at least that part is working. But we’re still getting this message from the pod when we try to pull the custom connector from Artifact Registry.

```
2024-07-08T22:29:14.203911847Z Waiting on CHILD_PID 7
2024-07-08T22:29:14.204103586Z PARENT_PID: 1
2024-07-08T22:29:16.117158405Z EXIT_STATUS: 139
```
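For what it’s worth, an exit status above 128 means the container’s main process was killed by a signal (status minus 128), which any POSIX shell can decode:

```
# 139 - 128 = 11, i.e. SIGSEGV: the connector process crashed with a segfault
kill -l $((139 - 128))
```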