Issue with Airbyte install and destination verification

Summary

The user is experiencing issues with a new Airbyte install (chart 0.524.0) while trying to verify a destination. There are odd messages in the pod related to the destination check, and they are unsure of the cause of the issue or its relation to the 504 error in the UI.


Question

re: https://airbytehq.slack.com/archives/C01AHCD885S/p1725522809127229

I am experiencing issues with a new Airbyte install (chart 0.524.0), running abctl v0.14.1, while trying to verify a destination.

• I have verified Postgres connectivity using psql running on the same EC2 instance as the k8s pods, and there is no evidence of a Postgres destination connection failure anywhere in the k8s logs.
• I have enough resources: Airbyte has been freshly installed on an otherwise idle EC2 instance with 16 GB of RAM (AWS t3.xlarge).
• Postgres sources are verifying just fine.
Per the linked message, there are odd messages in the pod that appear to relate to the destination check:

“2024-09-05T07:38:35.642324378Z pool-3-thread-1 ERROR Recursive call to appender SecretMaskRewrite”

I have no idea what has caused this, or whether it is related to the 504 error seen in the UI, although that seems possible based on the circumstantial evidence in the logs.



This topic has been created from a Slack thread to give it more visibility.

["airbyte-install", "destination-verification", "postgres", "kubernetes", "504-error"]

Also, is it possible to configure the timeout of the nginx ingress controller using abctl, or do I need to patch it manually after abctl has installed it?


So, this is the request that was getting a 504:

10.11.13.203 - - [05/Sep/2024:07:39:23 +0000] "POST /api/v1/destinations/check_connection HTTP/1.1" 504 562 "http://airbyte.lambert.upowr.internal:8000/workspaces/218abdf9-17ec-4c34-b96f-0b690f413601/destination/c9d91ec7-d42d-494e-b2b7-b7d1c1ff793f" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36" 1402 60.003 [airbyte-abctl-airbyte-abctl-airbyte-webapp-svc-http] [] 10.244.0.9:8080 0 60.000 504 03cd5690f2a22d6913abe65976be0d7a

Once I fixed the proxy timeout, I was able to see the actual error message via the same endpoint. There was no trace of this error message in any of the k8s logging, AFAICT. The end-to-end browser connection has to work without a timeout in any intervening layer for you to have any hope of seeing the message, because the destination check pod does not log it: the pod logs only that the check itself ran successfully, not that the downstream connection failed.
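For anyone who wants to read the response outside the browser, the same request can probably be replayed with curl. This is only a sketch: the host and destination ID are copied from the access-log line above, while the JSON body shape (a single `destinationId` field) and the absence of authentication are assumptions that may not hold for every install. The script prints the command and only sends it when `APPLY=1` is set.

```shell
#!/bin/sh
# Host and destination ID are copied from the access-log line in this thread.
BASE_URL="http://airbyte.lambert.upowr.internal:8000"
DESTINATION_ID="c9d91ec7-d42d-494e-b2b7-b7d1c1ff793f"

# Assumed request body shape: a JSON object with a single destinationId field.
PAYLOAD=$(printf '{"destinationId":"%s"}' "$DESTINATION_ID")

# Give curl up to 180 s so the client is not the first layer to give up.
set -- curl -sS --max-time 180 -X POST \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" \
  "$BASE_URL/api/v1/destinations/check_connection"

echo "$*"                      # show the command for review
if [ "${APPLY:-0}" = "1" ]; then
  "$@"                         # set APPLY=1 to actually send the request
fi
```

The request still passes through the ingress, so it will return a 504 after 60 seconds unless the proxy read timeout has already been raised.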

more context here:

https://github.com/airbytehq/airbyte/issues/45156

Ok, thanks for the info, and thanks for filing an issue

We’ll look at tweaking the nginx timeout in abctl

The UI didn’t show anything useful about this error?

After a lot of fiddling, I raised the proxy_read_timeout on the ingress controller to 120 seconds, and then I was able to see the root cause of the issue, which was:

```
State code: 28000; Message: FATAL: no pg_hba.conf entry for host "10.11.65.17", user "u_lambert_rw", database "lambert_warehouse", no encryption
```
There are two problems here:

• The pod that does the checking doesn’t log any errors relating to the status of the check. It simply logs a zero to indicate that the mechanics of the check itself did not fail.
• The timeout funnel between the webapp and the job that does the checking is inverted, so the webapp times out before the destination check does.
Since the destination check does not log any failure information to its own log, and the ingress proxy terminates the connection before the error response is logged, any information about a destination check that times out is lost until the end user works out how to reconfigure the timeout on the ingress proxy.

For the information of others, this is done with:

`kubectl -n airbyte-abctl edit ingresses.networking.k8s.io ingress-abctl`

and adding these annotations to the top-level metadata:

```
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
```
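The same annotation can probably be applied without opening an editor. A minimal sketch using `kubectl annotate`, assuming the namespace and ingress name from the edit command above. Whether abctl preserves the annotation across a later upgrade or reinstall is not established in this thread. The script prints the command and only runs it when `APPLY=1` is set.

```shell
#!/bin/sh
# Namespace and ingress name are taken from the edit command in this thread.
NAMESPACE="airbyte-abctl"
INGRESS="ingress-abctl"
TIMEOUT_SECONDS="120"

# ingress-nginx reads this annotation as a number of seconds.
ANNOTATION="nginx.ingress.kubernetes.io/proxy-read-timeout=${TIMEOUT_SECONDS}"

# --overwrite replaces the value if the annotation is already present.
set -- kubectl -n "$NAMESPACE" annotate ingress "$INGRESS" "$ANNOTATION" --overwrite

echo "$*"                      # show the command for review
if [ "${APPLY:-0}" = "1" ]; then
  "$@"                         # set APPLY=1 to actually patch the ingress
fi
```

Afterwards, `kubectl -n airbyte-abctl get ingress ingress-abctl -o yaml` should show the annotation under `metadata.annotations`.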

I can see two ways to fix this:

• Allow the destination check to actually log any errors it gathers, instead of relying on a client session that may already have been terminated by the downstream proxy.
• Adjust the timeout funnel so that there is some chance the error response will actually be delivered.

Which endpoint were you hitting to (eventually) discover the error?

(The referrer URL in the log line above indicates which page I was visiting; the 504 was returned for the request that page generated.)
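A note on the earlier psql check: the pg_hba.conf error reported in this thread ends with "no encryption", which suggests that the connector connected without TLS, whereas psql defaults to `sslmode=prefer` and tries TLS first. This is an assumption; the thread does not confirm it, but it would explain why psql from the same EC2 instance succeeded. A minimal sketch for making the psql check match, using a placeholder host name because the thread does not give one:

```shell
#!/bin/sh
# DB_HOST is a placeholder: the thread does not name the Postgres server.
DB_HOST="${DB_HOST:-postgres.example.internal}"
DB_USER="u_lambert_rw"         # user from the error message
DB_NAME="lambert_warehouse"    # database from the error message

# sslmode=disable forces an unencrypted connection, so pg_hba.conf is matched
# the same way as for a client that does not use TLS.
CONNINFO="host=${DB_HOST} user=${DB_USER} dbname=${DB_NAME} sslmode=disable connect_timeout=10"

set -- psql "$CONNINFO" -c "SELECT 1"

echo "$*"                      # show the command for review
if [ "${APPLY:-0}" = "1" ]; then
  "$@"                         # set APPLY=1 to actually connect
fi
```

If this fails with the same FATAL message while a plain psql call succeeds, the server likely only has `hostssl` entries for that client. Enabling SSL on the destination, or adding a matching `host` entry to pg_hba.conf, would then be the probable fixes.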