Upgrading Airbyte Helm to version 0.50.48 with logging on GCS

Summary

The worker's replication-orchestrator was failing with a `state store missing status` error after the upgrade; it was resolved by setting `CONTAINER_ORCHESTRATOR_ENABLED` to `false` in the worker configuration.


Question

Hello team,
For those this might help: we upgraded to the latest Airbyte Helm chart (Airbyte version 0.50.48), with our logging on GCS.

We hit this generic issue:

```
io.temporal.failure.ApplicationFailure: message='io.temporal.serviceclient.CheckedExceptionWrapper: io.airbyte.workers.exception.WorkerException: Running the launcher replication-orchestrator failed', type='java.lang.RuntimeException', nonRetryable=false
```

Looking more deeply into the worker logs, we found:

```
state store missing status
```
We fixed it by adding this to the worker configuration:

```yaml
  extraEnv:
    - name: CONTAINER_ORCHESTRATOR_ENABLED
      value: "false"
```
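For context, this fragment goes under the `worker` key of the chart's values file. A minimal sketch (assuming the standard `worker.extraEnv` layout of the Airbyte Helm chart):

```yaml
# values.yaml (sketch — verify the key layout against your chart version)
worker:
  extraEnv:
    - name: CONTAINER_ORCHESTRATOR_ENABLED
      value: "false"
```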
Note that [dis-sid](https://github.com/dis-sid) reported the same in the meantime: https://github.com/airbytehq/airbyte/issues/18040


---

This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. [Click here](https://airbytehq.slack.com/archives/C021JANJ6TY/p1707754141248829) if you want to access the original thread.

[Join the conversation on Slack](https://slack.airbyte.com)

<sub>
["upgrading-airbyte-helm", "logging-gcs", "worker-replication-orchestrator", "state-store", "container-orchestrator-enabled"]
</sub>

This is, unfortunately, usually a problem with the bucket configuration, where the Helm chart does not map the environment variables correctly.

https://docs.airbyte.com/deploying-airbyte/on-kubernetes-via-helm#external-logs has instructions for S3, but they may give you some hints. <@U0697SLH4TS> is working on updating the documentation for GCS logs.

I managed to get it working on GCP by adding the following four environment variables:

- `STATE_STORAGE_GCS_BUCKET_NAME`: GCS bucket where the orchestrator state is saved.
- `STATE_STORAGE_GCS_APPLICATION_CREDENTIALS`: Path to the service account credentials file that was mounted using a volume.
- `CONTAINER_ORCHESTRATOR_SECRET_NAME`: Name of the secret that was mounted.
- `CONTAINER_ORCHESTRATOR_SECRET_MOUNT_PATH`: Path of the folder in which the secret was mounted.
This gives the following in the worker section of my values file:

```yaml
extraVolumeMounts:
  - name: google-creds
    readOnly: true
    mountPath: /secrets/gcs-log-creds/<my_folder>
extraVolumes:
  - name: google-creds
    secret:
      secretName: google-credentials-secret
extraEnv:
  - name: STATE_STORAGE_GCS_BUCKET_NAME
    value: <my_bucket>
  - name: STATE_STORAGE_GCS_APPLICATION_CREDENTIALS
    value: /secrets/gcs-log-creds/<my_folder>/<my_secret_file>.json
  - name: CONTAINER_ORCHESTRATOR_SECRET_NAME
    value: google-credentials-secret
  - name: CONTAINER_ORCHESTRATOR_SECRET_MOUNT_PATH
    value: /secrets/gcs-log-creds/<my_folder>
```
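To sanity-check this wiring from inside the worker pod (e.g. via `kubectl exec`), a small helper can be run as a one-off script. This is plain Python, not an Airbyte API; the variable names are the four from the snippet above:

```python
import os

# The four variables the orchestrator state store needs
# (names taken from the values-file snippet above).
REQUIRED_VARS = (
    "STATE_STORAGE_GCS_BUCKET_NAME",
    "STATE_STORAGE_GCS_APPLICATION_CREDENTIALS",
    "CONTAINER_ORCHESTRATOR_SECRET_NAME",
    "CONTAINER_ORCHESTRATOR_SECRET_MOUNT_PATH",
)

def missing_config(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

def credentials_file_ok(env=os.environ):
    """Check that the mounted credentials file actually exists on disk."""
    path = env.get("STATE_STORAGE_GCS_APPLICATION_CREDENTIALS", "")
    return bool(path) and os.path.isfile(path)

if __name__ == "__main__":
    missing = missing_config()
    if missing:
        print("missing:", ", ".join(missing))
    elif not credentials_file_ok():
        print("credentials file not found at mount path")
    else:
        print("state storage config looks complete")
```

If the second check fails, the secret volume mount path and the `STATE_STORAGE_GCS_APPLICATION_CREDENTIALS` value are out of sync.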

<@U042YPXU4DV> this worked for me
The key was those two variables, `CONTAINER_ORCHESTRATOR_SECRET_NAME` and `CONTAINER_ORCHESTRATOR_SECRET_MOUNT_PATH`, which I didn't know about (maybe they should be documented there?).

Also, do you mind if I post the solution on the issue?

Working on adding this shortly to the docs! Thanks for the input everyone.

Shouldn't we fix it in the Helm chart directly instead?

The team is working on refactoring all the log-related variables <@U05QB3UJK9T>; hopefully it will be fixed in a few weeks.

Documentation for Airbyte Helm deployments has been updated to include external logging to a GCS bucket: https://docs.airbyte.com/deploying-airbyte/on-kubernetes-via-helm