I’m getting the exact same error with a very similar deployment on AWS EKS, using custom S3 logging. This was working with our previous version of Airbyte, v0.38.3-alpha, and started failing after the upgrade to v0.40.14. A resolution that does not involve reverting to MINIO would be much appreciated here.
Hey there, I previously created an issue requesting better documentation on these config options; please add a thumbs up and comment with any other info you’d like to add: https://github.com/airbytehq/airbyte/issues/17649
state:
  ## state.storage.type Determines which state storage will be utilized. One of "MINIO", "S3" or "GCS"
  storage:
    type: "S3"

logs:
  ## logs.accessKey.password Logs Access Key
  ## logs.accessKey.existingSecret
  ## logs.accessKey.existingSecretKey
  accessKey:
    password: ""
    existingSecret: ""
    existingSecretKey: ""

  ## logs.secretKey.password Logs Secret Key
  ## logs.secretKey.existingSecret
  ## logs.secretKey.existingSecretKey
  secretKey:
    password: ""
    existingSecret: ""
    existingSecretKey: ""

  ## logs.storage.type Determines which log storage will be utilized. One of "MINIO", "S3" or "GCS"
  ## Used in conjunction with logs.minio.*, logs.s3.* or logs.gcs.*
  storage:
    type: "s3"

  ## logs.minio.enabled Switch to enable or disable the Minio helm chart
  minio:
    enabled: false

  ## logs.externalMinio.enabled Switch to enable or disable an external Minio instance
  ## logs.externalMinio.host External Minio Host
  ## logs.externalMinio.port External Minio Port
  ## logs.externalMinio.endpoint Fully qualified hostname for s3-compatible storage
  externalMinio:
    enabled: false
    host: localhost
    port: 9000

  ## logs.s3.enabled Switch to enable or disable custom S3 Log location
  ## logs.s3.bucket Bucket name where logs should be stored
  ## logs.s3.bucketRegion Region of the bucket (must be empty if using minio)
  s3:
    enabled: false
    bucket: airbyte-dev-logs
    bucketRegion: ""
We are deploying it with kustomize - I provided a link above to the Airbyte documentation, which discusses the environment variables in the context of the kustomize config files. Unfortunately, the example with Helm variables would not apply to us. Could you provide an example with kustomize configs here: airbyte/kube/overlays/stable at master · airbytehq/airbyte · GitHub? Thank you!
I was having the same issue. I’m using helm and am not super familiar with kustomize, but hopefully this helps. I had to set a couple more values in my values.yaml file to get it to work.
global:
# ...
logs:
accessKey:
password: <access_key_id>
# Downstream charts don't use the secret created by the password above, so we need to pass in the secret info ourselves
existingSecret: <helm_release_name>-airbyte-secrets
existingSecretKey: AWS_ACCESS_KEY_ID
secretKey:
password: <secret_access_key>
# Downstream charts don't use the secret created by the password above, so we need to pass in the secret info ourselves
existingSecret: <helm_release_name>-airbyte-secrets
existingSecretKey: AWS_SECRET_ACCESS_KEY
Dug into the code and found out that the airbyte-worker and airbyte-server deployment.yaml templates only set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables if existingSecret and existingSecretKey are set, or if minio or externalMinio is enabled. There’s nothing there for the case where I’m just passing in the password myself.
For your situation, I assume the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY aren’t being set properly on the worker/server for some reason. Hope that helps!
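Roughly, the relevant part of the worker/server templates looks something like the snippet below (just a sketch, not the exact chart source - the value paths match the values.yaml above, but double-check against the chart version you’re running):

# Illustrative sketch: the credentials are only wired up when an existing secret is referenced
- name: AWS_ACCESS_KEY_ID
  valueFrom:
    secretKeyRef:
      name: {{ .Values.global.logs.accessKey.existingSecret }}
      key: {{ .Values.global.logs.accessKey.existingSecretKey }}
- name: AWS_SECRET_ACCESS_KEY
  valueFrom:
    secretKeyRef:
      name: {{ .Values.global.logs.secretKey.existingSecret }}
      key: {{ .Values.global.logs.secretKey.existingSecretKey }}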
We’re also attempting to upgrade to 0.40.22 with kustomize and are running into the exact same problem with the worker as stated here. We’ve been using S3 for logging instead of Minio as well.
What should be the course of action here? Stay pinned to a version before the WORKER_* vars were introduced, like 0.40.6? @sh4sh any clue?
This is still an issue with kustomization overlays on version 0.40.26. Oddly, the helm chart works correctly for this (other things are broken there, which is why I’m trying kustomization), so there’s probably a workaround.
I have confirmed a workaround to get this fixed in version 0.40.26. In order to configure S3 logs correctly using the kustomization overlays, you need to follow the instructions found here as well as set WORKER_LOGS_STORAGE_TYPE=S3. Note that WORKER_STATE_STORAGE_TYPE needs to remain unchanged.
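For reference, the addition to the overlay’s .env looks roughly like this (bucket name and region are placeholders; the remaining S3 settings come from the linked instructions):

S3_LOG_BUCKET=<your_log_bucket>
S3_LOG_BUCKET_REGION=<your_bucket_region>
WORKER_LOGS_STORAGE_TYPE=S3
# Leave WORKER_STATE_STORAGE_TYPE at its default (MINIO)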
We are using Kustomize and our Airbyte version is 0.40.23. The issue we are seeing is that we can’t set a custom S3 bucket for state storage. The workaround right now is to turn Minio back on just for the state information.
I put up a fix earlier based on the limited knowledge I have.
Hi - I’m still having some trouble with this and wondered if you could confirm your setup.
Env overlay:
S3_LOG_BUCKET=<your_s3_bucket_to_write_logs_in>
S3_LOG_BUCKET_REGION=<your_s3_bucket_region>
# Set this to empty.
S3_MINIO_ENDPOINT=
# Set this to empty.
S3_PATH_STYLE_ACCESS=
WORKER_LOGS_STORAGE_TYPE=S3
# leave as is; for me it defaults to MINIO
# WORKER_STATE_STORAGE_TYPE=
These configuration changes solved the issue for us (note that we are using the k8s manifests directly, not the helm chart):
In .env, the env var GCS_LOG_BUCKET needs to be set to the log bucket, and an additional variable called STATE_STORAGE_GCS_BUCKET_NAME needs to be set to the state storage bucket. As far as I can tell, STATE_STORAGE_GCS_BUCKET_NAME isn’t documented, but you can see that it is part of the GCS configuration block for the workers: airbyte/application.yml at 7676af5f5fb53542ebaff18a415f9c89db417055 · airbytehq/airbyte · GitHub. The Minio/S3 variables are mostly nulled out for us, so the config variables for logs and storage largely look like so:
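(A sketch with placeholder values - bucket names and the credentials path are placeholders, and this assumes both WORKER_* storage types are set to GCS.)

# GCS for both logs and state
WORKER_LOGS_STORAGE_TYPE=GCS
WORKER_STATE_STORAGE_TYPE=GCS
GCS_LOG_BUCKET=<your_gcs_log_bucket>
STATE_STORAGE_GCS_BUCKET_NAME=<your_gcs_state_bucket>
GOOGLE_APPLICATION_CREDENTIALS=<path_to_mounted_sa_json>

# Minio/S3 variables nulled out
S3_LOG_BUCKET=
S3_LOG_BUCKET_REGION=
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
S3_MINIO_ENDPOINT=
S3_PATH_STYLE_ACCESS=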
Secondly, the manifests for the workers need to be modified to actually pass the GCS state bucket variables, as they currently do not. In the airbyte-worker deployment (airbyte/worker.yaml at master · airbytehq/airbyte · GitHub), we added the following vars (note that GOOGLE_APPLICATION_CREDENTIALS is reused here, but it is probably better to have separate SA credentials for writing state):
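Roughly like this (a sketch - the configmap name and keys assume the stock manifests, so double-check them against your overlay):

# Added under the airbyte-worker container's env in worker.yaml
- name: STATE_STORAGE_GCS_BUCKET_NAME
  valueFrom:
    configMapKeyRef:
      name: airbyte-env
      key: STATE_STORAGE_GCS_BUCKET_NAME
- name: STATE_STORAGE_GCS_APPLICATION_CREDENTIALS
  valueFrom:
    configMapKeyRef:
      name: airbyte-env
      key: GOOGLE_APPLICATION_CREDENTIALS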
Currently on 0.40.30, though the version is a bit irrelevant in my case - I use the manifests defined here: airbyte/kube at master · airbytehq/airbyte · GitHub, with the modification to worker.yaml from my post above. I never had an issue with GCS logging, but I did have an issue with workers writing state to GCS, because the state bucket and creds are not passed in the worker deployment configs.
As far as I can tell, as of the latest commit on master, the manifests still have that issue and require the worker modification posted above for the deployment to function on GCP.