Struggling with egress proxy configuration for GCP deployment in docker-compose

Summary

Struggling to configure egress proxy for GCP deployment in docker-compose. Web platform is up but sources are not up-to-date. Need list of required domains/URLs called by the platform.


Question

Hey folks, struggling a bit to configure my docker-compose GCP deployment to work behind an egress proxy. I believe I configured the Docker config and Airbyte’s docker-compose.yaml with the proxy variables correctly, and the web platform is up and running. The available sources, however, are not up-to-date, and the source docs are not loading. Does anyone have a list of all the domains / URLs the platform itself calls (aside from the ones called by the actual syncs)?



This topic has been created from a Slack thread to give it more visibility.

["egress-proxy-configuration", "gcp-deployment", "docker-compose", "web-platform", "source-docs", "required-domains", "urls"]

Thanks for the quick response! They appear to be up-to-date, but they are not (see attached screenshot), so I’m unable to upgrade.

I saw in a GitHub issue on airbytehq/airbyte that there is an AirbyteGithubStore resource that might call the GitHub repo for the updated list. I couldn’t find the URL, although I’m guessing it is just the public airbyte repo’s URL.

I’m waiting on this change to be made by the responsible team, but trying to get ahead by asking the community about any other known domains (thanks for hub.docker.com)!

since the image versions are the same, I’m expecting that they are actually current, but it may not be able to pull the metadata to confirm

They’re not current unfortunately… The UI is lying

(you’ll see those external docs links when you hover over the image names)

you might be able to find something in the server logs though
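If it helps, here is a rough sketch of how to dig hostnames out of the server logs, so you can see which ones the platform is trying to reach. It only assumes the logs contain plain http(s) URLs; the function name and the sample log lines are made up for illustration and are not real Airbyte output.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches http(s) URLs up to the next whitespace, quote, or bracket-like character.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>|)\]]+")


def hosts_in_logs(lines):
    """Count the hostname of every http(s) URL found in the given log lines."""
    counts = Counter()
    for line in lines:
        for url in URL_PATTERN.findall(line):
            try:
                host = urlsplit(url).hostname
            except ValueError:
                # Skip anything that only looks like a URL.
                continue
            if host:
                counts[host] += 1
    return counts


# Hypothetical log lines, only to show the shape of the result.
sample = [
    "WARN failed to fetch https://connectors.airbyte.com/files/example.json",
    "INFO pulling image metadata from http://hub.docker.com/v2/ timed out",
    "INFO nothing to see here",
]
for host, n in hosts_in_logs(sample).most_common():
    print(n, host)
```

To use it on a real deployment, feed it the lines from the server container's logs (for example the output of `docker compose logs`) instead of `sample`.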

I think they’re pulling the main ones from GitHub directly:
https://connectors.airbyte.com/files/generated_reports/connector_registry_report.html

locked down egress is gonna be a pain for this type of workload though, lol

(just thinking through all the different domains used for auth redirection and such)

tell me about it… unfortunately it’s the client’s constraint, not my choice.

awesome, best of luck and looking forward to hearing how it goes!

You can update the sources on your instance by going to Settings > Sources and clicking Update All.

I believe the docs shown are pulled from the local source, so I wouldn’t think there would be any other domains you have to allow

I guess that at least custom connectors (maybe the other ones too?) would come from hub.docker.com

basically anything loaded in cloud that is certified or community, and then custom connectors you have to either use dockerhub or you can set up another option when adding it

I’m agency and SaaS side . . . I feel you

I’ll update here for future reference once I successfully sync my planned sources

thanks for the help!

One more while I’m at it . . . seems like there may also be some metadata pulled from Google Cloud Storage. It’s not in the code (since they’re using the Python SDK), but it would be storage.googleapis.com. Here’s where I’m seeing that:
https://github.com/airbytehq/airbyte/tree/49bb24684749210dac88b9d24919e64b859140f4/airbyte-ci/connectors/metadata_service

To be honest, I’m not 100% sure when the metadata service is used for these operations, but the readme there seems to indicate it could be.

Following the code flow from here, it looks more like DockerHub to me:
https://github.com/airbytehq/airbyte/blob/d26bd10c8756f5a607c2d0e89b43993853017798/airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/upgrade_base_image/commands.py

But maybe metadata is separate? And I don’t know if those s3_build_cache_* values actually mean S3, or if they are used with GCS in S3 compatibility mode. Basically, “it’s complicated” 😂

As for the docs links, this is specified per connector for custom, but for the certified and community ones they all point to https://docs.airbyte.com/* . . . so if the server pulls those to re-serve them, that may be why they’re blank
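For anyone landing here later, a minimal sketch for checking the hosts that came up in this thread from behind the proxy. The list is only what was mentioned above (some of it guesswork), not a confirmed complete allowlist, and `probe` / `CANDIDATE_HOSTS` are names I made up. It relies on urllib picking up HTTP_PROXY / HTTPS_PROXY / NO_PROXY from the environment, so it should be run with the same proxy variables the Airbyte containers get.

```python
import urllib.error
import urllib.request

# Hosts mentioned in this thread. A starting point, not a confirmed full list.
CANDIDATE_HOSTS = [
    "hub.docker.com",          # connector images
    "connectors.airbyte.com",  # connector registry report linked above
    "docs.airbyte.com",        # docs links for certified/community connectors
    "storage.googleapis.com",  # possible metadata in GCS (unconfirmed)
    "github.com",              # guessed from the AirbyteGithubStore mention
]


def probe(host, open_url=urllib.request.urlopen, timeout=10.0):
    """Try an HTTPS GET to https://<host>/ and return a short status string.

    open_url is injectable so the function can be exercised without network.
    """
    url = f"https://{host}/"
    try:
        with open_url(url, timeout=timeout) as resp:
            return f"reachable (HTTP {resp.status})"
    except urllib.error.HTTPError as exc:
        # An HTTP error status still means something answered the request.
        return f"reachable (HTTP {exc.code})"
    except OSError as exc:
        # Covers URLError, refused proxy tunnels, DNS failures, and timeouts.
        return f"blocked or unreachable ({exc})"
```

Looping over `CANDIDATE_HOSTS` and printing `probe(host)` for each one gives a quick picture of what the proxy lets through. A "reachable" result for the root URL does not prove every path the platform needs is allowed, so treat it as a first check only.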