Advice on CI/CD for Connector Development

Summary

User seeks guidance on version control and CI/CD practices for developing internal source connectors in Airbyte, specifically for REST APIs and custom connector forks.


Question

:wave: i am new to airbyte connector development… in my organisation, we anticipate building quite a few internal source connectors (e.g. for internal REST APIs or our opinionated forks of some connectors). Any advice, prior art or case studies around version control and CI/CD for connector source code and configs in such a case?



This topic has been created from a Slack thread to give it more visibility.

['connector-development', 'ci-cd', 'version-control', 'internal-source-connectors']

One tip is to use the Connector Builder when possible, as it makes maintaining connectors much easier (especially in a mixed team). As a programmer, I was very skeptical of it at first... but we’ve moved all our CDK connectors to Builder (save one, which has only just recently become possible now that they added XML types). If there are low-code features not yet supported in Builder, it lets you toggle to YAML mode and implement them that way (so effectively Builder is just a fancy editor for the low-code YAML connectors). This also means people on your team don’t need a dev environment set up to update connectors, which is handy.

There are still corner cases for using CDK connectors, but they are increasingly rare. And Builder connectors have the advantage that they automatically inherit platform-level performance improvements and such, whereas CDK images are locked until you build them again against a newer CDK version.

So my preference, to minimize maintenance pain and give the non-developer types the ability to add fields or endpoints on their own, is always Builder > Low-Code > CDK.

There is also a CI/CD component and some instructions that exist around it . . . if you give it a quick search in Slack, you’ll find several discussions around it from the Airbyte folks.

There is a good article on building custom connector images: https://medium.com/israeli-tech-radar/docker-image-for-building-custom-airbyte-connectors-07ef41685a9d. Once you have built your connector images, you can deploy them using the configuration API: /v1/source_definitions/create_custom (https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/source_definitions/create_custom) and /v1/source_definitions/update (https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/source_definitions/update). Both steps should be straightforward to run from your CI/CD.
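To make the deploy step a bit more concrete, here is a minimal Python sketch of how a CI job might call those two endpoints. Treat it as an illustration under assumptions rather than a reference client: the base URL, the helper names, and the request body field names (`workspaceId`, `sourceDefinition`, `dockerRepository`, `dockerImageTag`, `documentationUrl`, `sourceDefinitionId`) are my reading of the configuration API docs linked above and should be checked against your Airbyte version. Authentication is left out entirely.

```python
import json
import urllib.request

# Assumption: base URL of a self-hosted Airbyte configuration API.
API_BASE = "http://localhost:8000/api"


def create_payload(workspace_id, name, repository, tag, docs_url):
    """Body for POST /v1/source_definitions/create_custom (field names assumed)."""
    return {
        "workspaceId": workspace_id,
        "sourceDefinition": {
            "name": name,
            "dockerRepository": repository,
            "dockerImageTag": tag,
            "documentationUrl": docs_url,
        },
    }


def update_payload(source_definition_id, tag):
    """Body for POST /v1/source_definitions/update (field names assumed)."""
    return {"sourceDefinitionId": source_definition_id, "dockerImageTag": tag}


def build_request(path, payload, api_base=API_BASE):
    """Build (but do not send) a JSON POST request for the given API path."""
    return urllib.request.Request(
        url=f"{api_base}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post(path, payload, api_base=API_BASE):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(path, payload, api_base)) as resp:
        return json.load(resp)
```

In a pipeline you would call `post("/v1/source_definitions/create_custom", create_payload(...))` once when the connector is first registered, then `post("/v1/source_definitions/update", update_payload(...))` with the new image tag on each release.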

Yeah, there is great news and meh news.

• Builder is a serious IDE already, and we are betting on it and investing more. Internally, in a couple of quarters we will use Builder to maintain all API connectors, including the most complex ones.
• There are terraformable APIs to schedule your connectors to your clusters, but they don’t take raw Builder manifests yet. They exist for custom images but not for Builder manifests. We will address this; being loud about it helps us prioritize.
• The CI tooling is pretty great if you’re at Airbyte proper. airbyte-ci has a LOT of assumptions about repo structure and only works well if you have a fork of the airbyte repo and your tokens are set up correctly. It will fail in mysterious ways if you’re trying to make your own new repo for just one connector and use airbyte-ci from PyPI.
◦ I am interested in improving it, as we want to help our enterprise design partners build their connectors and run CI well. If you’re building a complex connector on top of CDK and need CI, I’m happy to pair. I’ve made a few attempts but haven’t yet found the time to do it well.

<@U065RJ879QT> i hope to work with you on getting a good pattern for CDK going. At the moment we are in a proof-of-concept evaluation, so getting proper CI/CD around custom connectors is not yet a priority

there is a lot that I like about Builder as an “editor”, but before calling it an IDE I would like to hear more about how, as a tool, it accommodates version control and testing and enables devops principles… basically, am i sleeping well at night having production workloads in our data stack with Builder as our primary tool?

and thanks so much for bringing up terraform, i have questions about it but they are not yet urgent

> basically am i sleeping well at night to have production workloads in our data stack, with Builder as our primary tool?
Our approach is to use Connector Builder only as an environment for building/testing connectors. When we are ready to run in production, we export the raw manifest from Connector Builder and commit it to a corresponding connector folder (e.g. airbyte-integrations/connectors/<some-custom-connector>) in our fork of the airbyte repo (https://github.com/airbytehq/airbyte). Then all our version control, testing, CI/CD, etc. are built around this repo, and our custom connectors in production are always deployed from Docker images, not directly from Connector Builder.
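As one example of the kind of check a CI job could run against such a fork, the Python sketch below flags connector folders whose exported manifest is missing or empty. It is only a sketch under assumptions: the manifest file name (`manifest.yaml`), its location at the top of the connector folder, and the `source-internal-` naming prefix are hypothetical conventions chosen for illustration, not something described in this thread.

```python
from pathlib import Path

# Assumption: the exported Builder manifest is committed under this file name
# at the top level of each connector folder.
MANIFEST_NAME = "manifest.yaml"


def connectors_missing_manifest(connectors_dir, prefix="source-internal-"):
    """Return names of connector folders (matching prefix) with no usable manifest.

    `prefix` is a hypothetical naming convention used to separate internal
    connectors from the upstream ones that also live in the fork.
    """
    missing = []
    for folder in sorted(Path(connectors_dir).iterdir()):
        if not folder.is_dir() or not folder.name.startswith(prefix):
            continue
        manifest = folder / MANIFEST_NAME
        # Treat an absent or zero-byte manifest as a failed export/commit.
        if not manifest.is_file() or manifest.stat().st_size == 0:
            missing.append(folder.name)
    return missing
```

A pipeline step could call `connectors_missing_manifest("airbyte-integrations/connectors")` and fail the build when the returned list is non-empty, before any Docker image is built.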

<@U07L03FCPA5> that aligns with a lot of my thoughts, i might have some more questions for you later :sunglasses: