Using Airbyte for API calls and data synchronization

Summary

We are evaluating Airbyte for two use cases: bulk data synchronization to a data lake, and making API calls on behalf of customers. The question is whether Airbyte can assume a customer’s credentials to make API calls to third-party systems.


Question

Hello! :wave: I’ve been following this project for a few years and I’ve been really impressed with the work. Thanks so much!

Moving this question from #ask-ai to be sure that we are not missing something. We are evaluating Airbyte. We have two primary use cases:

  1. The ability to bulk synchronize customer data to a data lake, which Airbyte clearly supports.
  2. Make API calls on behalf of customers against the authorized third-party service.

Consider a Jira integration as an example:

  1. Bulk synchronize all issues into a warehouse
  2. Create an issue inside a project

Is it possible to assume the customer’s credentials and make API calls to third-party systems, or make proxy API calls to third-party systems, using Airbyte?

Full context with the Kapa answer: https://airbytehq.slack.com/archives/C01AHCD885S/p1715024669698129



This topic has been created from a Slack thread to give it more visibility.

["airbyte", "api", "data-synchronization", "customer-credentials", "third-party-systems"]

Additional note - We’d be open to building our own authorization flows and passing credentials to Airbyte for bulk synchronization, if that’s possible too.

It seems like this may be possible by manually creating “sources” using this API? :thinking_face:

https://reference.airbyte.com/reference/createsource

And in case it wasn’t clear from that write-up, you can absolutely pass your own auth into the Airbyte connection over the API. We create/configure the source, create/configure the connection, and run the syncs all over the API.

The Airbyte webapp UI is basically just a wrapper for their API, so if you can do it through the UI, you should be able to do it through the API.
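To make the "create the source, create the connection, run the sync" flow a bit more concrete, here is a minimal sketch of the three requests. It assumes the public Airbyte API at https://api.airbyte.com/v1 with bearer-token auth and the /sources, /connections, and /jobs endpoints; self-hosted deployments use a different base URL. The helper names and the Jira configuration keys in the usage note are illustrative assumptions, so check them against the API reference and the connector spec before relying on them.

```python
import json
import urllib.request

# Assumption: Airbyte Cloud base URL; a self-hosted instance will differ.
AIRBYTE_API = "https://api.airbyte.com/v1"


def build_request(path, payload, token):
    """Build (but do not send) a JSON POST request to the Airbyte API."""
    return urllib.request.Request(
        url=f"{AIRBYTE_API}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def create_source_request(token, workspace_id, name, configuration):
    """Step 1: register a source that carries the customer's credentials."""
    payload = {
        "name": name,
        "workspaceId": workspace_id,
        "configuration": configuration,
    }
    return build_request("/sources", payload, token)


def create_connection_request(token, source_id, destination_id):
    """Step 2: connect that source to an existing destination."""
    payload = {"sourceId": source_id, "destinationId": destination_id}
    return build_request("/connections", payload, token)


def trigger_sync_request(token, connection_id):
    """Step 3: start a sync job on demand instead of using a schedule."""
    payload = {"connectionId": connection_id, "jobType": "sync"}
    return build_request("/jobs", payload, token)


def send(request):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Usage would look roughly like `send(create_source_request(token, workspace_id, "jira-acme", {"sourceType": "jira", ...}))`, then feeding the returned source ID into `create_connection_request`, and finally calling `trigger_sync_request` whenever your app decides a sync should run. The exact keys inside `configuration` depend on the connector.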

This was super helpful. You answered my question and more! Thanks so much!

The doc you linked is just a way to create a source from an available connector—but sources are for read, not write.

Airbyte is really an ETL platform, whereas you’re talking about a piece that’s really Reverse ETL. While I know they have some plans to support this in the future, it isn’t supported today.

You could potentially use this “create” logic inside a custom connector (using CDK code, not low-code/builder), but it’s obviously not really part of Airbyte at that point as much as a bolt-on to the code you’re running.

The way we manage this is that we authorize our own SaaS app, then pass those credentials to Airbyte for ETL. For Reverse ETL, we pass the same credentials to Census (https://www.getcensus.com/reverse-etl). In both cases those tools are completely transparent to the end user. There are a couple of cases where we just use our own in-system code as well, which is probably how I’d handle this case, since you aren’t really syncing data back so much as creating a single issue (I’m assuming with job details or similar). We don’t even use the built-in scheduling in Airbyte; everything is kicked off by our app based on other events, so Airbyte and Census basically become simple task workers for their respective use cases.
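For the "own in-system code" route, creating a single Jira issue is one HTTP call outside Airbyte entirely. The sketch below is not our production code. It assumes Jira Cloud's REST API v3 (POST /rest/api/3/issue) with basic auth using an email and API token; if you authorize customers through OAuth instead, the base URL and Authorization header would differ. The function name and default issue type are made up for illustration.

```python
import base64
import json
import urllib.request


def build_create_issue_request(site_url, email, api_token, project_key,
                               summary, issue_type="Task"):
    """Build (but do not send) a request that creates one Jira issue.

    Assumption: basic auth with email + API token against a Jira Cloud site.
    """
    credentials = base64.b64encode(
        f"{email}:{api_token}".encode("utf-8")
    ).decode("ascii")
    payload = {
        "fields": {
            "project": {"key": project_key},
            "summary": summary,
            "issuetype": {"name": issue_type},
        }
    }
    return urllib.request.Request(
        url=f"{site_url.rstrip('/')}/rest/api/3/issue",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def create_issue(request):
    """Send the prepared request and return Jira's JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

The same stored credentials that you hand to Airbyte for the bulk sync can be reused here, which keeps the write path small and leaves Airbyte responsible only for reads.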

Again, not the only way to solve the problem, but one that has worked well for us and may be another option for you.