Data Transformation for SaaS App with ETL Tool

Summary

Seeking guidance on transforming and unifying data from various sources in a SaaS app using ETL tools like Airbyte for a Postgres → Postgres transformation.


Question

Apologies if this is the wrong channel for this, but seeking some advice/guidance.

I have a SaaS app that pulls data from various sources (Stripe/Shopify/custom PSPs), and I am looking to unify the various tables that are already in my Postgres into a shared schema. I am using an ETL tool called Fiber to pull the data from the Stripe/Shopify endpoints into my local Postgres DB, but I still need to transform the data so it can be shared across the different tools (my customers need the ability to add their own data to a unified table).

I am a web developer, not a data engineer, and have been tasked with figuring out how to do this data transformation.

In Airbyte parlance, I believe I need to do a Postgres → Postgres transformation, where both the source and the destination are the same Postgres URI. We have ~100M rows of order objects, with ~10M writes per month, so I believe we cannot really manage this data transformation without involving some third-party tool.

The end use case will be merchants hitting our API directly to pull their data out, and we will also have to expose this data to other vendors, so the data needs to live in Postgres (it’s not being used for data analytics/business insights, which I believe is the main use case of ETL tools).

Am I way overthinking this, and is there a simple way to perform and manage this data transformation, or am I on the correct path in looking to use something like Airbyte?

I am trying to set this up in an afternoon, but it seems like I need to learn a new programming framework (dbt Cloud) to accomplish this?
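
To make the Postgres → Postgres transformation described above more concrete, here is a minimal sketch of what "unify source tables into a shared schema inside the same database" could look like as plain SQL driven from Python. Everything named here is an assumption for illustration: the tables `stripe_charges`, `shopify_orders`, and `unified_orders`, their columns, and the sample rows are hypothetical, and SQLite (from the Python standard library) stands in for Postgres so the sketch can run anywhere. It is not how Airbyte or dbt perform the transformation.

```python
import sqlite3

# Hypothetical source tables (as an ETL tool might land them) and a
# hypothetical shared schema. SQLite stands in for Postgres here.
SCHEMA = """
CREATE TABLE stripe_charges (id TEXT PRIMARY KEY, amount INTEGER, currency TEXT);
CREATE TABLE shopify_orders (id INTEGER PRIMARY KEY, total_price TEXT, currency TEXT);
CREATE TABLE unified_orders (
    source       TEXT    NOT NULL,
    source_id    TEXT    NOT NULL,
    amount_cents INTEGER NOT NULL,
    currency     TEXT    NOT NULL,
    PRIMARY KEY (source, source_id)
);
"""

# One INSERT ... SELECT per source, each mapping source columns onto the
# shared schema. The upsert keeps re-runs idempotent. "WHERE true" is only
# needed by SQLite's parser when combining SELECT with ON CONFLICT.
TRANSFORMS = [
    """
    INSERT INTO unified_orders (source, source_id, amount_cents, currency)
    SELECT 'stripe', id, amount, upper(currency)
    FROM stripe_charges WHERE true
    ON CONFLICT (source, source_id) DO UPDATE SET
        amount_cents = excluded.amount_cents,
        currency = excluded.currency
    """,
    """
    INSERT INTO unified_orders (source, source_id, amount_cents, currency)
    SELECT 'shopify', CAST(id AS TEXT),
           CAST(round(CAST(total_price AS REAL) * 100) AS INTEGER),
           upper(currency)
    FROM shopify_orders WHERE true
    ON CONFLICT (source, source_id) DO UPDATE SET
        amount_cents = excluded.amount_cents,
        currency = excluded.currency
    """,
]


def unify(conn):
    """Run every source-to-unified mapping, then commit."""
    for statement in TRANSFORMS:
        conn.execute(statement)
    conn.commit()


def demo():
    """Load two made-up rows, run the transformation twice, return the result."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO stripe_charges VALUES ('ch_1', 1250, 'usd')")
    conn.execute("INSERT INTO shopify_orders VALUES (1001, '19.99', 'USD')")
    unify(conn)
    unify(conn)  # second run updates in place instead of duplicating rows
    return conn.execute(
        "SELECT source, source_id, amount_cents, currency "
        "FROM unified_orders ORDER BY source, source_id"
    ).fetchall()
```

The composite key `(source, source_id)` is one possible design choice: it lets rows from different providers coexist in a single table and makes the transformation safe to re-run. At ~100M rows, a real setup would likely process only new or changed rows per run rather than rescanning each source table, which this sketch does not attempt.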



This topic has been created from a Slack thread to give it more visibility.

["saas-app", "data-transformation", "etl-tool", "postgres", "airbyte", "api", "data-engineer"]