Exploding PostgreSQL jsonb field into columns

Summary

Exploring the possibility of transposing object keys inside a jsonb field into individual columns in the destination for a postgres -> postgres connection.


Question

Hi folks!

Q: Is it possible to explode a postgresql jsonb field into columns, like your Segments/Rudderstacks do?

Full context
We have a postgres -> postgres connection (latest airbyte & connector versions) with a fairly typical event table (a view that mirrors the underlying table) with a jsonb event.data column with an object schema that changes depending on the event type.

We’re attempting to ‘transpose’ the object keys inside the event.data field into individual columns in the destination. The destination is being consumed by Tableau/Power BI type tools, which don’t work with string json representations very well/at all.

We have several dozen event types emitted (and growing), so manually defining each object key -> column in the source view isn’t very scalable.

I’ve read <https://docs.airbyte.com/using-airbyte/core-concepts/typing-deduping|this page> a few times, and I’m not sure if the information is relevant for our use case? I think the now deprecated basic normalisation might be? But I don’t know how to work with it, if it is.

We’ve started pursuing a dbt transformer route, but having asked a few LLM’s they assure me that that is a dead-end, saying that dbt requires a predefined model structure, and that it doesn’t support dynamic schema evolution based on incoming data.

Any guidance anybody can give us would be massively appreciated.



This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. Click here if you want to access the original thread.

Join the conversation on Slack

["postgresql", "jsonb-field", "explode-columns", "airbyte-connector"]

Hey <@U0716PUCBSL>, I think what you’re describing would fall under the now-deprecated transformation feature. I have a similar table I’m ingesting and our plan for it is to explode the data after ingesting with Airbyte using dbt. We’re using Snowflake as our warehouse and are building models on top of the raw JSON data.

I’d be happy to be wrong, but that’s where I got to after digging for our use-case.