Reducing size of airbyte_internal schema in Redshift

Summary

Inquiring about reducing the size of the airbyte_internal schema in Redshift


Question

Hi Team!

I am currently using Airbyte to ingest data from various relational databases into a cloud warehouse (Redshift). Earlier today, I was checking the table sizes and row counts in the cluster and found that the “airbyte_internal” schema takes up more space than the other schemas.

Is there any way to reduce its size, or can the data in the airbyte_internal schema be truncated? Kindly advise.



This topic has been created from a Slack thread to give it more visibility.

["airbyte-internal-schema", "redshift", "data-ingestion", "table-size", "row-count"]

Hi hareesh. We create raw tables in the airbyte_internal schema and use them for typing and deduping into the final tables you see. Relevant docs: https://docs.airbyte.com/using-airbyte/core-concepts/typing-deduping. We keep historical data in the raw tables to enable schema evolution if the source schema changes in the future.
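
To see how much space the raw tables take compared to the final tables, something like the following could help. This is only a rough sketch, not an official Airbyte tool. It assumes you can query Redshift's SVV_TABLE_INFO system view (this may require elevated privileges) and that you fetch the rows with whatever DB-API driver you already use. The names SIZE_QUERY and summarize_by_schema are made up for illustration.

```python
from collections import defaultdict

# SVV_TABLE_INFO is a Redshift system view; its "size" column is in 1 MB blocks.
# Run this query with your own driver/cursor; connecting is out of scope here.
SIZE_QUERY = """
SELECT "schema", "table", size AS size_mb, tbl_rows
FROM svv_table_info
ORDER BY size DESC;
"""


def summarize_by_schema(rows):
    """Aggregate (schema, table, size_mb, tbl_rows) tuples per schema.

    Returns a list of (schema, {"size_mb": ..., "tbl_rows": ...}) pairs,
    largest schema first. Hypothetical helper, for illustration only.
    """
    totals = defaultdict(lambda: {"size_mb": 0, "tbl_rows": 0})
    for schema, _table, size_mb, tbl_rows in rows:
        # Treat NULLs as zero so a missing value does not break the sum.
        totals[schema]["size_mb"] += int(size_mb or 0)
        totals[schema]["tbl_rows"] += int(tbl_rows or 0)
    return sorted(totals.items(), key=lambda kv: kv[1]["size_mb"], reverse=True)
```

With a cursor from your driver, usage would be roughly cursor.execute(SIZE_QUERY) followed by summarize_by_schema(cursor.fetchall()). If airbyte_internal comes out on top, the per-table rows from the same query show which streams' raw tables are the largest.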