Data migration from MSSQL to Postgres with schema preservation issue

slack-user-airbyte · June 28, 2024, 6:11am

Summary

Issue with schema preservation during data migration from MSSQL to Postgres using Airbyte connector

Question

Hello community,

My aim: to migrate all existing data from MSSQL to Postgres while keeping the schema unchanged.

I deployed Airbyte on a Linux virtual machine with Docker Compose, following the official documentation.
Then I created a connection between the source (an MSSQL database) and the destination (a Postgres database).
Initially, both databases have the same tables and schema; the MSSQL database holds all the data and the Postgres database is empty.
Then I started the sync. All the data was copied into the Postgres database, but the schema of the Postgres tables changed. For example, I had a few columns of type ‘uuid’ that were changed to varchar after the migration. Many other columns were also changed to varchar.
I want to migrate all the data ensuring the schema in the destination is not changed at all.
Please help!!

Thanks

This topic has been created from a Slack thread to give it more visibility.

Tags: data-migration, mssql, postgres, schema-preservation, airbyte-connector

slack-user-airbyte · July 25, 2024, 6:17am

This behavior is fairly common; it is how differences between platforms are handled for certain data types.

Each Airbyte source maps its data types to an internal “Airbyte Type”, which is then mapped to a column type in the destination. So in your case you’d want to look at the data type mapping for the MSSQL source and then the data type mapping for the Postgres destination, both in the Airbyte documentation. Almost any type that isn’t explicitly listed in a source’s data type mapping (UUID in this case) will be treated as an Airbyte string type, as is noted on the source page:
> If you do not see a type in this list, assume that it is coerced into a string. We are happy to take feedback on preferred mappings.
It seems like Airbyte tries to keep a very minimal list of internal types to simplify the mappings between systems, so I would personally think it unlikely that they would introduce an internal type as specific as UUID, but they may be willing to discuss whether it should be treated as binary instead (since a UUID is ultimately just a 128-bit binary value).

In this case I would personally expect to apply some sort of downstream transformation, or work off the raw tables to produce an appropriate field type.
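As a rough illustration of such a downstream transformation, here is a minimal sketch that only builds the Postgres `ALTER TABLE ... TYPE uuid USING ...` statements you could run after a one-off migration. The helper names and the `public.orders` table with its `id` and `customer_id` columns are hypothetical, and I have not checked how Airbyte reacts to a changed column type on a later sync, so treat it as a starting point rather than a recommended procedure.

```python
def quote_ident(name: str) -> str:
    """Quote a Postgres identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def uuid_cast_statement(schema: str, table: str, column: str) -> str:
    """Build an ALTER TABLE statement that converts a text column to uuid."""
    col = quote_ident(column)
    return (
        f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} "
        f"ALTER COLUMN {col} TYPE uuid USING {col}::uuid;"
    )


# Hypothetical schema, table, and column names, for illustration only.
for column in ("id", "customer_id"):
    print(uuid_cast_statement("public", "orders", column))
```

The statements are printed rather than executed so they can be reviewed first; the cast fails if any value in the column is not a valid UUID string, which is usually what you want to find out before relying on the column type.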

While not ideal on very large tables or joins, it’s also worth noting that, as long as the column is indexed appropriately, treating UUIDs as strings is unlikely to produce a massive performance issue.
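To put a rough number on the difference between the two representations, this small sketch uses Python's standard `uuid` module to compare the text form of a UUID with its 128-bit binary form. The UUID value is an arbitrary example, and actual on-disk sizes in Postgres also include per-value overhead that this does not model.

```python
import uuid

# Arbitrary example value, for illustration only.
value = uuid.UUID("12345678-1234-5678-1234-567812345678")

text_form = str(value)      # canonical hyphenated hex string
binary_form = value.bytes   # raw 128-bit value

print(len(text_form))    # 36 characters
print(len(binary_form))  # 16 bytes

# Both forms describe the same UUID, so converting between them is lossless.
assert uuid.UUID(text_form) == uuid.UUID(bytes=binary_form)
```

So a string column carries a bit more than twice the payload per value, which matters for index size on large tables but does not change how lookups work.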
