Joining normalized tables in Snowflake with hashid

slack-user-airbyte · June 15, 2024, 6:10am

Summary

The user is trying to join normalized tables in Snowflake using hashids based on surrogate keys but is facing issues. They are unsure if they need to create a surrogate key first and map it to the subtables.

Question

Hey all, I have a few tables which are written to Snowflake using normalization so they up with one metadata table (NL) and a few subtables (from a nested jsons). I want to join these back together but lost how to do this. I understand the hashid in the subtables is based on the surrogate key of the main metadata table https://discuss.airbyte.io/t/snowflake-destination-what-is-the-hash-id-column-made-from/1935/5. Does this mean I need to create a surrogate key first and map to the subtables? e.g.
md5(cast(coalesce(cast(column1 as TEXT), '') || '-' || coalesce(cast(column2 as TEXT), '')))
In the subtables there are 2 hashids, one that is prefixed with the subtable name and one referring to the metadata table. I tried this but am unable to join them together (no rows returned). Any ideas?

This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. Click here if you want
to access the original thread.

Join the conversation on Slack

_{["snowflake", "normalized-tables", "hashid", "surrogate-key", "join"]}

slack-user-airbyte · June 23, 2024, 6:15am

Got a bit further but am still confused. I understand the surrogate key dbt commands here https://docs.airbyte.com/operator-guides/transformation-and-normalization/transformations-with-dbt but not sure whether to include the airbyte added columns or not in the hashing. Also, I have 2 hash functions, one in the main table and one in the flattened/ unnested sub tables, are these just hash ids of the columns of that specific table?

slack-user-airbyte · June 23, 2024, 6:15am

Ok, I think this might have been due since we migrated to destinations v2 so matching legacy hash ids seem to have been lost. I basically dropped and recreated and now use the nested jsons in the main export table

Topic		Replies	Views
Snowflake destination - What is the HASH ID column made from? Q&A destination-snowflake , connectors	5	1021	May 18, 2023
Snowflake normalization gives NULL tables when un-nesting JSON blobs Connector Questions & Issues destination-snowflake , normalization , connectors	4	488	September 19, 2022
Salesforce -> Snowflake. Normalization doesn't respect table name rule creating for reserved words Connector Questions & Issues source-salesforce , destination-snowflake	5	278	July 14, 2022
Destination Snowflake: Dropping technical columns in Snowflake Connector Questions & Issues destination-snowflake	2	277	September 4, 2022
Schema name not populating as expected in Snowflake Connector Questions & Issues getting-started , data-loading , connectors	4	263	July 14, 2022

Joining normalized tables in Snowflake with hashid

Summary

Question

Related topics