Joining normalized tables in Snowflake with hashid

Summary

The user is trying to join normalized tables in Snowflake using hashids based on surrogate keys but is facing issues. They are unsure if they need to create a surrogate key first and map it to the subtables.


Question

Hey all, I have a few tables which are written to Snowflake using normalization so they up with one metadata table (NL) and a few subtables (from a nested jsons). I want to join these back together but lost how to do this. I understand the hashid in the subtables is based on the surrogate key of the main metadata table https://discuss.airbyte.io/t/snowflake-destination-what-is-the-hash-id-column-made-from/1935/5. Does this mean I need to create a surrogate key first and map to the subtables? e.g.
md5(cast(coalesce(cast(column1 as TEXT), '') || '-' || coalesce(cast(column2 as TEXT), '')))
In the subtables there are 2 hashids, one that is prefixed with the subtable name and one referring to the metadata table. I tried this but am unable to join them together (no rows returned). Any ideas?



This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. Click here if you want
to access the original thread.

Join the conversation on Slack

["snowflake", "normalized-tables", "hashid", "surrogate-key", "join"]

Got a bit further but am still confused. I understand the surrogate key dbt commands here https://docs.airbyte.com/operator-guides/transformation-and-normalization/transformations-with-dbt but not sure whether to include the airbyte added columns or not in the hashing. Also, I have 2 hash functions, one in the main table and one in the flattened/ unnested sub tables, are these just hash ids of the columns of that specific table?

Ok, I think this might have been due since we migrated to destinations v2 so matching legacy hash ids seem to have been lost. I basically dropped and recreated and now use the nested jsons in the main export table