Snowflake destination - What is the HASH ID column made from?

When syncing to Snowflake, we get a column called _AIRBYTE_<TABLE_NAME>_HASHID. I’m assuming it might be MD5? How is it generated?

Hey, it is computed from the whole row. Ideally, it also serves as an identifier for that row.

Hi @harshith,

Thanks for the quick reply. I know it’s an identifier for that row. My question was more about how it is generated. Is it a random hash? I’m trying to find where in the code that string is generated for a row.

Is it built by concatenating all values and then applying md5 to the result? Can you pinpoint where in the code it’s implemented?

Hey, my bad. It’s md5. You can also read through the docs here: https://docs.airbyte.com/understanding-airbyte/basic-normalization#normalization-metadata-columns

Also, all of the normalization happens here: https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/bases/base-normalization

Hi @philippeboyd, did you find an answer to the "how" question?
Is it a random hash, or a string concatenation of all the columns in that row?
I tried to read through the normalization code, but couldn’t find it!

Hi @phucdinh, it is a string concatenation of all columns, hashed with md5 (via the surrogate_key function from the dbt_utils package).
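
To make the idea concrete, here is a minimal Python sketch of the "concatenate, then md5" approach described above. It is not the actual dbt_utils or Airbyte code: the function name `surrogate_key_sketch`, the "-" separator, and the empty-string placeholder for nulls are assumptions for illustration, so check the surrogate_key macro in the dbt_utils version you use for the exact rules.

```python
import hashlib


def surrogate_key_sketch(values, separator="-", null_placeholder=""):
    """Hypothetical helper: join the column values of one row and md5 the result.

    The separator and the null placeholder are assumptions, not confirmed
    details of the dbt_utils surrogate_key macro.
    """
    # Cast every value to text, replacing nulls with the placeholder.
    parts = [null_placeholder if v is None else str(v) for v in values]
    # Concatenate in column order and hash the UTF-8 bytes with md5.
    joined = separator.join(parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Example row; the column names and values are made up.
row = {"id": 1, "name": "alice", "email": None}
hash_id = surrogate_key_sketch(row.values())
print(hash_id)
```

Because the hash depends only on the row's values and their order, the same row always produces the same 32-character hex string, which is why the column can double as a row identifier. Note that Python's `str()` does not necessarily format values the same way a SQL cast to string does (booleans and floats, for example), so this sketch is not guaranteed to reproduce the exact hash stored in Snowflake.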