Raw Table Naming Convention in Airbyte


Explanation of raw table naming convention based on the number of consecutive underscore characters in the concatenation of target namespace and stream name.


The reason I have two raw tables is that, when using the sample SQL generator in the "Upgrading to Destinations V2" page of the Airbyte documentation (section "Additional steps for incremental sync modes"), I used a simple stream name and assumed the double underscore separator would apply to all names.
It turns out that the length of the separator is the length of the longest run of consecutive underscore characters in the simple concatenation of target(?) namespace and stream name (or alias?), plus 1. (I'm pretty sure the namespace and name have already been sanitised by that point, with non-alphanumeric characters converted to underscores.)
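For example, public.custom_salesforce_object__c has the raw table public_raw___stream_custom_salesforce_object__c (triple underscore), because the stream name contains a double underscore.
It also means namespace_._name will have a raw table with a triple underscore delimiter, since the concatenation namespace__name contains a double underscore.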
I double checked, and I still can't see this mentioned in the upgrade documentation, or the reasons for it, although the code comments in StreamId.java do explain it: https://github.com/airbytehq/airbyte/blob/7e4bf436235b3e651623e1b49b069655c10fe373/airbyte-integrations/bases/base-typing-deduping/src/main/java/io/airbyte/integrations/base/destination/typing_deduping/StreamId.java#L72
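
The rule as described above can be sketched in Python. This is only an illustration of my understanding, not Airbyte's actual code (that lives in the Java file linked above). The function name raw_table_name is made up for this example, and it assumes the namespace and stream name have already been sanitised. It also assumes the separator is never shorter than two underscores, which matches the double underscore seen for simple names.

```python
import re


def raw_table_name(namespace: str, name: str) -> str:
    """Hypothetical helper: build the raw table name from a sanitised namespace and stream name."""
    # Plain concatenation, with no separator between the two parts.
    plain_concat = namespace + name

    # Length of the longest run of consecutive underscores in the concatenation.
    runs = re.findall(r"_+", plain_concat)
    longest_run = max((len(run) for run in runs), default=0)

    # Assumption: treat names without underscores as if they had a run of one,
    # so the separator is always at least a double underscore.
    longest_run = max(longest_run, 1)

    separator = "_" * (longest_run + 1)
    return f"{namespace}_raw{separator}stream_{name}"


print(raw_table_name("public", "users"))
print(raw_table_name("public", "custom_salesforce_object__c"))
print(raw_table_name("namespace_", "_name"))
```

The second call reproduces the Salesforce example from this post. The point of choosing a separator longer than any underscore run in the names seems to be that the raw table name can always be split back into namespace and stream name without ambiguity.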

This topic has been created from a Slack thread to give it more visibility.

["raw-table", "naming-convention", "incremental-sync", "documentation", "namespace", "stream-name", "separator"]