Custom transformation for Sync mode Incremental - Deduped History do not create _scd tables

Hi, I have added custom transformation on top of the raw data to transform it.

I have choose Sync mode Incremental - Deduped History for the postgres destination connector. When the data was sync, the sync process did not create intermediate table i.e, history table which ends with _SCD.

Wanted to understand for the deduped incremental sync to work, is it mandatory to select Basic Normalization ?

Can you please suggest?

You need to keep Normalized tabular data enabled to run normalization and create the final tables. Is your dbt transformation a copy from the normalized tabular data output or a custom queries you want to run afters?

its basically copy of the normalized tabular data output. When I look at the generated model, it creates table with stream name having tranformations with CTE within the table script. I just created a dbt with these sqls without table creation, as dbt will create table with name of the model.

It looks like one has to keep or check Normalized tabular data enabled for the sync mode history to create intermediate tables.
Is my understanding right ?

If it’s a copy exported from the Airbyte normalization should works without the normalization enabled. But I didn’t understand what you change it.
Did you follow steps from this tutorial Transformations with dbt \(Part 2/3\) | Airbyte Documentation and you’re trying to run as a custom transformation?

Hey @marcosmarxm, as per the tutorial I have exported the generated normalization.

Executing deps…created dependencies in the /dbt folder
dbt deps --profiles-dir=. --project-dir=.

Executing run generated models successfully
dbt run --profiles-dir=. --project-dir=.

Created a github dbtproject with the exported normalization structure to check the customized transformation.

But it failed with the error:
2022-04-16 03:34:23 dbt > Running from /data/95/0/transform/git_repo

2022-04-16 03:34:23 dbt > detected no config file for ssh, assuming ssh is off.

2022-04-16 03:34:23 dbt > Running: dbt run --profiles-dir=/data/95/0/transform --project-dir=/data/95/0/transform/git_repo

2022-04-16 03:34:30 dbt > 03:34:30 Running with dbt=1.0.0

2022-04-16 03:34:30 dbt > 03:34:30 Encountered an error:

2022-04-16 03:34:30 dbt > Compilation Error

2022-04-16 03:34:30 dbt > dbt found 1 package(s) specified in packages.yml, but only 0 package(s) installed in /dbt. Run “dbt deps” to install package dependencies.

2022-04-16 03:34:31 INFO i.a.w.t.TemporalAttemptExecution(lambda$getWorkerThread$2):158 - Completing future exceptionally…

io.airbyte.workers.WorkerException: Dbt Transformation Failed.

at io.airbyte.workers.DbtTransformationWorker.run(DbtTransformationWorker.java:57) ~[io.airbyte-airbyte-workers-0.36.0-alpha.jar:?]

at io.airbyte.workers.DbtTransformationWorker.run(DbtTransformationWorker.java:16) ~[io.airbyte-airbyte-workers-0.36.0-alpha.jar:?]

I have tried adding additional transformation step to install deps

still the same error.
Could not find the location in docker container where the dbt deps installed and why run command could not find it ?
Can you please advise ?

Can you edit the file dbt_project.yml to:

target-path: "../build"  # directory which will store compiled SQL files
log-path: "../logs"  # directory which will store DBT logs
modules-path: "../dbt_modules"  # directory which will store external DBT dependencies

This should fix the problem.

As per your suggestion modified dbt_project.yml file, but the sync job failed with the following error:

2022-04-19 02:09:22 dbt > at path : Additional properties are not allowed (‘modules-path’ was unexpected)

2022-04-19 02:09:22 dbt >

2022-04-19 02:09:22 dbt > Error encountered in /data/111/0/transform/git_repo/dbt_project.yml

2022-04-19 02:09:22 dbt > 02:09:22 Encountered an error:

2022-04-19 02:09:22 dbt > Runtime Error

2022-04-19 02:09:22 dbt > Could not run dbt

2022-04-19 02:09:23 INFO i.a.w.t.TemporalAttemptExecution(lambda$getWorkerThread$2):158 - Completing future exceptionally…

io.airbyte.workers.WorkerException: Dbt Transformation Failed.

at io.airbyte.workers.DbtTransformationWorker.run(DbtTransformationWorker.java:57) ~[io.airbyte-airbyte-workers-0.36.0-alpha.jar:?]

at io.airbyte.workers.DbtTransformationWorker.run(DbtTransformationWorker.java:16) ~[io.airbyte-airbyte-workers-0.36.0-alpha.jar:?]

at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:155) ~[io.airbyte-airbyte-workers-0.36.0-alpha.jar:?]

at java.lang.Thread.run(Thread.java:833) [?:?]

Caused by: io.airbyte.workers.WorkerException: DBT Transformation Failed.

at io.airbyte.workers.DbtTransformationWorker.run(DbtTransformationWorker.java:54) ~[io.airbyte-airbyte-workers-0.36.0-alpha.jar:?]

… 3 more

The dbt_project.yml must be something like this:

# This file is necessary to install dbt-utils with dbt deps
# the content will be overwritten by the transform function

# Name your package! Package names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: "airbyte_utils"
version: "1.0"
config-version: 2

# This setting configures which "profile" dbt uses for this project. Profiles contain
# database connection information, and should be configured in the  ~/.dbt/profiles.yml file
profile: "normalize"

# These configurations specify where dbt should look for different types of files.
# The `source-paths` config, for example, states that source models can be found
# in the "models/" directory. You probably won't need to change these!
source-paths: ["models"]
docs-paths: ["docs"]
analysis-paths: ["analysis"]
test-paths: ["tests"]
data-paths: ["data"]
macro-paths: ["macros"]

target-path: "../build" # directory which will store compiled SQL files
log-path: "../logs" # directory which will store DBT logs
modules-path: "../dbt_modules" # <---HERE YOU MUST EDIT :)

clean-targets: # directories to be removed by `dbt clean`
  - "build"
  - "dbt_modules"

quoting:
  database: true
  # Temporarily disabling the behavior of the ExtendedNameTransformer on table/schema names, see (issue #1785)
  # all schemas should be unquoted
  schema: false
  identifier: true

# You can define configurations for models in the `source-paths` directory here.
# Using these configurations, you can enable or disable models, change how they
# are materialized, and more!
models:
  +transient: false
  airbyte_utils:
    +materialized: table
    generated:
      airbyte_ctes:
        +tags: airbyte_internal_cte
        +materialized: ephemeral
      airbyte_incremental:
        +tags: incremental_tables
        +materialized: incremental
        +on_schema_change: sync_all_columns
      airbyte_tables:
        +tags: normalized_tables
        +materialized: table
      airbyte_views:
        +tags: airbyte_internal_views
        +materialized: view

dispatch:
  - macro_namespace: dbt_utils
    search_order: ["airbyte_utils", "dbt_utils"]

I made the changes as suggested, but failed with the same error.

Attached logs for your reference.
logs-115.txt (36.6 KB)

Please check the custom transformation at
rnncredit/airbyte-trans-postgres (github.com)

Please advise

Sorry Narender for the long delay replying you.
Please take a look in my repo using custom dbt: https://github.com/marcosmarxm/airbyte-custom-normalization

I edited a few things:
https://github.com/marcosmarxm/airbyte-custom-normalization/blob/d759c8f3d1b465449d675b9e56f7210ab092ad94/dbt_project.yml#L17-L19


Let me know if need any further help