Should the catalog update when a new image of a source is used?

Recently, I was troubleshooting a dbt normalization that was failing with the following error:

2022-05-05 17:38:22 normalization > 17:38:14.751587 [error] [MainThread]: Database Error in model campaigns (models/generated/airbyte_tables/source-name/campaigns.sql)
2022-05-05 17:38:22 normalization > 17:38:14.753275 [error] [MainThread]:   invalid input syntax for type double precision: ""
2022-05-05 17:38:22 normalization > 17:38:14.754388 [error] [MainThread]:   compiled SQL at ../build/run/airbyte_utils/models/generated/airbyte_tables/source-name/campaigns.sql

I found that one of the schema JSON files declared a field as type number when it should be type string, so I updated the schema file, built a new image, pushed it to the repo, and updated the custom source in Airbyte. The error persisted, and when I checked the logs I found that the AirbyteCatalog for the campaigns stream still listed the changed field as a number rather than a string. I verified the files in the connector image reflected my change and tried retesting the source to see if I could get it to pick up the changed schema file. Nothing worked, so I set up a second instance of the source using the same image as my existing source. That new source picked up the schema change, and its transformations run without error.
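For anyone chasing the same kind of error, here is a minimal sketch of how one might spot the offending field by checking a sample record against the stream's JSON schema. The field name `budget` and the sample record are hypothetical, and the helper `fields_violating_number_type` is made up for illustration; it is not part of Airbyte.

```python
def fields_violating_number_type(json_schema: dict, record: dict) -> list:
    """Return names of fields declared as number whose value is not numeric."""
    offenders = []
    for name, spec in json_schema.get("properties", {}).items():
        declared = spec.get("type")
        # JSON Schema allows "type" to be a single string or a list of strings.
        types = declared if isinstance(declared, list) else [declared]
        if "number" not in types:
            continue
        value = record.get(name)
        if value is None:
            # Missing or null values are skipped in this sketch.
            continue
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            offenders.append(name)
    return offenders


# Hypothetical schema and record: "budget" is declared as a number,
# but the API returns an empty string for it.
schema = {
    "properties": {
        "id": {"type": "string"},
        "budget": {"type": ["null", "number"]},
    }
}
record = {"id": "c-1", "budget": ""}

print(fields_violating_number_type(schema, record))
```

An empty string in a field declared as number is the kind of value that would plausibly lead to an `invalid input syntax for type double precision: ""` error when normalization casts the column.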

(I also re-ran the sync using the originally configured source, on the same image as the recreated source, and verified that the transformation error still occurs there. So I have two sources feeding into the same destination and running the same version of the source image. They differ only in the version of the source connector with which they were initialized, yet they produce different catalogs and therefore different outcomes.)
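To confirm that two catalogs really disagree, something like the following sketch could compare the declared field types for one stream. It assumes each catalog is a dict shaped like an AirbyteCatalog (a `streams` list whose entries carry `name` and `json_schema`); the catalogs below are hypothetical stand-ins for what was copied out of the logs, and `budget` is an invented field name.

```python
def field_types(catalog: dict, stream_name: str) -> dict:
    """Map field name to declared JSON schema type for one stream."""
    for stream in catalog.get("streams", []):
        if stream.get("name") == stream_name:
            props = stream.get("json_schema", {}).get("properties", {})
            return {field: spec.get("type") for field, spec in props.items()}
    return {}


def diff_field_types(old_catalog: dict, new_catalog: dict, stream_name: str) -> dict:
    """Return {field: (old_type, new_type)} for fields whose types differ."""
    old = field_types(old_catalog, stream_name)
    new = field_types(new_catalog, stream_name)
    return {
        field: (old.get(field), new.get(field))
        for field in sorted(old.keys() | new.keys())
        if old.get(field) != new.get(field)
    }


# Hypothetical catalogs: one from the original source, one from the recreated source.
original = {
    "streams": [
        {"name": "campaigns",
         "json_schema": {"properties": {"id": {"type": "string"},
                                        "budget": {"type": "number"}}}}
    ]
}
recreated = {
    "streams": [
        {"name": "campaigns",
         "json_schema": {"properties": {"id": {"type": "string"},
                                        "budget": {"type": "string"}}}}
    ]
}

print(diff_field_types(original, recreated, "campaigns"))
```

If the catalog in your logs is a configured catalog, the stream definition may be nested one level deeper, so the lookup would need adjusting.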

Is this expected behavior, i.e. that the catalog should only reflect the state when the source was initialized? If so, is there a way to reinitialize the catalog when a change in the image needs to be adopted? If not, is there an existing issue I should watch? (I did not find one on GitHub.)

Hi @techxorcist,

“tried retesting the source to see if I could get it to pick up the changed schema file.”

What test did you run? Did you try the “refresh source schema” button on the connection configuration page?

“Is this expected behavior, i.e. that the catalog should only reflect the state when the source was initialized?”

Yes, it is. However, we are working on a schema evolution feature, as this is highly requested by the community.

We have several issues on this topic that you can follow:

Thanks @alafanechere.

I was not triggering ‘refresh source schema’ when testing the revised source connector to see if it was picking up my schema change. Selecting that option made the original source work without the normalization error.

I’ve subscribed to the schema evolution issues you linked and will stand by as work on those proceeds.

Thanks for your helpful response.