Downgrading Redshift destination connector not working

  • Is this your first time deploying Airbyte?: No
  • OS Version / Instance: AWS ubuntu - focal - 20.04 - amd64 - server
  • Memory / Disk: you can use something like 8GB
  • Deployment: Docker
  • Airbyte Version: 0.35.51-alpha
  • Step: sync

Hi, I’ve been having problems since the update to Redshift destination 0.3.32, as reported in several GitHub issues such as “All Syncs with redshift destination 0.3.32 fail rollback to version 0.3.28 resolves the issue” (Issue #12265, airbytehq/airbyte). I decided to revert to 0.3.27, which was working previously. However, the same error still shows up:
function json_extract_path_text(super, "unknown", boolean) does not exist
Then
JSON schema validation failed.

Debug info:
Airbyte Version: 0.35.51-alpha
Source: Google Sheets (0.2.9)
Destination: Redshift (0.3.27)

I think the normalization (dbt file) is not being regenerated when I downgrade or upgrade. I am not sure what to do to completely roll back the configuration.

I attach one of the logs.
logs-379.txt (153.5 KB)

Did you try resetting the data after rolling back to the previous version?

This problem can happen if normalization detected your main JSON field _airbyte_data as VARCHAR when in reality its type is SUPER.
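One way to see which type the raw table actually has is to query Redshift's SVV_COLUMNS system view. The sketch below only builds the query text; the function name, the schema `public`, the table `_airbyte_raw_tickets`, and the `REDSHIFT_URL` variable are placeholders for illustration, not names taken from this thread.

```shell
# Hypothetical helper: print a query that reports the declared type of
# the _airbyte_data column for one raw table.
raw_column_type_sql() {
    schema=$1   # schema holding the raw table (placeholder)
    table=$2    # raw table name (placeholder)
    cat <<EOF
SELECT column_name, data_type
FROM svv_columns
WHERE table_schema = '$schema'
  AND table_name = '$table'
  AND column_name = '_airbyte_data';
EOF
}

# Example (assumes psql is installed and REDSHIFT_URL is a connection URI):
#   raw_column_type_sql public _airbyte_raw_tickets | psql "$REDSHIFT_URL"
```

If the result reports a SUPER column while the normalization in use predates SUPER support, that would match the mismatch described above.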

It seems like something that should work. I modified the “Namespace Custom Format” and it works. But I do not want to erase the past data nor create another table.

I’d like to know if there is something like a config file I could delete or modify to have the same effect. The spreadsheets I’m loading don’t matter that much, but I do not want to erase the Zendesk data, whose connector has the same problem (Redshift destination).

In reality you cannot revert,
because the new destination-redshift 0.3.32 converted the _airbyte_data field of your _airbyte_raw_name table to the Redshift SUPER type (from VARCHAR).
If you downgrade destination-redshift, you will only add more mess to your system.

Can you please send not only the logs but the full contents of the airbyte_workspace:/data directory?

You can pack it this way:

docker run -ti --rm -v airbyte_workspace:/data -v `pwd`:/result ubuntu tar cfz /result/data.tar.gz /data/379

Hi,
Given this answer, I talked with my team about possible solutions. I think we are going to reset from scratch during the weekend since it’s the simplest solution.

As for the airbyte_workspace:/data directory: I can provide it if you still need it, but I would have to delete some data first, as I saw that the .tar file contains credentials and similar information.

Thanks for your answers.

Andres, if you delete only the raw table from your destination, the sync should work again. But you’re going to lose the data already synced.

If possible, it would be great to get /data to try to understand the source of this problem.
/data contains a lot of logs related to sync and normalization.
Yes, you are right: /data contains credentials in destination_config.json.
This file must be removed before compressing the archive.
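A minimal sketch of packing a job directory while skipping destination_config.json, assuming GNU tar. The function name is hypothetical, and other files in the directory may also hold secrets, so it is worth listing the archive before sharing it.

```shell
# Hypothetical helper: archive a job directory, leaving out every
# destination_config.json file (it holds destination credentials).
pack_without_credentials() {
    src_dir=$1    # directory to archive, e.g. /data/379
    out_file=$2   # path of the .tar.gz to create
    # --exclude is given before the path so that it applies to it.
    tar -czf "$out_file" --exclude='destination_config.json' "$src_dir"
}

# Inspect the result before uploading:
#   tar -tzf data.tar.gz
```

The same --exclude option can be added to the docker-based tar command shown earlier, placed before the /data/379 argument.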

One of the reasons why people complain about the destination-redshift upgrade from 0.3.29 (or lower) to 0.3.31+ may be the following:

We have 2 closely coupled components: destination-redshift, base-normalization

After we added the SUPER type, these 2 components have to be upgraded together:

destination-redshift 0.3.31+
base-normalization 0.1.77+ (airbyte platform v0.36.2-alpha+)

It means that if you upgrade destination-redshift, you must also upgrade the Airbyte platform to at least v0.36.2-alpha.
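The pairing rule above can be sketched as a small check. This is only an illustration: the function names are hypothetical, it assumes a `sort` that supports `-V` (GNU coreutils), and it compares version numbers only, so it says nothing about whether the raw table has already been converted to SUPER.

```shell
# version_ge A B: succeeds when version A is at least version B.
version_ge() {
    # sort -V orders version strings component by component; if B comes
    # first (or both are equal), then A >= B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_redshift_pairing DEST_VERSION NORMALIZATION_VERSION
# Flags the case from this thread: destination-redshift 0.3.31+ running
# with a base-normalization older than 0.1.77.
check_redshift_pairing() {
    dest=$1
    norm=$2
    if version_ge "$dest" 0.3.31 && ! version_ge "$norm" 0.1.77; then
        echo "mismatch: destination-redshift $dest needs base-normalization 0.1.77+"
        return 1
    fi
    echo "ok"
}

# Example: check_redshift_pairing 0.3.32 0.1.69   (reports a mismatch)
```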

Thanks. I’ll talk with the devops team to upgrade the platform.

About the archive: it seems I can’t upload a zip file here. Do you have an email address I can send it to?

Can you please upload it to this GitHub issue?

Please remove sensitive data from the archive.

Done.

Thanks a lot

I have looked at the logs and I see the following versions:

airbyte/destination-redshift:0.3.28
airbyte/normalization:0.1.69

These are old versions (which do not support SUPER), but you have already converted the raw table to the SUPER type.

Anyway, just try upgrading the platform and airbyte/destination-redshift to the latest version and run the sync again. If you encounter the problem again, please upload the data folder to GitHub for investigation.

Thank you