CSV Files source special settings needed to sync to MS SQL

  • Is this your first time deploying Airbyte?: Yes :+1:
  • OS Version / Instance: Almalinux
  • Memory / Disk: 4Gb / 20 GB
  • Deployment: Docker
  • Airbyte Version: v0.39.30-alpha
  • Source name/version: Files CSV
  • Destination name/version: MS SQL 2016 on Windows
  • Step: sync
  • Description: syncing does not work and throws various errors - depending on my setting for the CSV file which is: https://opendata.dwd.de/climate_environment/CDC/regional_averages_DE/annual/sunshine_duration/regional_averages_sd_year.txt .

So of course I went to read_csv in order to set render options, but I did not succeed eventually…
There are at least those problems with the file:

  • first line is a comment - there is "skiprows": "1" for this
  • second column has the same name as the first column - so I use "usecols":"[0,2,3,4,5, ...]"
  • selecting normalization throws an error from MS SQL

What exactly would I need to set as settings in order to be able to sync this open data CSV table to the database?

Best Regards

(upgraded to 0.39.32-alpha)
This seems also to be something that makes run basic normalization with SQL Server into problems - log file attached.
To me, it seems that although I set { "sep" : ";", "skiprows" : "1", "skipinitialspace":"true", "usecols": [0,3,4,5], "engine": "python"} in reader options for CSV, the column ‘year’ still makes it run into problems. This also happens when setting "usecols": [3,4,5] which should omit this column… :woman_shrugging:
Column 'year' has a data type that cannot participate in a columnstore index. Omit column 'year'. (35343).

But what is also not really running smooth is the tmp tables column naming: those are named according to the first data line, starting with 1952:

Maybe / is not allowed in a column name - that would make sense, but it could simply skip invalid characters there…?
logs-35.txt (197.1 KB)

Hey to understand this more is the basic normalisation failing for this sync? If so could you share the logs here?

Also on the other hand could you check/post here one record from the raw_table

Hi, it seems to be a combination of problems not getting data and failing basic normalization:

  • both tables _airbyte_raw_airbyte_ELT_sunshine_duration_de_annual and airbyte_elt_sunshine_duration_de_annual__dbt_tmp don’t contain data.
  • basic normalization log is included in the log I posted, example at line 1323 and the following lines.
  • here is an example of the first three lines of data from the source (it is publicly accessible, so anybody could test this):

Zeitreihen fuer Gebietsmittel fuer Bundeslaender und Kombinationen von Bundeslaender, erstellt am: 20220702 Jahr;Jahr;Brandenburg/Berlin;Brandenburg;Baden-Wuerttemberg;Bayern;Hessen;Mecklenburg-Vorpommern;Niedersachsen;Niedersachsen/Hamburg/Bremen;Nordrhein-Westfalen;Rheinland-Pfalz;Schleswig-Holstein;Saarland;Sachsen;Sachsen-Anhalt;Thueringen/Sachsen-Anhalt;Thueringen;Deutschland; 1951;year; 1892.6; 1893.9; 1711.6; 1712.5; 1624.3; 1904.0; 1643.4; 1645.9; 1567.6; 1650.8; 1738.5; 1751.1; 1842.9; 1685.9; 1662.0; 1632.0; 1711.7; 1952;year; 1653.4; 1654.0; 1712.1; 1577.7; 1517.5; 1714.3; 1484.9; 1486.5; 1432.2; 1589.8; 1613.2; 1746.0; 1607.6; 1562.6; 1526.4; 1480.8; 1576.7; 1953;year; 1919.9; 1919.1; 1906.9; 1889.9; 1765.0; 1876.9; 1609.5; 1610.1; 1562.3; 1716.7; 1694.1; 1917.5; 1892.9; 1740.7; 1741.2; 1741.9; 1783.9;

By the way, I think I had failing normalization with this kind of error message in another case also with MS SQL.

Best Regards

this should help with details about failing normalization:
I set as source the same MS SQL DB as the destination and selected a tiny table to sync. The result is that this fails also, see logs attached, example
The statement failed. Column 'column1' has a data type that cannot participate in a columnstore index. Omit column 'column1'. (35343)

Maybe it has to do with the old MS SQL 2016 that I am using?

logs-47.txt (278.1 KB)

Hey here I have created an issue Destination MySQL: failing with cannot participate in a columnstore index · Issue #14493 · airbytehq/airbyte · GitHub around this so that team can look into it

1 Like

ah, sorry to bother, but it’s not MySQL, it is indeed MS SQL in this case :innocent:

Hi there from the Community Assistance team.
We’re letting you know about an issue we discovered with the back-end process we use to handle topics and responses on the forum. If you experienced a situation where you posted the last message in a topic that did not receive any further replies, please open a new topic to continue the discussion. In addition, if you’re having a problem and find a closed topic on the subject, go ahead and open a new topic on it and we’ll follow up with you. We apologize for the inconvenience, and appreciate your willingness to work with us to provide a supportive community.