Hello Team,
I’m creating a custom Python source connector. It’s not an open-spec API, so I had to manually create the nested JSON schema for the source.
All the stages necessary for creating a custom Python connector have passed, i.e., spec, check, discover, and read.
When I run the pipeline, it fetches the records from the API, but the schema changes at the destination, i.e., BigQuery.
For example, consider the following column in an object:

"ABC": {
  "type": ["null", "number"]
}

The actual value of ABC is "123.0", so it arrives as a string value, and the BigQuery destination throws an error stating: can not convert number_value: "123.0" to integer value: 123.0.
I don’t want the datatype of the schema to change at the destination, but it is still doing that; I want it to mirror both the structure and the datatypes at the destination.
I have tried changing the datatype from number to string and from number to integer, and I have also tried using the below structure, but the error remains the same.
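One option, instead of fighting the schema, is to normalize the value in the connector before emitting the record. A minimal sketch, assuming the API serializes numerics as strings like "123.0" (the helper name `coerce_number` is my own, not part of the Airbyte CDK):

```python
from typing import Any, Optional

def coerce_number(value: Any) -> Optional[float]:
    """Coerce API values like "123.0" (a string) to a real JSON number.

    Returns None for null/empty values so a ["null", "number"] schema
    still validates.
    """
    if value is None or value == "":
        return None
    if isinstance(value, (int, float)):
        return float(value)
    # The API serializes numerics as strings, e.g. "123.0"
    return float(value)

record = {"ABC": "123.0"}
record["ABC"] = coerce_number(record["ABC"])
print(record)  # {'ABC': 123.0}
```

With the value emitted as an actual number, the declared ["null", "number"] type and the data agree, so the destination has nothing to re-infer.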
Hey, the values look good to me. Ideally this can happen if the schema is configured as number while the value being passed is a string; BigQuery basically takes the type from the schema. It would be great if you could share the logs of the sync.
Hey,
When I ran the pipeline, it gave me a status of Success. The following are the logs and the BigQuery error that I got when I ran the pipelines.
These are from the BigQuery (denormalized) destination: logs.txt (16.2 KB), Bigquery-error.txt (498 Bytes)
I have one more log from using GCS staging, which gave me this error: logs_gcs_staging.txt (1.8 KB)
Hey, looks like in the hour_rate field the value is a string and BigQuery is not able to convert it to a number. Can you change the schema, removing "airbyte_type": "number" for hour_rate and letting it stay in string format? Can you check if this works?
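For reference, a sketch of that change as Python dicts (the surrounding schema is assumed; only the field and property names come from this thread):

```python
# Before: the "airbyte_type": "number" annotation makes the destination
# try to parse the string value as a number, which fails for "123.0"-style data.
hour_rate_before = {
    "type": ["null", "string"],
    "airbyte_type": "number",
}

# After: leave hour_rate as a plain string so the value passes through unchanged.
hour_rate_after = {
    "type": ["null", "string"],
}
```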
I have tried changing the datatype from number to string and from number to integer, and I have also tried using the below structure, but the error remains the same.
If you change data types and keep having the same problem, probably when you rebuilt the Docker container it didn’t apply the changes and used the cached version. Could you try docker rmi your-source to make sure it will build with the schema you want? Sometimes this happens to me too.
As Harshith said, it’s probably better to let this field be a string for now, and that should get the data into BigQuery. After that you can apply other changes, but let’s focus on one step at a time.
Hey @harshith,
The error for the string got resolved, but I have a column named timestamps which returns an empty/null array.
So the schema for that should ideally be:

"timestamps": {
  "type": ["null", "array"]
}
This throws an error on the BigQuery destination.
I have attached a screenshot of the error.
Any help would be appreciated.
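Without seeing the screenshot I can only guess, but array fields commonly need an items definition so the destination knows the element type; an empty/null array gives the typer nothing to infer from. A sketch of the schema as a Python dict, assuming string elements (the item type here is my assumption, not from the thread):

```python
# Declaring "items" tells the destination what the array elements are,
# even when a given record's array is empty or null.
timestamps_schema = {
    "timestamps": {
        "type": ["null", "array"],
        "items": {"type": ["null", "string"]},
    }
}
```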