REST API to BigQuery

Hello Team,
I’m creating a custom Python source connector. It’s not an open-spec API, so I had to manually create the nested JSON schema for the source.

All the stages necessary for creating a Python custom connector have passed, i.e., spec, check, discover, and read.

When I run the pipeline, it fetches the records from the API, but the schema changes at the destination, i.e., BigQuery.
For example, consider the following column in an object:

"ABC": {
  "type": ["null", "number"]
}

The actual value of ABC is "123.0", so it is a string value, and the BigQuery destination throws an error stating "can not convert number_value: "123.0" to integer value: 123.0".
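
For illustration, this is roughly what the record looks like as the API returns it (a sketch built only from the example above; just the ABC field is shown):

{
  "ABC": "123.0"
}

The value arrives as a JSON string while the schema declares a number, which is the mismatch the destination rejects.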

I don’t want the data types in the schema to change at the destination, but that is still happening; I want the destination to mirror both the structure and the data types.

I have tried changing the data type from number to string and from number to integer, and I have also tried using the structure below, but the error remains the same.

Hey, can you share the schema and one record over here?

Hello Harshith,
I have attached the schema and one record entry; please let me know if you need anything else!

events_schema.txt (8.0 KB)
timely-entry.txt (3.3 KB)

Hey, the values looked good to me. Typically this happens when the schema is configured as number but the value being passed is a string; BigQuery takes the type from the schema. It would be great if you could share the logs of the sync.

Hey,
When I ran the pipeline, it reported the status as Success; the following are the logs and the BigQuery error I got from the runs.
This was using BigQuery (denormalized).
logs.txt (16.2 KB)
Bigquery-error.txt (498 Bytes)

I have one more log from a run using GCS staging, which gave this error.
logs_gcs_staging.txt (1.8 KB)

Let me know if you need anything else!

Hey Harshith,
Any update on this? @harshith

Hey, it looks like the value in the hour_rate field is a string and it cannot be converted to a number. Can you change the schema: remove "airbyte_type":"number" for hour_rate and let it stay in string format only. Can you check if this works?
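
Something like this, as a sketch (the "before" here is only my guess at what your schema currently contains for that field):

"hour_rate": {
  "type": ["null", "string"],
  "airbyte_type": "number"
}

changed to

"hour_rate": {
  "type": ["null", "string"]
}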

Hey @harshith,
The error is the same.
It didn’t work; the status is a success, but it still fails at the destination connector!

Hey @harshith,
I’m out of ideas right now on how to get around this error; any ideas would be appreciated!

I have tried changing the data type from number to string and from number to integer, and I have also tried using the structure below, but the error remains the same.

If you change the data types and keep having the same problem, it is probably because the rebuilt Docker image didn’t apply your changes and a cached version was used. Could you try docker image rm on your source image (e.g. airbyte/source-name:dev) to make sure it rebuilds with the schema you want? Sometimes this happens to me too.

As Harshith said, it is probably better to leave this field as a string for now, and it should work for sending data to BigQuery. Afterwards you can apply other changes, but let’s focus on one step at a time :innocent:

    "hour_rate": {
      "type": ["null", "string"]
    },

Hey @marcosmarxm,
The error is still the same.

Hey, so that we can understand the schema, can you run:

docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-name:dev discover --config /secrets/config.json
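
The output is the catalog Airbyte builds from your schema; trimmed down, it looks roughly like this (the stream and field names here are placeholders, not taken from your connector):

{
  "type": "CATALOG",
  "catalog": {
    "streams": [
      {
        "name": "events",
        "json_schema": {
          "type": "object",
          "properties": {
            "hour_rate": { "type": ["null", "string"] }
          }
        },
        "supported_sync_modes": ["full_refresh"]
      }
    ]
  }
}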

Hey @harshith,
The error for the string got solved, but I have a column named timestamps which returns an empty/null array.
So the schema for that should ideally be:

"timestamps": {
  "type": ["null", "array"]
}

This throws an error on the BigQuery destination.
I have attached the screenshot of the error.
Screen Shot 2022-05-18 at 1.04.04 PM
Any help would be appreciated.

Have you given the type of the array? Is it an array of strings or an array of objects (if it is an array of objects, give the structure of the object)?
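
For example, if it is an array of strings, the declaration would look something like this (the string item type is only an assumption about your data):

"timestamps": {
  "type": ["null", "array"],
  "items": {
    "type": ["null", "string"]
  }
}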

Hey,
I have tried changing the data type from array to object, but there is always an error. It’s an empty array.

Hey, is there an error in the Airbyte logs while you are running this sync? If so, can you share the logs with us?

Also, to confirm that it works, you can remove the field and see if the sync succeeds.

Hey,
I have attached the log files.

logs-69.txt (81.4 KB)
logs-64.txt (81.9 KB)

Hey, I would suggest these two things:

  1. Can you try with a different database and see if that works?
  2. Is it possible to share the schema (as a file) along with one sample row (as a file)?

Hey @harshith,
It worked when I’m not using GCS staging; when I use GCS staging, I get this error.
Can you guide me with this?
logs-118.txt (79.1 KB)

Hello @harshith,
the connector is working fine. Can you guide me on how I can contribute this custom connector to Airbyte?