Source Google Sheets - Connection to Google Sheets fails when google sheets has a Data Connector to Big Query

  • Is this your first time deploying Airbyte?: Yes
  • OS Version / Instance: Linux on AWS EC2. 2 vCPU.
  • Memory / Disk: 8 GBs memory
  • Deployment: Docker
  • Airbyte Version: 0.40.0-alpha
  • Source name/version: Google Sheets / 0.2.17
  • Destination name/version: BigQuery / 1.1.16
  • Step: An old source, while loading and while retesting the connection
  • Description:
    We had this running sync from Google Sheets to BigQuery working for around 15 days, and it was an every 24 hours sync. Two days ago, inside this Google Sheets, a user added a “Data Connector”, which connects to BigQuery, and returns the results in a new tab.
    The source load now started to fail because of this new tab, even though we are not even selecting it.
    Plus, when I try to “retest” the connection to Google Sheets source, I have the same error as when in the sync.

Now I wonder if there is a workaround to this, without having to impact the user’s work and removing this new tab.

Screenshot of the “retest source”:

Full log:

2022-09-15 07:49:49 e[32mINFOe[m i.a.v.j.JsonSchemaValidator(test):71 - JSON schema validation failed. 
errors: $.client_id: is missing but it is required, $.client_secret: is missing but it is required, $.refresh_token: is missing but it is required, $.auth_type: must be a constant value Client
2022-09-15 07:49:49 e[32mINFOe[m i.a.v.j.JsonSchemaValidator(test):71 - JSON schema validation failed. 
errors: $.service_account_info: object found, string expected
2022-09-15 07:49:49 e[32mINFOe[m i.a.v.j.JsonSchemaValidator(test):71 - JSON schema validation failed. 
errors: $.method: must be a constant value Standard
2022-09-15 07:49:49 e[32mINFOe[m i.a.v.j.JsonSchemaValidator(test):71 - JSON schema validation failed. 
errors: $.credential.hmac_key_access_id: object found, string expected, $.credential.hmac_key_secret: object found, string expected
2022-09-15 07:49:49 e[32mINFOe[m i.a.w.t.TemporalAttemptExecution(get):105 - Docker volume job log path: /tmp/workspace/326/0/logs.log
2022-09-15 07:49:49 e[32mINFOe[m i.a.w.t.TemporalAttemptExecution(get):110 - Executing worker wrapper. Airbyte version: 0.40.0-alpha
2022-09-15 07:49:49 e[32mINFOe[m i.a.c.i.LineGobbler(voidCall):83 - Checking if airbyte/source-google-sheets:0.2.17 exists...
2022-09-15 07:49:49 e[32mINFOe[m i.a.c.i.LineGobbler(voidCall):83 - airbyte/source-google-sheets:0.2.17 was found locally.
2022-09-15 07:49:49 e[32mINFOe[m i.a.w.p.DockerProcessFactory(create):108 - Creating docker job ID: 326
2022-09-15 07:49:49 e[32mINFOe[m i.a.w.p.DockerProcessFactory(create):163 - Preparing command: docker run --rm --init -i -w /data/326/0 --log-driver none --name source-google-sheets-check-326-0-fdhqt --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e DEPLOYMENT_MODE=OSS -e USE_STREAM_CAPABLE_STATE=true -e AIRBYTE_ROLE= -e WORKER_ENVIRONMENT=DOCKER -e WORKER_JOB_ATTEMPT=0 -e WORKER_CONNECTOR_IMAGE=airbyte/source-google-sheets:0.2.17 -e AIRBYTE_VERSION=0.40.0-alpha -e WORKER_JOB_ID=326 airbyte/source-google-sheets:0.2.17 check --config source_config.json
2022-09-15 07:49:57 e[1;31mERRORe[m i.a.w.i.DefaultAirbyteStreamFactory(internalLog):95 - Giving up get(...) after 1 tries (googleapiclient.errors.HttpError: <HttpError 400 when requesting https://sheets.googleapis.com/v4/spreadsheets/1I4CfINSAHY8F7RWahjBy1pyrqase4pJPWWZEA8LnAm8?includeGridData=true&ranges=Connected+sheet+2%211%3A1&alt=json returned "Unable to parse range: Connected sheet 2!1:1". Details: "Unable to parse range: Connected sheet 2!1:1">)
2022-09-15 07:49:57 e[1;31mERRORe[m i.a.w.i.DefaultAirbyteStreamFactory(internalLog):95 - <HttpError 400 when requesting https://sheets.googleapis.com/v4/spreadsheets/1I4CfINSAHY8F7RWahjBy1pyrqase4pJPWWZEA8LnAm8?includeGridData=true&ranges=Connected+sheet+2%211%3A1&alt=json returned "Unable to parse range: Connected sheet 2!1:1". Details: "Unable to parse range: Connected sheet 2!1:1">
2022-09-15 07:49:57 e[1;31mERRORe[m i.a.w.i.DefaultAirbyteStreamFactory(internalLog):95 - Check failed
2022-09-15 07:49:57 e[32mINFOe[m i.a.w.t.TemporalAttemptExecution(get):131 - Stopping cancellation check scheduling...
2022-09-15 07:49:57 e[32mINFOe[m i.a.v.j.JsonSchemaValidator(test):71 - JSON schema validation failed. 
errors: $.client_id: is missing but it is required, $.client_secret: is missing but it is required, $.refresh_token: is missing but it is required, $.auth_type: must be a constant value Client
2022-09-15 07:49:57 e[32mINFOe[m i.a.v.j.JsonSchemaValidator(test):71 - JSON schema validation failed. 
errors: $.service_account_info: object found, string expected
2022-09-15 07:49:57 e[32mINFOe[m i.a.v.j.JsonSchemaValidator(test):71 - JSON schema validation failed. 
errors: $.method: must be a constant value Standard
2022-09-15 07:49:57 e[32mINFOe[m i.a.v.j.JsonSchemaValidator(test):71 - JSON schema validation failed. 
errors: $.credential.hmac_key_access_id: object found, string expected, $.credential.hmac_key_secret: object found, string expected
2022-09-15 07:49:57 e[32mINFOe[m i.a.c.f.EnvVariableFeatureFlags(autoDisablesFailingConnections):17 - Auto Disable Failing Connections: false

Hello there! You are receiving this message because none of your fellow community members has stepped in to respond to your topic post. (If you are a community member and you are reading this response, feel free to jump in if you have the answer!) As a result, the Community Assistance Team has been made aware of this topic and will be investigating and responding as quickly as possible.
Some important considerations that will help your to get your issue solved faster:

  • It is best to use our topic creation template; if you haven’t yet, we recommend posting a followup with the requested information. With that information the team will be able to more quickly search for similar issues with connectors and the platform and troubleshoot more quickly your specific question or problem.
  • Make sure to upload the complete log file; a common investigation roadblock is that sometimes the error for the issue happens well before the problem is surfaced to the user, and so having the tail of the log is less useful than having the whole log to scan through.
  • Be as descriptive and specific as possible; when investigating it is extremely valuable to know what steps were taken to encounter the issue, what version of connector / platform / Java / Python / docker / k8s was used, etc. The more context supplied, the quicker the investigation can start on your topic and the faster we can drive towards an answer.
  • We in the Community Assistance Team are glad you’ve made yourself part of our community, and we’ll do our best to answer your questions and resolve the problems as quickly as possible. Expect to hear from a specific team member as soon as possible.

Thank you for your time and attention.
Best,
The Community Assistance Team

Hi @patricia.goldberg,

Thanks for your post and welcome to the community! After reading through our documentation around the Google Sheets connector, my guess is that the inclusion of the new tab is causing the sync to fail (something you probably already suspect.)

The Google Sheets connector treats each tab as a separate stream and I’m guessing that it’s not able to parse the new tab that your user created. It looks like it’s failing trying to read “Connected sheet 2”, is that the name of a sheet or a tab?

The simplest solution might be to just move the output data from BigQuery to a separate spreadsheet so that it’s not part of the daily sync. Other than that, we’ll have to investigate why it’s not able to parse the rows of the new tab. Could you perhaps share a couple of rows to get a sense of what the data looks like?

Hope this answer helps!

Yes, that was my suspicion indeed. And yes, the name of the tab is “Connected sheet 2”.

And the output of the data (what I see on Google sheets at least), it’s just like a normal table:

id|name|address|
|A789|Test|test|
|A800|TEST|Teststr.|
|T001|Testmarkt 1|St.-Martin-Str. 72, 81541 München|
|T002|Testmarkt 2|Birkenleiten 43, 81543 München|

Hey @patricia.goldberg,

Is the sync successful if you move the output data from BigQuery to another spreadsheet? Furthermore, I escalated the issue to our github repo so please feel free to follow there for any updates from the engineering team relating to this issue: https://github.com/airbytehq/airbyte/issues/17100

Hey!
Yes, it is. Thank you for creating the issue.