Summary
The user is looking for a workaround to avoid granting the bigquery.datasets.create permission to the service account used for syncing data with BigQuery. They have already created the datasets manually and are seeking a way to bypass the permission requirement for the ‘connection check’. The user has ensured that the BigQuery service account has the BigQuery User and BigQuery Data Editor roles, and has created an airbyte_internal dataset.
Question
Hi everyone,
I wanted to ask if there is any workaround for not granting the bigquery.datasets.create permission to the service account that the BigQuery destination uses to sync data (the datasets it would be creating have already been created manually). I do not wish to grant create/delete permissions just for a “connection check”:
"message": "Access Denied: Project xxxx: User does not have bigquery.datasets.create permission in project xxxx.",
And the documentation nowhere mentions the requirement for bigquery.datasets.create:
• Make sure the BigQuery service account has BigQuery User and BigQuery Data Editor roles or equivalent permissions as those two roles.
And I have also created the airbyte_internal dataset.
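For reference, the permission asymmetry described above can be reproduced with the plain BigQuery Java client. This is a minimal sketch, assuming default credentials; probe_dataset is a made-up name, and only airbyte_internal comes from this thread:

```java
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Dataset;
import com.google.cloud.bigquery.DatasetInfo;

public class DatasetPermissionProbe {
  public static void main(String[] args) {
    // Uses the service account from GOOGLE_APPLICATION_CREDENTIALS.
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

    // Reading a pre-created dataset works with read access on the dataset alone.
    Dataset existing = bigquery.getDataset("airbyte_internal");
    System.out.println("airbyte_internal exists: " + (existing != null && existing.exists()));

    // Creating a dataset is what requires bigquery.datasets.create on the project.
    try {
      bigquery.create(DatasetInfo.newBuilder("probe_dataset").build());
    } catch (BigQueryException e) {
      // If the service account lacks project-level bigquery.datasets.create,
      // this throws the "Access Denied" error quoted above.
      System.out.println("create failed: " + e.getMessage());
    }
  }
}
```

If the account can read the existing dataset but lacks project-level create rights, the first call succeeds while the second fails with exactly the error quoted earlier, even though a sync against pre-created datasets never has to create anything.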
This topic has been created from a Slack thread to give it more visibility.
["workaround", "bigquery-datasets-create", "service-account", "sync-data", "permission", "bigquery-user", "bigquery-data-editor", "airbyte-internal-dataset"]
<@U06CZK6S9K8> I tried opening a PR, but I'm unable to find an airbyte-ci build that is compatible with my Intel MacBook Pro 2019; it does not support ARM64, so git push fails due to the pre-commit formatting checks, there is no Rosetta support, and I can't find any other solution.
```diff
diff --git a/airbyte-integrations/connectors/destination-bigquery/src/main/java/io/airbyte/integrations/destination/bigquery/BigQueryUtils.java b/airbyte-integrations/connectors/destination-bigquery/src/main/java/io/airbyte/integrations/destination/bigquery/BigQueryUtils.java
index 5202b9b7eb..d831fd8064 100644
--- a/airbyte-integrations/connectors/destination-bigquery/src/main/java/io/airbyte/integrations/destination/bigquery/BigQueryUtils.java
+++ b/airbyte-integrations/connectors/destination-bigquery/src/main/java/io/airbyte/integrations/destination/bigquery/BigQueryUtils.java
@@ -89,6 +89,7 @@ public class BigQueryUtils {
   public static Dataset getOrCreateDataset(final BigQuery bigquery, final String datasetId, final String datasetLocation) {
     Dataset dataset = bigquery.getDataset(datasetId);
     if (dataset == null || !dataset.exists()) {
+      checkHasCreateAndDeleteDatasetRole(bigquery, datasetId, datasetLocation);
       final DatasetInfo datasetInfo = DatasetInfo.newBuilder(datasetId).setLocation(datasetLocation).build();
       dataset = bigquery.create(datasetInfo);
     }
diff --git a/airbyte-integrations/connectors/destination-bigquery/src/main/kotlin/io/airbyte/integrations/destination/bigquery/BigQueryDestination.kt b/airbyte-integrations/connectors/destination-bigquery/src/main/kotlin/io/airbyte/integrations/destination/bigquery/BigQueryDestination.kt
index 73ce327846..619d23b3dd 100644
--- a/airbyte-integrations/connectors/destination-bigquery/src/main/kotlin/io/airbyte/integrations/destination/bigquery/BigQueryDestination.kt
+++ b/airbyte-integrations/connectors/destination-bigquery/src/main/kotlin/io/airbyte/integrations/destination/bigquery/BigQueryDestination.kt
@@ -69,9 +69,8 @@ class BigQueryDestination : BaseConnector(), Destination {
         val bigquery = getBigQuery(config)
         val uploadingMethod = getLoadingMethod(config)
-        checkHasCreateAndDeleteDatasetRole(bigquery, datasetId, datasetLocation)
```
Here is the minor change that should most likely fix the issue.
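For context on why the check demands these permissions at all: judging by the method name in the diff, the probe presumably creates a throwaway dataset and then deletes it again. A paraphrased sketch in Java, where the temp-name suffix and everything beyond the method signature are assumptions rather than the actual Airbyte source:

```java
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.DatasetId;
import com.google.cloud.bigquery.DatasetInfo;

class CheckProbeSketch {
  // Roughly what a create/delete role probe has to do: both calls below need
  // project-level permissions (bigquery.datasets.create / bigquery.datasets.delete),
  // even when every dataset the sync will actually touch already exists.
  static void probeCreateAndDeleteDatasetRole(final BigQuery bigquery,
                                              final String datasetId,
                                              final String datasetLocation) {
    final String tmpDatasetId = datasetId + "_airbyte_check_" + System.currentTimeMillis(); // assumed naming
    bigquery.create(DatasetInfo.newBuilder(tmpDatasetId).setLocation(datasetLocation).build());
    bigquery.delete(DatasetId.of(tmpDatasetId));
  }
}
```

That is why moving the call inside getOrCreateDataset's creation branch, as in the diff, makes the probe run only when a dataset genuinely has to be created.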
<@U07713SM8J0> During the actual sync we skip creation if the dataset already exists. For the check this was missed here: https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-bigquery/src/main/java/io/airbyte/integrations/destination/bigquery/BigQueryUtils.java#L102. A quick workaround would be to create a dummy dataset for the check to pass; the actual dataset name used when creating the connection could then be a different one with lesser permissions. (Happy to accept a PR fixing the check part.)
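If anyone needs the “lesser permissions” setup mentioned above, access can also be granted as dataset-level ACL entries on each pre-created dataset instead of project-wide roles. A sketch using the Java client, with a made-up service-account address (this would be run by someone who already administers the dataset):

```java
import com.google.cloud.bigquery.Acl;
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Dataset;

import java.util.ArrayList;
import java.util.List;

public class GrantDatasetAccess {
  public static void main(String[] args) {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

    // The manually pre-created dataset from this thread.
    Dataset dataset = bigquery.getDataset("airbyte_internal");

    // Append a dataset-scoped WRITER entry for the sync service account
    // (hypothetical address) rather than granting project-level roles.
    List<Acl> acl = new ArrayList<>(dataset.getAcl());
    acl.add(Acl.of(new Acl.User("airbyte-sync@my-project.iam.gserviceaccount.com"),
        Acl.Role.WRITER));
    bigquery.update(dataset.toBuilder().setAcl(acl).build());
  }
}
```

Dataset-scoped grants cover the reads and writes a sync performs inside existing datasets; they cannot confer bigquery.datasets.create, which is project-level, so the connection check still needs the fix from the diff above.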