Custom Connector Builder - Issue with Dataset Creation in BigQuery

Summary

When using the custom connector builder to create a connector with 2 streams and 1 connection in Airbyte, only 1 dataset is being created in BigQuery instead of the expected 2 datasets.


Question

Hi <@U04SE163WC8>, I created a custom connector using the builder with 2 streams. I have 1 connection with 1 source from that custom connector, and this connection has both streams enabled. I get only 1 dataset and 2 raw datasets in BigQuery.
How come only 1 dataset is created? I expected 2.




["custom-connector-builder", "dataset-creation", "bigquery", "streams", "connection"]

In Airbyte, a connection links a configured source to a configured destination to perform syncs. It defines things like the replication frequency and which streams to replicate. However, the number of datasets created in your destination (BigQuery in your case) does not directly correspond to the number of streams in your connection.

The datasets in BigQuery are not created per stream, but rather per namespace. A namespace in Airbyte can be thought of as a container for a set of streams. Therefore, if your two streams are under the same namespace, they will be placed in the same dataset in BigQuery.

The raw datasets you’re seeing are part of Airbyte’s two-step process for loading data into destinations like BigQuery. In the first step, records are loaded into raw tables in a raw schema. In the second step, the data is normalized according to the destination’s normalization rules, and then loaded into the final tables in the dataset.

If you want your streams to be in separate datasets, you would need to configure them to use different namespaces. However, this might not be possible depending on the specifics of your custom connector and source.

For more information, you can refer to the Airbyte documentation and the BigQuery destination documentation.
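To make the namespace-to-dataset idea concrete, here is a minimal Python sketch. It is not Airbyte code: the function `group_streams_by_dataset`, the stream names, and the dataset name `my_dataset` are all hypothetical, and the sketch only assumes that streams without a namespace fall back to one default dataset.

```python
from collections import defaultdict


def group_streams_by_dataset(streams, default_dataset):
    """Group stream names by target dataset.

    Assumption for illustration: a stream goes to the dataset named after
    its namespace, or to the default dataset when no namespace is set.
    """
    datasets = defaultdict(list)
    for stream in streams:
        dataset = stream.get("namespace") or default_dataset
        datasets[dataset].append(stream["name"])
    return dict(datasets)


# Hypothetical streams: neither one sets a namespace.
streams = [
    {"name": "customers", "namespace": None},
    {"name": "orders", "namespace": None},
]

grouped = group_streams_by_dataset(streams, default_dataset="my_dataset")
print(grouped)
```

Under these assumptions both streams end up as separate tables in a single dataset, which matches what the thread describes: one dataset is expected, and the number to check is the number of tables.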

<@U04SE163WC8> my bad, I wanted to say 2 tables. I get 2 raw tables but only 1 table with the data of 1 of my streams. I expected a 2nd table to be created corresponding to the 2nd stream.

A message from kapa.ai

In Airbyte, each stream from your source should be mapped to a separate table in your destination, in this case, BigQuery. If you have two streams enabled in your connection, you should indeed see two tables in BigQuery, each corresponding to one of your streams. If you’re only seeing one table, there could be a few reasons for this:

1. There might be an issue with the data in the second stream. If the data doesn’t meet the requirements of BigQuery or if there’s an error during the sync process, the table might not be created. Check the logs for any error messages related to this stream.
2. The second stream might not have any new or updated data since the last sync. Airbyte only writes data to the destination for new or updated records. If there’s no new data in the second stream, the corresponding table might not be created in BigQuery.
3. There might be a configuration issue with your connection or your custom connector. Make sure that both streams are properly configured and enabled in your connection.

To troubleshoot this issue, you can start by checking the logs for your sync operation. Look for any error messages or warnings related to the second stream. You can also try running a manual sync and see if the second table is created.

For more information on how Airbyte handles streams and tables, you can refer to the Airbyte documentation and the BigQuery destination documentation.
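The log check suggested above can be sketched in a few lines of Python. This is only an illustration: the helper `lines_for_stream`, the level markers, and the sample log lines are made up, and real Airbyte log lines will look different.

```python
def lines_for_stream(log_lines, stream_name, levels=("ERROR", "WARN")):
    """Keep log lines that mention the stream and contain one of the level markers."""
    return [
        line
        for line in log_lines
        if stream_name in line and any(level in line for level in levels)
    ]


# Placeholder log lines, not real Airbyte output.
sample_log = [
    "INFO sync started",
    "WARN stream orders: placeholder warning text",
    "INFO stream customers: records emitted",
]

matches = lines_for_stream(sample_log, "orders")
for line in matches:
    print(line)
```

In practice you would read the lines from a downloaded sync log and pass the name of the stream whose table is missing.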

Thanks, the syncs were good, but a log line showed there were no properties set for that specific stream. Fixed that and it’s working.
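For anyone hitting the same symptom, a quick check along these lines may help spot a stream whose declared schema has no properties. This is a sketch under assumptions: the schemas are represented as plain JSON-schema-style dictionaries, and the function name and sample streams are hypothetical rather than taken from the connector in this thread.

```python
def streams_missing_properties(schemas):
    """Return the names of streams whose schema declares no properties."""
    return [
        name
        for name, schema in schemas.items()
        if not schema.get("properties")
    ]


# Hypothetical declared schemas for two streams.
schemas = {
    "customers": {"type": "object", "properties": {"id": {"type": "integer"}}},
    "orders": {"type": "object", "properties": {}},
}

missing = streams_missing_properties(schemas)
print(missing)
```

A stream reported here would be the first one to inspect in the builder's schema settings.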