Getting multiple values from a parent stream in Airbyte connector

slack-user-airbyte · May 14, 2024, 2:32pm

Summary

This message is asking how to retrieve multiple values from a parent stream in Airbyte connector development.

Question

This shows how to get a value from a parent stream. How can I get multiple values from the parent stream?

https://docs.airbyte.com/connector-development/connector-builder-ui/partitioning#example-1

This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here.

Tags: get-multiple-values, parent-stream, airbyte-connector

slack-user-airbyte · May 14, 2024, 4:32pm

You don’t; this is only meant to get the key you need to sync another stream. The expectation is that you’d then join those tables in your data warehouse or modeling tool.

slack-user-airbyte · May 14, 2024, 4:32pm

In the case I am working on, multiple keys are needed from the parent stream to sync the next set of data. For instance, I need (org, space, page, content_id) to sync the next stream.

slack-user-airbyte · May 14, 2024, 4:32pm

~So I haven’t tried this, but you could try using “Add New Parent Stream” and selecting the same one multiple times and adding that value with the different keys, which you can then assign their own names. I’d start there and see if it handles that correctly since they’re all references to the same stream or whether it tries to match all the values of one to all the others~

slack-user-airbyte · May 14, 2024, 4:32pm

~The other thing that might be possible is to concatenate (either in the editor field or within the jinja with ~) multiple values together with a delimiter, then when you use that field split it out to the one you want. For example, the parent key could be:~
~{{ config.org }}:{{ config.space }}:{{ config.page }}:{{ config.content_id }}~

~or:~
~{{ config.org~':'~config.space~':'~config.page~':'~config.content_id }}~

~. . . and then when you need to use it, you could try something like:~
~{{ stream_partition.parent_fields.split(':')[0] }}~

~I’m not entirely sure which jinja functions or filters it’ll let you use, but this might work if the multiple parent thing isn’t interpreted correctly~
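
For readers following the concatenate-then-split idea above, here is a minimal plain-Python sketch of the mechanics only. It does not use Jinja or the Airbyte CDK, and it does not show that Builder exposes such a combined value. The helper names `pack_key` and `unpack_key` and the sample record are hypothetical; the field names come from the example in this thread.

```python
# Sketch of packing several parent values into one delimited key and
# splitting them back out. All names here are illustrative assumptions.
DELIMITER = ":"
FIELDS = ("org", "space", "page", "content_id")


def pack_key(record: dict) -> str:
    """Join the parent fields into a single delimited string."""
    values = [str(record[name]) for name in FIELDS]
    for value in values:
        # A value containing the delimiter would make the key ambiguous.
        if DELIMITER in value:
            raise ValueError(f"value {value!r} contains the delimiter")
    return DELIMITER.join(values)


def unpack_key(key: str) -> dict:
    """Split a packed key back into named fields (all values are strings)."""
    parts = key.split(DELIMITER)
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} parts, got {len(parts)}")
    return dict(zip(FIELDS, parts))


parent = {"org": "acme", "space": "docs", "page": "42", "content_id": "c-7"}
key = pack_key(parent)
print(key)
print(unpack_key(key)["space"])
```

The delimiter check matters: if any ID can contain `:`, the split no longer recovers the original values, so a different delimiter or a different approach is needed.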

slack-user-airbyte · May 14, 2024, 4:32pm

My understanding is that it will call all combinations of the parent streams.

slack-user-airbyte · May 14, 2024, 4:32pm

actually, I don’t think that will work, because I think you only have access to what you put as a primary key

slack-user-airbyte · May 14, 2024, 4:32pm

but I’m not sure how that works when you specify all of those things as the primary key on the parent stream

slack-user-airbyte · May 14, 2024, 4:32pm

from what I saw in that API, they don’t all require those values in all the requests. or if they do, maybe you could also set some of them in the connection config and not have to pull them from the stream. but definitely not as dynamic if you have to do that.

slack-user-airbyte · May 14, 2024, 4:32pm

So I’m thinking that being able to specify an array of values in Record Selector’s Field Path (or maybe even supporting the ** style that Transformations use) would be a great feature request to handle this in Builder connectors, since the only reason a lot of connectors can’t be built in Builder is the need to add a Custom Extractor for similar reasons. I’d double-check that a similar feature isn’t already there.

In the meantime, you could look into moving this from Builder to normal Low-Code following this tutorial (https://docs.airbyte.com/connector-development/config-based/low-code-cdk-overview), and could use the source-iterable manifest.yaml (https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-iterable/source_iterable/manifest.yaml) as an example of a Low-Code connector that implements custom record extractors for record selection. Specifically, look at the list_users, users, and events streams; these just extend the default DpathExtractor in components.py (https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-iterable/source_iterable/components.py).

From there, you could make a stream that contains ALL of the page IDs (maybe along with their parent/child IDs), and use that as the parent stream to pull the actual detailed data.
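
As a rough illustration of that idea, the sketch below flattens a nested response into one record per content item, with every ancestor ID attached, so a child stream can read all four values from a single parent record. It is plain Python and not the CDK extractor interface. The response shape, the function names `flatten_pages` and `child_path`, and the URL pattern are all assumptions made up for this example.

```python
from typing import Iterable, Iterator


def flatten_pages(orgs: Iterable[dict]) -> Iterator[dict]:
    """Yield one flat record per content item, carrying all ancestor IDs."""
    for org in orgs:
        for space in org.get("spaces", []):
            for page in space.get("pages", []):
                for content_id in page.get("content_ids", []):
                    yield {
                        "org": org["id"],
                        "space": space["id"],
                        "page": page["id"],
                        "content_id": content_id,
                    }


def child_path(partition: dict) -> str:
    """Build a child request path from one parent record (hypothetical URL)."""
    return "/orgs/{org}/spaces/{space}/pages/{page}/content/{content_id}".format(
        **partition
    )


# Hypothetical nested API response used only to exercise the sketch.
sample = [
    {
        "id": "acme",
        "spaces": [
            {
                "id": "docs",
                "pages": [
                    {"id": "p1", "content_ids": ["c1", "c2"]},
                    {"id": "p2", "content_ids": ["c3"]},
                ],
            }
        ],
    }
]

records = list(flatten_pages(sample))
for record in records:
    print(child_path(record))
```

In a Low-Code connector, the flattening step is the kind of logic a custom record extractor would hold, and each flattened record would then drive one request of the child stream.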

I’ll ask around if anyone else has ideas on how you could do this within Builder, but I feel like Record Selector is probably the biggest driver of me having to jump out of Builder to add custom extractors right now.
