Getting multiple values from a parent stream in an Airbyte connector


This message is asking how to retrieve multiple values from a parent stream in Airbyte connector development.


This shows how to get a value from a parent stream. How can I get multiple values from the parent stream?

This topic has been created from a Slack thread to give it more visibility.

`["get-multiple-values", "parent-stream", "airbyte-connector"]`

You don’t; this is just to get the key you need to sync another stream, then the expectation would be that you’d join those tables in your data warehouse or modeling tool

In the case I am working on, multiple keys are needed from the parent stream to sync the next set of data. For instance, I need (org, space, page, content_id) to sync the next stream.

~So I haven’t tried this, but you could try using “Add New Parent Stream” and selecting the same one multiple times and adding that value with the different keys, which you can then assign their own names. I’d start there and see if it handles that correctly since they’re all references to the same stream or whether it tries to match all the values of one to all the others~

~The other thing that might be possible is to concatenate (either in the editor field or within the jinja with ~) multiple values together with a delimiter, then when you use that field split it out to the one you want. For example, the parent key could be:~
~{{ <|> }}:{{ }}:{{ }}:{{ config.content_id }}~

~{{ <|>~':'':'':'~config.content_id }}~

~. . . and then when you need to use it, you could try something like:~
~{{ stream_partition.parent_fields.split(':')[0] }}~

~I’m not entirely sure which jinja functions or filters it’ll let you use, but this might work if the multiple parent thing isn’t interpreted correctly~
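The delimiter idea above amounts to packing several values into one string and unpacking them by position. Here is a minimal plain-Python sketch of that round trip. It is not Airbyte or Jinja code, and the function names and sample values are hypothetical; it only illustrates the logic, including the caveat that the delimiter must not appear inside any value.

```python
DELIMITER = ":"


def pack_parent_key(org: str, space: str, page: str, content_id: str) -> str:
    """Join the four values into one delimited string (hypothetical helper)."""
    parts = [org, space, page, content_id]
    for part in parts:
        # A value containing the delimiter would make the split ambiguous.
        if DELIMITER in part:
            raise ValueError(f"value {part!r} contains the delimiter")
    return DELIMITER.join(parts)


def unpack_parent_key(packed: str) -> dict:
    """Split the packed string back into named fields."""
    org, space, page, content_id = packed.split(DELIMITER)
    return {"org": org, "space": space, "page": page, "content_id": content_id}


# Sample values are made up for illustration.
packed = pack_parent_key("acme", "docs", "home", "42")
fields = unpack_parent_key(packed)
```

Whether the Builder's template engine allows the equivalent `split` call is exactly the open question raised above, so treat this as a description of the idea rather than a confirmed workaround.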

My understanding is that it will call all combinations of the parent streams.
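To make that concern concrete, here is a small plain-Python sketch comparing "one partition per parent record" with "every combination of independently selected keys". It assumes the combination behavior described above, which this thread does not confirm, and the parent records are made-up sample data.

```python
from itertools import product

# Hypothetical parent records; each row is one valid (org, space, page) tuple.
parent_records = [
    {"org": "acme", "space": "docs", "page": "home"},
    {"org": "acme", "space": "blog", "page": "news"},
]

# Desired behavior: one partition per parent record, keys kept together.
per_record = [(r["org"], r["space"], r["page"]) for r in parent_records]

# Feared behavior: each key treated as its own parent stream,
# so every value is paired with every other value.
orgs = [r["org"] for r in parent_records]
spaces = [r["space"] for r in parent_records]
pages = [r["page"] for r in parent_records]
combinations = list(product(orgs, spaces, pages))

# Combinations that never existed in the parent data, e.g. ("acme", "docs", "news").
invalid = [c for c in set(combinations) if c not in per_record]
```

With two parent records and three keys, the combined form produces eight requests instead of two, and some of them pair a space with a page that does not belong to it.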

actually, I don’t think that will work, because I think you only have access to what you put as a primary key

but I’m not sure how that works when you specify all of those things as the primary key on the parent stream

from what I saw in that API, they don’t all require those values in all the requests. or if they do, maybe you could also set some of them in the connection config and not have to pull them from the stream. but definitely not as dynamic if you have to do that.

So I’m thinking that being able to specify an array of values to Record Selector’s Field Path (or maybe even support the ** style that Transformations use) would be a great feature request to handle this in Builder connectors, since a lot of connectors can’t be built in Builder only because they need a Custom Extractor for similar reasons. I’d double-check that a similar feature isn’t already there.

In the meantime, you could look into moving this from Builder to normal Low-Code following this tutorial, and could use the source-iterable manifest (airbyte/airbyte-integrations/connectors/source-iterable/source_iterable/manifest.yaml at master · airbytehq/airbyte · GitHub) as an example of a Low-Code connector that implements custom record extractors for record selection (specifically for the list_users, users, and events streams; these just extend the default DpathExtractor in …

From there, you could make a stream that contains ALL of the page IDs (maybe along with their parent/child IDs), and use that as the parent stream to pull the actual detailed data.
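As a rough illustration of that "one flat parent stream" idea, the sketch below flattens a nested response into records that each carry every key the child stream needs. It is plain Python, not Airbyte CDK code; the response shape, field names, and the `flatten_pages` helper are all hypothetical, and in a real connector this logic would live in a custom record extractor.

```python
from typing import Any, Dict, Iterator, List


def flatten_pages(orgs: List[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one record per page, carrying the IDs of its parents along."""
    for org in orgs:
        for space in org.get("spaces", []):
            for page in space.get("pages", []):
                yield {
                    "org": org["id"],
                    "space": space["id"],
                    "page": page["id"],
                    "content_id": page.get("content_id"),
                }


# Made-up nested response for illustration only.
response = [
    {
        "id": "acme",
        "spaces": [
            {"id": "docs", "pages": [{"id": "home", "content_id": "42"}]},
            {"id": "blog", "pages": [{"id": "news", "content_id": "77"}]},
        ],
    }
]

records = list(flatten_pages(response))
```

Each resulting record is a complete (org, space, page, content_id) tuple, so a child stream that reads from this parent never has to recombine keys from separate sources.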

I’ll ask around if anyone else has ideas on how you could do this within Builder, but I feel like Record Selector is probably the biggest driver of me having to jump out of Builder to add custom extractors right now.