Nested Data Unwrapping in Connector Builder

Summary

Exploring a way to unwrap nested data in Connector Builder to sync content from each parent stream individually.


Question

Trying not to give up on the Connector Builder, I thought I had an idea for how to unwrap nested data. Is there a way to avoid calling the child stream for every combination of the two parent streams? I just want it to sync the content from each parent stream individually, not from all combinations of the two.



This topic has been created from a Slack thread to give it more visibility.

`["nested-data-unwrapping", "connector-builder", "parent-stream", "sync"]`

Defining multiple parent streams on one child stream always results in the child stream being called once for every combination of the parents' keys.

To iterate over each parent stream’s key just once, you’ll need to create two separate child streams and point each one to a single parent.
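The difference between the two setups can be sketched in plain Python. This is only a conceptual model of the behavior described above, not Airbyte's actual code; the key names (`parent_a_id`, `parent_b_id`) and the ID values are hypothetical.

```python
from itertools import product

# Hypothetical parent keys, used only for illustration.
parent_a_ids = ["a1", "a2"]
parent_b_ids = ["b1", "b2", "b3"]

# One child stream with two parent streams:
# the child is called once per combination of parent keys.
combined_slices = [
    {"parent_a_id": a, "parent_b_id": b}
    for a, b in product(parent_a_ids, parent_b_ids)
]

# Two child streams, each pointing to a single parent:
# each parent key is visited exactly once.
child_a_slices = [{"parent_a_id": a} for a in parent_a_ids]
child_b_slices = [{"parent_b_id": b} for b in parent_b_ids]

print(len(combined_slices))                       # 2 * 3 combinations
print(len(child_a_slices) + len(child_b_slices))  # 2 + 3 single-parent calls
```

With two and three parent keys, the combined setup makes six calls while the two separate child streams make five in total, and the gap grows quickly as the parents get larger.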

You mentioned doing this to “unwrap nested data” - could you expand on this more? What is the shape of the data in each of your parents that you are trying to iterate over for your child stream?

I am trying to unwrap GitBook data in which the pages arrays are nested to a depth that is not known ahead of time:

```
{
    "id": "10000",
    "pages": [
        {
            "id": "20000",
            "pages": []
        },
        {
            "id": "21000",
            "pages": [
                {
                    "id": "30000",
                    "pages": []
                },
                {
                    "id": "31000",
                    "pages": []
                }
            ]
        }
    ]
}
```

I am looking for a way to get the “contents” of all the pages and basically flatten the structure with Airbyte.

This is what I currently have:
https://github.com/airbytehq/airbyte/discussions/38072

But I am stuck on how to proceed. Airbyte is so close to being able to handle this, but it does not seem possible with the Connector Builder.

Unfortunately, the Connector Builder currently does not offer a way to flatten arbitrarily nested API responses like this into a list of records. I have noted this request in our backlog.

For now, your best option is to implement a CustomRecordExtractor with the behavior you need and use it in your connector.
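As a rough sketch of the flattening logic such an extractor would need, the function below walks the nested `pages` arrays recursively and yields one flat record per page. It is plain Python and deliberately leaves out the Airbyte wiring (base class, method signature, manifest entry), which should be taken from the custom components documentation. The function name `flatten_pages`, the added `parent_id` field, and the choice to emit the root object as a record are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterator, Optional


def flatten_pages(
    node: Dict[str, Any], parent_id: Optional[str] = None
) -> Iterator[Dict[str, Any]]:
    """Yield one flat record per page, depth-first, without the nested "pages" key."""
    # Copy every field except the nested children.
    record = {key: value for key, value in node.items() if key != "pages"}
    # Hypothetical extra field so the hierarchy can be rebuilt downstream.
    record["parent_id"] = parent_id
    yield record
    # Recurse into children; a missing or null "pages" is treated as empty.
    for child in node.get("pages") or []:
        yield from flatten_pages(child, parent_id=node.get("id"))


# Sample payload with the same shape as the GitBook example in this thread.
response_body = {
    "id": "10000",
    "pages": [
        {"id": "20000", "pages": []},
        {
            "id": "21000",
            "pages": [
                {"id": "30000", "pages": []},
                {"id": "31000", "pages": []},
            ],
        },
    ],
}

records = list(flatten_pages(response_body))
ids = [record["id"] for record in records]
print(ids)
```

Inside a custom extractor, the parsed JSON body of the response would be passed to `flatten_pages` and the resulting records returned in place of the nested structure.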

Here is the documentation on custom components: https://docs.airbyte.com/connector-development/config-based/advanced-topics#custom-components

And here is the guide on running the Builder with support for custom components (currently only supported on local deployments): https://github.com/airbytehq/airbyte-platform/blob/main/airbyte-connector-builder-server/README.md#running-the-platform-with-support-for-custom-components-docker-compose-only