Custom API Connector Issue with Record Parent Key

Summary

The user is facing an issue with a custom API connector where they are unable to include the record parent key in the final record output. They provided the desired data format and the current format provided by the API.


Question

Hello! I hope you’re all well. I’ve run into an issue while creating a custom API connector for (https://api.growthepie.xyz/v1/master.json). Specifically, I’m unable to get the record parent key included in the final record output. How do I do this? To clarify, I’m hoping to get the following data format:

```
[
   {
      "origin_key": "chain_name",
      "name": "....",
      "deployment": "....",
      "name_short": "...",
      "description": ".....",
      "da_layer": "-",
   }
]
```
from the current format provided by the API:
```
[  ....,
   chains: [
   "chain_name": {
        "name": "....",
        "deployment": "....",
        "name_short": "...",
        "description": ".....",
        "da_layer": "-",
      }
   ]
]
```


<br>

---

This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. [Click here](https://airbytehq.slack.com/archives/C021JANJ6TY/p1715118186165729) if you want to access the original thread.


<sub>
["custom-api-connector", "record-parent-key", "data-format", "api"]
</sub>

In your case I think the first would likely look like:
*, chains, *, {{ stream_partition.<identifier> }}

(although I can’t see the full nested structure in your format, so you may also need to combine this with the Record Filter)

When you say parent ID, I assumed this was a stream based on a parent stream—is that correct?

If so, the jinja I showed before will work in the same way as in the URL.

If not, and you merely mean the parent key in the JSON, it’ll need to be done a little differently.

Hm, I’m actually not sure that this is possible from me playing around with it a bit. The main issues are that the default DpathExtractor used for Record Selector doesn’t have a way of splitting the array up whilst preserving the key (at least that I’m aware of), and I also don’t see any way to reference the parent object (since record gets filtered to the Record Selector target).
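To make the problem concrete, here is a rough sketch of what a wildcard path selection does with a keyed object. It is a simplified stand-in written for this thread, not dpath or the DpathExtractor itself, and the `select` helper is a hypothetical name. The sample response is a trimmed-down version of the shape discussed here.

```python
from typing import Any, Iterator, List


def select(node: Any, path: List[str]) -> Iterator[Any]:
    """Yield every value reached by `path`; "*" matches any dict key or list index."""
    if not path:
        yield node
        return
    head, rest = path[0], path[1:]
    if head == "*":
        if isinstance(node, dict):
            children = list(node.values())  # only the values; the keys are dropped here
        elif isinstance(node, list):
            children = node
        else:
            children = []
        for child in children:
            yield from select(child, rest)
    elif isinstance(node, dict) and head in node:
        yield from select(node[head], rest)


# Trimmed-down version of the response shape from the thread.
response = {
    "chains": {
        "ethereum": {"name": "Ethereum", "symbol": "ETH"},
        "polygon_zkevm": {"name": "Polygon zkEVM", "symbol": "MATIC"},
    }
}

records = list(select(response, ["chains", "*"]))
for record in records:
    print(record)
```

Neither printed record contains `ethereum` or `polygon_zkevm`: once the wildcard has matched, the key that led to each value is no longer available to anything that runs afterwards.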

My initial thought for a workaround was to fake it and have one stream that gets the list of chains, then add a child stream for that which would extract the values for each parent (which would give you access to the current value in stream_partition). Unfortunately I can’t figure out how to get dpath to extract the object keys instead of the values, so I wasn’t able to get a workable parent stream list.

It’s possible that there may be a way to fake this using a list of values in Parameterized Requests, but otherwise I think you’d have to create and implement a CustomRecordExtractor (if you’re on OSS), unless anyone else has any better ideas for getting at that key.
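For the custom extractor route, the core logic would be small. The sketch below shows only the reshaping step as a plain function; `records_with_parent_key` and its parameters are hypothetical names, and wiring it into an Airbyte custom component is not shown and would need to be checked against the CDK docs. It assumes the parsed JSON body has the `chains` shape from the thread.

```python
from typing import Any, Dict, List, Mapping


def records_with_parent_key(
    body: Mapping[str, Any],
    container: str = "chains",
    key_field: str = "origin_key",
) -> List[Dict[str, Any]]:
    """Turn {"chains": {key: {...}}} into [{"origin_key": key, ...}, ...]."""
    keyed = body.get(container) or {}
    # The parent key goes first; a field with the same name inside the
    # record would overwrite it because it is unpacked afterwards.
    return [{key_field: key, **fields} for key, fields in keyed.items()]


# Trimmed-down version of the response body from the thread.
body = {
    "current_version": "v1",
    "chains": {
        "ethereum": {"name": "Ethereum", "deployment": "PROD"},
        "polygon_zkevm": {"name": "Polygon zkEVM", "deployment": "PROD"},
    },
}

records = records_with_parent_key(body)
for record in records:
    print(record)
```

Each record now carries its former parent key as `origin_key`, which is the output format asked for in the question.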

I suppose another option would be to leave it as just chains, and then disable the final output tables for the connection, which would at least get you the raw JSON of the chains to work with as a single row in a DB table. Not ideal, but maybe something.

Thank you for trying and the explanation! It’s very validating to know that my thought process/workaround testing was correct/logical. I think I may see what I can do with a record selection of just chains.

So I think you need to do two things here. First, you need to update the record selector to incorporate the parent ID (using the same Jinja you’re using to pull it into the stream URL). It can be a little tricky the first time working with mixed nested objects and arrays, but there are more [details noted here](https://docs.airbyte.com/connector-development/config-based/understanding-the-yaml-file/record-selector).

Second, you can go to the bottom and enable the Transformations panel, then change the type on the right side to Add Field, then set the name to what you want and the value to that same Jinja. This is actually covered in a different section of the same page above, under [Transformations](https://docs.airbyte.com/connector-development/config-based/understanding-the-yaml-file/record-selector). As noted there:
> Another common use case of the “add” transformation is the enriching of records with their parent resource - check out the Partitioning documentation for more details.

Thank you! I definitely appreciate the help. I think further clarification is necessary. The response I get from the API looks like:

```
{
  "status": 200,
  "body": {
    "current_version": "v1",
    "default_chain_selection": [
      "arbitrum",
      "base",
      "zksync_era",
      "optimism",
      "linea"
    ],
    "chains": {
      "ethereum": {
        "name": "Ethereum",
        "deployment": "PROD",
        "name_short": "Ethereum",
        "description": "Ethereum was proposed by Vitalik Buterin in 2013 and launched in 2015. It is arguably the most decentralized smart contract platform to date. The goal is to scale Ethereum through the usage of Layer 2s.",
        "da_layer": "-",
        "symbol": "ETH",
        "bucket": "Layer 1",
        "technology": "Mainnet",
        "purpose": "General Purpose (EVM)",
        "launch_date": "2015-07-30",
        "l2beat_stage": null,
        "raas": null,
        "website": "https://ethereum.org/",
        "twitter": "https://twitter.com/ethereum",
        "block_explorer": "https://etherscan.io/",
        "rhino_listed": true,
        "rhino_naming": "ETHEREUM"
      },
      "polygon_zkevm": {
        "name": "Polygon zkEVM",
        "deployment": "PROD",
        "name_short": "Polygon",
        "description": "Polygon zkEVM uses zero-knowledge proofs to enable faster and cheaper transactions. It allows users to build and run EVM-compatible smart contracts. It's fully compatible with the Ethereum Virtual Machine, making it easy for developers to migrate their applications to the Polygon network. It launched in March 2023.",
        "da_layer": "Ethereum (calldata)",
        "symbol": "MATIC",
        "bucket": "ZK-Rollups",
        "technology": "ZK Rollup",
        "purpose": "General Purpose (EVM)",
        "launch_date": "2023-03-24",
        "l2beat_stage": "Stage 0",
        "raas": null,
....
```
Looking at this response I use `chains,*` in the Record Selector to get the following format:
```
[
 {
    "name": "Ethereum",
    "deployment": "PROD",
    "name_short": "Ethereum",
    "description": "Ethereum was proposed by Vitalik Buterin in 2013 and launched in 2015. It is arguably the most decentralized smart contract platform to date. The goal is to scale Ethereum through the usage of Layer 2s.",
    "da_layer": "-",
    "symbol": "ETH",
    "bucket": "Layer 1",
    "technology": "Mainnet",
    "purpose": "General Purpose (EVM)",
    "launch_date": "2015-07-30",
    "website": "https://ethereum.org/",
    "twitter": "https://twitter.com/ethereum",
    "block_explorer": "https://etherscan.io/",
    "rhino_listed": true,
    "rhino_naming": "ETHEREUM"
  },
  {
    "name": "Polygon zkEVM",
    "deployment": "PROD",
    "name_short": "Polygon",
    "description": "Polygon zkEVM uses zero-knowledge proofs to enable faster and cheaper transactions. It allows users to build and run EVM-compatible smart contracts. It's fully compatible with the Ethereum Virtual Machine, making it easy for developers to migrate their applications to the Polygon network. It launched in March 2023.",
    "da_layer": "Ethereum (calldata)",
    "symbol": "MATIC",
    "bucket": "ZK-Rollups",
    "technology": "ZK Rollup",
    "purpose": "General Purpose (EVM)",
    "launch_date": "2023-03-24",
    "l2beat_stage": "Stage 0",
    "website": "https://polygon.technology/polygon-zkevm",
    "twitter": "https://twitter.com/0xPolygon",
    "block_explorer": "https://zkevm.polygonscan.com/",
    "rhino_listed": true,
    "rhino_naming": "ZKEVM"
  },
.....
```
This is already pretty good, but I want the parent key for each of these records included, so I get something like:
```
[
 {
    "origin_key": "ethereum",
    "name": "Ethereum",
    "deployment": "PROD",
    "name_short": "Ethereum",
    "description": "Ethereum was proposed by Vitalik Buterin in 2013 and launched in 2015. It is arguably the most decentralized smart contract platform to date. The goal is to scale Ethereum through the usage of Layer 2s.",
    "da_layer": "-",
    "symbol": "ETH",
    "bucket": "Layer 1",
    "technology": "Mainnet",
    "purpose": "General Purpose (EVM)",
    "launch_date": "2015-07-30",
    "website": "https://ethereum.org/",
    "twitter": "https://twitter.com/ethereum",
    "block_explorer": "https://etherscan.io/",
    "rhino_listed": true,
    "rhino_naming": "ETHEREUM"
  },
  {
    "origin_key": "polygon_zkevm",
    "name": "Polygon zkEVM",
    "deployment": "PROD",
    "name_short": "Polygon",
    "description": "Polygon zkEVM uses zero-knowledge proofs to enable faster and cheaper transactions. It allows users to build and run EVM-compatible smart contracts. It's fully compatible with the Ethereum Virtual Machine, making it easy for developers to migrate their applications to the Polygon network. It launched in March 2023.",
    "da_layer": "Ethereum (calldata)",
    "symbol": "MATIC",
    "bucket": "ZK-Rollups",
    "technology": "ZK Rollup",
    "purpose": "General Purpose (EVM)",
    "launch_date": "2023-03-24",
    "l2beat_stage": "Stage 0",
    "website": "https://polygon.technology/polygon-zkevm",
    "twitter": "https://twitter.com/0xPolygon",
    "block_explorer": "https://zkevm.polygonscan.com/",
    "rhino_listed": true,
    "rhino_naming": "ZKEVM"
  },
.....
```
I tried using the transformation section to add a field but have no idea how to call on the record parent key/identifier. Is this even possible? Let me know if there's anything that you want me to confirm/correct.

Ahh. Apologies for the poor word choice. This is not a stream based on a parent stream. All the necessary details are within the JSON response of a single stream. Your latter interpretation is correct: I’m trying to get the parent key within the JSON.