Appending values to response body using transformation for debugging

Summary

A user asks whether a transformation can append values like Page Number and Page Size to the response body, to help debug a sync in which records are being silently dropped.


Question

Also, is it possible to append values like Page Number and Page Size to the response body using a transformation? This would be extremely helpful for debugging my situation where records are being silently dropped.



This topic has been created from a Slack thread to give it more visibility.


Based on the information provided in the extracts, it seems that Airbyte does not directly support appending values like Page Number and Page Size to the response body using a transformation.

However, you can use the “Inject page size / limit / offset into outgoing HTTP request” option in the pagination form for most cases. If the API has special requirements that can’t be handled this way, you can disable injection in the pagination form and use the generic parameter section at the bottom of the stream configuration form to freely configure query parameters, headers, and properties of the JSON body, using Jinja expressions and the available variables. You can also use these variables as part of the URL path.

The documentation gives an example where the next_page_token variable is used to configure a query parameter with key limit and value {{ next_page_token['next_page_token'] or '0' }},50, which injects the offset from the pagination strategy and a hardcoded limit of 50 into the same parameter. A related snippet shows the same variable used inside a JSON structure:

    {
      "query": { ... },
      "sort": "{'field': 'updated_at', 'order': 'ascending'}",
      "pagination": {
        "starting_after": "{{ next_page_token.get('next_page_token').get('starting_after') }}",
        "per_page": 10
      }
    }

However, these settings only affect the outgoing request; they do not append the values to the response body. For more complex transformations, you might need to perform them in the downstream warehouse after the data has been loaded.

If you are experiencing issues with records being dropped, it might be worth checking whether there is a problem with the pagination or whether the records are being filtered out during extraction. If you are still having trouble, consider opening a new issue on the Airbyte GitHub or asking for help on the Airbyte forum.

What would be the use case for this specific request?
It is probably possible in the CDK: if you pass the request URL along as a return value, you can slice it as needed. I have not tried this, tbh.
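A rough sketch of the "slice the request URL" idea, using only the Python standard library. It assumes the paging values travel as query parameters; the parameter names `page` and `page_size`, the `_request_url` / `_page_number` / `_page_size` field names, and the example URL are all hypothetical and would need to match the real API.

```python
from urllib.parse import parse_qs, urlsplit


def annotate_records(request_url, records, page_param="page", size_param="page_size"):
    """Return copies of records tagged with paging values read from the request URL.

    page_param and size_param are hypothetical query parameter names;
    replace them with whatever the source API actually uses.
    """
    query = parse_qs(urlsplit(request_url).query)
    # parse_qs maps each key to a list of values; take the first, or None if absent.
    page = query.get(page_param, [None])[0]
    size = query.get(size_param, [None])[0]
    return [
        {**record, "_request_url": request_url, "_page_number": page, "_page_size": size}
        for record in records
    ]


# Example with a made-up URL and payload.
url = "https://api.example.com/items?page=3&page_size=1000"
annotated = annotate_records(url, [{"id": 1}, {"id": 2}])
print(annotated[0])
```

In a Python CDK stream, a helper like this could presumably be called from the method that parses the HTTP response, since a `requests.Response` exposes the originating URL as `response.request.url`. As noted above, this has not been tried in this thread.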

Hey Jose, the use case for making source request metadata available in the destination would be to ensure completeness and to aid in debugging. For example, if there are 500 pages, each with 1000 records, having this metadata in Snowflake would allow me to quickly verify and isolate any issue with missing records (like the one I’m currently suffering from; see prior comment).
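To illustrate the completeness check described here, a minimal sketch follows. It assumes each loaded row carries a page number in a hypothetical `_page_number` column, that pages are numbered from 1, and that the expected page count and page size are known in advance; the sample rows are made up.

```python
from collections import Counter


def find_incomplete_pages(rows, expected_pages, page_size):
    """Return {page_number: record_count} for pages that are missing or short."""
    counts = Counter(int(row["_page_number"]) for row in rows)
    return {
        page: counts.get(page, 0)
        for page in range(1, expected_pages + 1)
        if counts.get(page, 0) < page_size
    }


# Tiny made-up sample: 3 pages of 2 records expected;
# page 2 lost one record and page 3 is absent entirely.
rows = [
    {"id": 1, "_page_number": "1"},
    {"id": 2, "_page_number": "1"},
    {"id": 3, "_page_number": "2"},
]
print(find_incomplete_pages(rows, expected_pages=3, page_size=2))  # {2: 1, 3: 0}
```

The final page of a real API may legitimately hold fewer records than the page size, so a short last page is not necessarily a dropped record. The same grouping could be written as a `GROUP BY` on the page number column in the warehouse.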

Does the source API have a total count? Some API endpoints provide a total count of elements and a total number of pages. Are you using Low Code or the CDK?

No, the target API doesn’t return any helpful metadata like this. It would be nice if it did. I’m using the Low Code approach right now.