Bug Report: Cursor Pagination Documentation vs Implementation

slack-user-airbyte · December 9, 2024, 5:00pm

Summary

User reports a discrepancy between documented and actual behavior of cursor pagination in API calls, specifically related to URL path handling and request formation.

Question

Hi — I’d like to report a bug, either in the documentation or in the implementation.

The Cursor Pagination documentation (https://arc.net/l/quote/gvpraegr) states (emphasis mine):

For cursor pagination, if path is selected as the Inject into option, then the entire request URL for the subsequent request will be replaced by the cursor value.
(author’s note: if this is the intended behavior, this injection method should be called URL, not Path)

In practice, this isn’t the case. Take a look at the attached photo.

Here are the relevant variables:
• API Base URL: https://coda.io/apis/v1
• Stream URL Path: /docs
• nextPageLink: https://coda.io/apis/v1/docs/wgFd-3K0OL/pages?pageToken=<token>
• Requested URL: https://coda.io/docs/wgFd-3K0OL/pages?pageToken=<token>
As you can see, the nextPageLink includes /apis/v1, but the requested URL does not.

I suspect (with no evidence and very little conviction) that your algorithm is trying to do some sort of path replacement, with the ultimate effect of removing the API Base URL’s path component from the nextPageLink.
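To make the suspicion above concrete, here is a minimal Python sketch that reproduces the observed URL from the values listed. It is only an illustration of the hypothesis, not Airbyte's actual code: `strip_base_path` is a hypothetical helper, and `TOKEN` stands in for the real page token.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_base_path(base_url: str, next_page_link: str) -> str:
    """Hypothetical helper: drop the base URL's path prefix from the next-page link."""
    base = urlsplit(base_url)
    link = urlsplit(next_page_link)
    path = link.path
    # If the link's path starts with the base URL's path (/apis/v1), cut it off.
    if base.path and path.startswith(base.path):
        path = path[len(base.path):]
    return urlunsplit((link.scheme, link.netloc, path, link.query, link.fragment))


base_url = "https://coda.io/apis/v1"
next_page_link = "https://coda.io/apis/v1/docs/wgFd-3K0OL/pages?pageToken=TOKEN"

requested = strip_base_path(base_url, next_page_link)
print(requested)  # https://coda.io/docs/wgFd-3K0OL/pages?pageToken=TOKEN
```

The result matches the Requested URL reported above, which is consistent with (but does not prove) the path-replacement theory.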



slack-user-airbyte · December 13, 2024, 8:22am

Out of curiosity, what part of your URL is in the global config (API Base URL) and what part is in the stream-level config?

That may be a workaround for now (moving the prefix out), but I’d agree that this is unexpected behavior the Airbyte team should look into. I’m just thinking maybe we can define the problem a little better and get a GitHub issue in.

slack-user-airbyte · December 13, 2024, 8:22am

I got it working by extracting the token and re-appending it as a query string, but I agree this definitely seems like a bug and should be addressed either in code or in docs.

To answer your questions:
• The API Base URL in the global configuration is: https://coda.io/apis/v1
• The stream configuration adds /docs to the base URL.
I’ve updated my original post to reflect the stream path.

Let me know if I didn’t interpret your question correctly!
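For anyone hitting the same thing, the workaround described above (pull the token out of nextPageLink and send it as a query parameter instead of injecting the whole link) can be sketched in plain Python. This is only an illustration of the idea outside the Builder: the helper names are hypothetical, and `abc123` is a placeholder token.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def extract_page_token(next_page_link: str, param: str = "pageToken") -> Optional[str]:
    """Return the value of `param` from the link's query string, or None if absent."""
    query = parse_qs(urlsplit(next_page_link).query)
    values = query.get(param)
    return values[0] if values else None


def next_request_url(stream_url: str, next_page_link: str, param: str = "pageToken") -> Optional[str]:
    """Re-append the extracted token to the stream URL as a query parameter."""
    token = extract_page_token(next_page_link, param)
    if token is None:
        return None  # no token means there is no further page to request
    return f"{stream_url}?{urlencode({param: token})}"


stream_url = "https://coda.io/apis/v1/docs"  # API Base URL + stream path from this thread
link = "https://coda.io/apis/v1/docs/wgFd-3K0OL/pages?pageToken=abc123"

url = next_request_url(stream_url, link)
print(url)  # https://coda.io/apis/v1/docs?pageToken=abc123
```

Because the request URL is built from the configured base URL and stream path, the /apis/v1 prefix can no longer be dropped.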

slack-user-airbyte · December 13, 2024, 8:22am

hm, definitely a weird one. I’ve been fighting this feature this week while trying to refactor the broken Pardot connector, but my issue is that when it says “replaced” it’s definitely merging . . . as other parameters still get appended. And Pardot’s v5 API doesn’t allow those parameters when a nextPageToken is passed

Took a lot of handling to get that right, but I agree that it should both be URL (or maybe a separate option for URL and Path since different APIs provide one or the other) . . . and then a separate option on whether to merge other parameters or not, since Pardot isn’t the only API that expects only the token on paging requests.
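A rough Python sketch of the "merge or not" option suggested above, in case it helps frame a feature request. The function and the `token_only` flag are hypothetical, and the base parameters are made-up examples; only the `nextPageToken` name comes from this thread.

```python
from typing import Dict, Optional


def paging_params(
    base_params: Dict[str, str],
    next_page_token: Optional[str],
    token_only: bool,
) -> Dict[str, str]:
    """Build query parameters for a request, optionally sending only the page token."""
    if next_page_token is None:
        # First page: send the normal stream parameters.
        return dict(base_params)
    if token_only:
        # APIs that reject extra parameters on paging requests get the token alone.
        return {"nextPageToken": next_page_token}
    # Merge behavior: the token is added on top of the existing parameters.
    return {**base_params, "nextPageToken": next_page_token}


base = {"fields": "id,name", "limit": "100"}  # hypothetical stream parameters

merged = paging_params(base, "tok", token_only=False)
token_alone = paging_params(base, "tok", token_only=True)
print(merged)       # {'fields': 'id,name', 'limit': '100', 'nextPageToken': 'tok'}
print(token_alone)  # {'nextPageToken': 'tok'}
```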

slack-user-airbyte · December 13, 2024, 8:22am

I’m very new to connectors (like 3 hours under my belt), and am only working on this because the existing Coda connector is broken. I agree it’s a bit frustrating.

While I have you, do you have any idea of whether it’s possible to store some data between runs?

Coda returns a “nextSyncToken” which isn’t a timestamp, but a random string that encodes a given data state.

For incremental runs, I would want to pass that token so that Coda only returns new data since that sync token. Notably, the sync token is not a timestamp — it’s a hash.

So ideally I’d be able to include that as a query string, but I need to store it between runs.
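To illustrate what I mean by storing the token between runs, here is a generic Python sketch. It does not use any Airbyte API: the JSON state file, the helper names, and the `syncToken` query parameter name are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Optional


def load_sync_token(state_file: Path) -> Optional[str]:
    """Read the opaque sync token saved by the previous run, if any."""
    if not state_file.exists():
        return None
    return json.loads(state_file.read_text()).get("sync_token")


def save_sync_token(state_file: Path, token: str) -> None:
    """Persist the nextSyncToken from the latest response for the next run."""
    state_file.write_text(json.dumps({"sync_token": token}))


def request_params(state_file: Path) -> Dict[str, str]:
    """Query parameters for the next run: empty on a full sync, token on incremental."""
    token = load_sync_token(state_file)
    return {"syncToken": token} if token else {}  # parameter name is an assumption


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "state.json"
    first_run = request_params(state_file)   # no saved state yet
    save_sync_token(state_file, "opaque-token-from-last-response")
    second_run = request_params(state_file)  # token is sent back to the API

print(first_run)   # {}
print(second_run)  # {'syncToken': 'opaque-token-from-last-response'}
```

The token is treated as an opaque string throughout, since it is a hash rather than a timestamp and cannot be compared or incremented.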

slack-user-airbyte · December 13, 2024, 8:22am

Well you’re doing great for 3 hours! I don’t think there’s a way yet in Builder/Low-Code that allows you to inject custom state values (it might be possible through a custom component, but there are limits on that), but it can be done in the CDK.

I do know other APIs that have historically worked this way, so there’s definitely a use case there. I would think some way of storing custom state records would be the most flexible way to handle this (since some APIs may combine it with date-based cursors as well).

<@U069EMNRPA4> will probably sanity check me in case I’m lying to you.

slack-user-airbyte · December 13, 2024, 8:22am

Thanks Justin — appreciate the help!

If it helps, <@U069EMNRPA4>, here’s the specific spot (https://arc.net/l/quote/jhytxkdn) in the Coda API describing how they provide the non-timestamp sync token for incremental updates.

slack-user-airbyte · December 13, 2024, 8:22am

Hmmmm, I have definitely seen the framework navigate to the next page link (i.e., replace the full URL). Suspicious.

slack-user-airbyte · December 13, 2024, 8:23am

<@U069EMNRPA4> what’s the correct course of action here? Should I open an issue on GitHub? Or just let this evaporate into the ether?
