Acceptance test is failing on a schema with a "nested" cursor

cornjuliox · April 26, 2022, 7:56am

Given a schema like so:

{
    "$schema": "http://json-schema.org/draft-07/schema",
    "name": "events",
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "icon": {"type": "string"},
        "file": {"type": "string"},
        "item": {"type": "object"},
        "location": {"type": "string"},
        "created_at": {
            "type": "object",
            "properties": {
                "datetime": {"type": "string"},
                "formatted": {"type": "string"}
            }
        },
        "updated_at": {
            "type": "object",
            "properties": {
                "datetime": {"type": "string"},
                "formatted": {"type": "string"}
            }
        },
        "next_audit_date": {"type": "object"},
        "days_to_next_audit": {"type": "integer"},
        "action_type": {"type": "string"},
        "admin": {"type": "object"}
    }
}

and a cursor field of “updated_at/datetime”, the test_defined_cursors_exist_in_schema() test in the acceptance test suite is failing and doesn’t look like it would ever succeed in my case:

    def test_defined_cursors_exist_in_schema(self, connector_config, discovered_catalog):
        """
        Check if all of the source defined cursor fields are exists on stream's json schema.
        """
        for stream_name, stream in discovered_catalog.items():
            if stream.default_cursor_field:
                schema = stream.json_schema
                assert "properties" in schema, "Top level item should have an 'object' type for {stream_name} stream schema"
                properties = schema["properties"]    # <--right here
                cursor_path = "/properties/".join(stream.default_cursor_field)
                assert dpath.util.search(
                    properties, cursor_path
                ), f"Some of defined cursor fields {stream.default_cursor_field} are not specified in discover schema properties for {stream_name} stream"

As it’s written, the test extracts only the top-level “properties” key. The “properties” keys in nested schemas remain as-is, so the subsequent dpath.util.search() call fails because the path is missing the second “properties” segment.
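
To make the missing segment concrete, here is a minimal sketch that uses only str.join, the same call the test uses to build cursor_path. The two-element list is an assumption about what a list-style cursor field would look like; it is not taken from the connector in question.

```python
# The cursor field as configured in this post: one string containing a slash.
slash_style = ["updated_at/datetime"]
# Assumed list style: one element per nesting level.
list_style = ["updated_at", "datetime"]

# str.join only inserts the separator between elements, so a one-element
# list comes back unchanged and no "properties" segment is added.
slash_path = "/properties/".join(slash_style)
list_path = "/properties/".join(list_style)

print(slash_path)  # updated_at/datetime
print(list_path)   # updated_at/properties/datetime
```

Only the second path contains the nested “properties” segment that the schema actually has.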

Check out the following pdb postmortem:

(Pdb++) stream.default_cursor_field
['updated_at/datetime']
(Pdb++) cursor_path 
'updated_at/datetime'
(Pdb++) dpath.util.search(properties, "/updated_at/datetime")
{}
(Pdb++) properties
{'id': {'type': 'integer'}, 'icon': {'type': 'string'}, 'file': {'type': 'string'}, 'item': {'type': 'object'}, 'location': {'type': 'string'}, 'created_at': {'type': 'object'}, 'updated_at': {'type': 'object', 'properties': {'datetime': {'type': 'string'}, 'formatted': {'type': 'string'}}}, 'next_audit_date': {'type': 'object'}, 'days_to_next_audit': {'type': 'integer'}, 'action_type': {'type': 'string'}, 'admin': {'type': 'object'}}
(Pdb++) dpath.util.search(properties,"/updated_at/properties/datetime")
{'updated_at': {'properties': {'datetime': {'type': 'string'}}}}
(Pdb++)
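
For comparison, a check that tolerates both cursor styles could look roughly like the sketch below. cursor_in_schema is a hypothetical helper written for this post, not part of the Airbyte acceptance test suite, and it uses plain dict lookups instead of dpath.

```python
from typing import Any, Dict, List


def cursor_in_schema(schema: Dict[str, Any], cursor_field: List[str]) -> bool:
    """Return True if the cursor path resolves to a field in a JSON schema.

    Hypothetical helper: accepts ["a", "b"] as well as ["a/b"] and steps
    through the "properties" key at every nesting level.
    """
    # Normalise both styles into a flat list of field names.
    segments = [part for item in cursor_field for part in item.split("/") if part]
    if not segments:
        return False
    node: Any = schema
    for name in segments:
        properties = node.get("properties") if isinstance(node, dict) else None
        if not isinstance(properties, dict) or name not in properties:
            return False
        node = properties[name]
    return True


# Trimmed-down version of the schema from this post.
schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "updated_at": {
            "type": "object",
            "properties": {
                "datetime": {"type": "string"},
                "formatted": {"type": "string"},
            },
        },
    },
}

print(cursor_in_schema(schema, ["updated_at/datetime"]))     # True
print(cursor_in_schema(schema, ["updated_at", "datetime"]))  # True
print(cursor_in_schema(schema, ["updated_at/missing"]))      # False
```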

Assuming that the test is correct, how exactly are we supposed to specify the cursor field in an incremental stream?

marcosmarxm · April 26, 2022, 7:30pm

Airbyte doesn’t support nested cursor fields for incremental syncs; see issue.

For nested cursor fields, did you read Airbyte’s docs about cursors?
See the implementation for the Jira issues stream or some TikTok streams.

I think the best way would be to transform the record so the nested field is extracted to a higher level.
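
A minimal sketch of that kind of transformation, assuming records shaped like the schema in the first post. The function name lift_cursor and the new top-level key updated_at_datetime are made up for illustration; they are not Airbyte APIs.

```python
from typing import Any, Dict


def lift_cursor(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the record with the nested cursor copied to the top level."""
    flattened = dict(record)  # shallow copy; the input record is left untouched
    nested = record.get("updated_at")
    # Missing or malformed "updated_at" values yield None rather than raising.
    value = nested.get("datetime") if isinstance(nested, dict) else None
    flattened["updated_at_datetime"] = value
    return flattened


# Example record with made-up values.
record = {
    "id": 1,
    "updated_at": {"datetime": "2022-04-26 07:56:00", "formatted": "Apr 26, 2022"},
}
flat = lift_cursor(record)
print(flat["updated_at_datetime"])  # 2022-04-26 07:56:00
```

With something like this in place, the stream schema would presumably also need a top-level updated_at_datetime property, and the cursor field would point at that key instead of the nested one.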

cornjuliox · April 27, 2022, 4:02am

For nested cursor fields, did you read Airbyte’s docs about cursors?

I did see that, yeah, but the list style wasn’t working for me at the time. When I asked about it in the Airbyte Slack, someone suggested the format I used above, and it’s been working ever since.

marcosmarxm · April 27, 2022, 10:38pm

So the connector is working and only the acceptance tests are failing?

cornjuliox · April 28, 2022, 3:53am

Yes, that’s right. The connector’s been working fine but the acceptance tests are not passing.

marcosmarxm · May 2, 2022, 2:25pm

I created this issue on GitHub: https://github.com/airbytehq/airbyte/issues/12510

