Handling date formats in connector for API with CDK

Summary

The user is facing issues with the datetime_format and cursor_datetime_format attributes in the CDK connector for handling date formats in requests and responses.


Question

hi! I’m using the low code CDK to build a connector to an API that requires a date in the request in a different format than it gives back in the body which is annoying for the incremental cursor. I read the docs and see that there are two format attributes datetime_format and cursor_datetime_format that should in theory let me switch between the request and response formats but they aren’t behaving as I would expect based on docs



This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. Click here if you want to access the original thread.

Join the conversation on Slack

["connector", "api", "cdk", "date-format", "datetime-format", "cursor-datetime-format"]
    type: DatetimeBasedCursor
    start_datetime:
      datetime: "{{ config['priceDate'] }}"
      datetime_format: "%Y-%m-%d"
    end_datetime:
      datetime: "{{ day_delta(-1, format='%Y-%m-%d') }}"
      datetime_format: "%Y-%m-%d"
    step: "P1D"
    cursor_datetime_format: 
      - "%ms"
    datetime_format: "%Y-%m-%d"
    cursor_granularity: "PT0.001S"
    cursor_field: "priceDate"```

Yes, you’re correct. The low-code CDK allows you to define different datetime formats for the request and response. This is particularly useful when APIs require a different format for the request parameters and the records in the responses. Here’s an example of how you can define a second optional format in your configuration: yaml incremental_sync: cursor_field: "startDate" datetime_format: "%Y-%m-%d" cursor_granularity: "P1D" cursor_datetime_format: "%s" In this example, datetime_format is used for the response records, and cursor_datetime_format is used for the request parameters. If you’re still having issues, it might be due to the way the DatetimeBasedCursor compares dates. It was noted in a Github issue that the DatetimeBasedCursor was initially comparing dates as strings, which could lead to incorrect comparisons. However, this issue has been addressed and now the cursor’s date is parsed into a datetime object for comparisons. If you’re dealing with UNIX timestamps, you can use the %s format as shown in this forum post: yaml incremental_sync: type: DatetimeBasedCursor cursor_field: "created" datetime_format: "%s" cursor_granularity: "PT1S" start_datetime: datetime: "{{ config['start_date'] }}" datetime_format: "%Y-%m-%d %H:%M:%S" end_datetime: datetime: "{{ today_utc() }}" datetime_format: "%Y-%m-%d" step: "P1W" start_time_option: field_name: "created_after" inject_into: "request_parameter" end_time_option: field_name: "created_before" inject_into: "request_parameter" If you’re still facing issues, please provide more details about the problem you’re experiencing.

a sample response value for priceDate (my cursor field) is 1704758400000 and the request should be like 2024-01-09

the parser throws an error that implies to me that at some point before the slice is closed, the 1704758400000 is being parsed with format %Y-%m-%d which strptime obvs can’t do

I am pretty sure I don’t have the format params backwards and yet I get No format in ['%Y-%m-%d'] matching 1704758400000 from the parser

It sounds like you have configured it correctly based on your description, so it is surprising to hear that it is not using the cursor_datetime_format to parse the cursor field in the response.

I’ll need some more details to investigate this:

  1. Are you on Airbyte Cloud or open source? If on open source, please ensure you have updated to the latest version of Airbyte
  2. Which API are you attempting to integrate with? Could you link the API documentation if there is any?
  3. Are you willing to share your YAML manifest?

thanks! 1. open source v 40.7 2. this API which is private and lies in their docs about their date formats anyway :stuck_out_tongue: https://bluehost-sites.s3-eu-west-1.amazonaws.com/Price_Data_API_v2.pdf

whole entire manifest:


definitions:
  requester:
    type: HttpRequester
    url_base: "<https://api.liv-ex.com/>"
    http_method: "POST"
    authenticator:
      type: ApiKeyAuthenticator
      header: "CLIENT_KEY"
      api_token: "{{ config['clientKey'] }}"
    request_headers:
        CLIENT_SECRET: "{{ config['clientSecret'] }}"
    request_body_json:
        currency: "{{ config['currency'] }}"
        priceType: "{{ config['priceType'] }}"
        priceDate: "{{ stream_interval.start_time }}"
        lwin: 
          - "{{ stream_partition.lwin }}"
  retriever:
    type: SimpleRetriever
    record_selector:
      $ref: "#/definitions/selector"
    paginator:
      type: NoPagination
    requester:
      $ref: "#/definitions/requester"
    partition_router:
      $ref: "#/definitions/partition_router"
  partition_router:
    type: ListPartitionRouter
    values: "{{ config['lwin'] }}"
    cursor_field: "lwin"
  selector:
    type: RecordSelector
    extractor:
      type: DpathExtractor
      field_path: ["lwinDetail", "*", "dataDetail", "*"]
  datetime_cursor:
    type: DatetimeBasedCursor
    start_datetime:
      datetime: "{{ config['priceDate'] }}"
      datetime_format: "%Y-%m-%d"
    end_datetime:
      datetime: "{{ day_delta(-1, format='%Y-%m-%d') }}"
      datetime_format: "%Y-%m-%d"
    step: "P1D"
    cursor_datetime_format: "%ms"
    datetime_format: "%Y-%m-%d"
    cursor_granularity: "PT0.001S"
    cursor_field: "priceDate"
  base_stream:
    type: DeclarativeStream
    incremental_sync:
      $ref: "#/definitions/datetime_cursor"
    retriever:
      $ref: "#/definitions/retriever"
  price_stream:
    $ref: "#/definitions/base_stream"
    name: "prices"
    primary_key: 
      - "priceDate"
      - "lwin"
    $parameters:
      path: "/data/v2/priceData"
    transformations:
      - type: AddFields
        fields:
          - path: ["lwin"]
            value: "{{ stream_partition.lwin }}"

streams:
  - "#/definitions/price_stream"

check:
  type: CheckStream
  stream_names:
    - "prices"

spec:
  type: Spec
  documentation_url: <https://bluehost-sites.s3-eu-west-1.amazonaws.com/Price_Data_API_v2.pdf>
  connection_specification:
    title: LivEx Price API v2 Spec
    type: object
    required:
      - clientKey
      - clientSecret
      - lwin
      - currency
      - priceDate
      - priceType
    additionalProperties: true
    properties:
      clientKey:
        type: string
        description: access key
        airbyte_secret: true       
      clientSecret:
        type: string
        description: password
        airbyte_secret: true
      lwin:
        type: array
        description: list of LWIN11s
        items:
          type: string
      currency:
        type: string
        description: USD, GBP, EUR, etc.
      priceDate:
        type: string
        description: Y-m-d ISO format eg 2023-01-01
      priceType:
        type: array
        description: list of letter codes, A is market price
        items:
          type: string```

btw this is failing in local dev with just running main.py of my connector with the most current version of Airbyte CDK

<@U06E43U5NUW> it looks like your manifest specifies cursor_datetime_format, not cursor_datetime_formats (notice the plural)

Try changing that field on line 48 to cursor_datetime_formats and set the value to ["%ms"] instead of "%ms" - this should fix the problem

:woman-facepalming: thank you!

yep flawless response and date handling now! I can read I swear

If you’re interested, the Connector Builder is a UI on top of the YAML which can help avoid small typos like this by generating the YAML for you, and just requiring you to enter in the values