I started working on a new connector, source-dynamodb (see PR #14555).
My plan for the new connector is to add each listed table as a stream (the tables listed could be narrowed with a prefix query or directly with the IAM permissions provided).
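A minimal sketch of the table-per-stream discovery, assuming `client` behaves like a boto3 DynamoDB client (`boto3.client("dynamodb")`). The function name `list_stream_names` is hypothetical. Since `ListTables` has no server-side prefix filter, the prefix is applied on the client side here:

```python
from typing import Any, Iterator


def list_stream_names(client: Any, prefix: str = "") -> Iterator[str]:
    """Yield table names visible to `client`, optionally filtered by prefix.

    Assumption: `client` exposes list_tables() like a boto3 DynamoDB client.
    Tables hidden by IAM permissions simply never show up in the listing
    or fail later on read, so no extra handling is done for them here.
    """
    kwargs: dict = {}
    while True:
        page = client.list_tables(**kwargs)
        for name in page.get("TableNames", []):
            if name.startswith(prefix):
                yield name
        # ListTables is paginated; continue from the last evaluated name.
        last = page.get("LastEvaluatedTableName")
        if not last:
            return
        kwargs["ExclusiveStartTableName"] = last
```

Each yielded name would then map to one stream in the connector's catalog.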
My doubt is as follows. NoSQL databases don’t enforce any schema besides the primary/partition key and sort key. Is there a flexible json_schema I can use, or should I build the schema dynamically from a sample of retrieved records?
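To make the second option concrete, here is a rough sketch of inferring a schema from sampled items. It assumes the items are already deserialized into plain Python values (for example, numbers as `Decimal`), and the names `json_type` and `infer_schema` are hypothetical. Every property is made nullable because any non-key attribute may be missing from other items:

```python
from decimal import Decimal
from typing import Any, Dict, Iterable


def json_type(value: Any) -> str:
    """Map a deserialized Python value to a JSON Schema type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float, Decimal)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, (list, tuple, set, frozenset)):
        return "array"
    return "string"  # assumption: anything else (e.g. binary) is emitted as a string


def infer_schema(items: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a JSON schema from the union of attributes seen in the sample."""
    seen: Dict[str, set] = {}
    for item in items:
        for name, value in item.items():
            seen.setdefault(name, set()).add(json_type(value))
    properties = {
        name: {"type": ["null"] + sorted(types - {"null"})}
        for name, types in sorted(seen.items())
    }
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": properties,
        # Attributes that did not appear in the sample are still allowed.
        "additionalProperties": True,
    }
```

The obvious trade-off is that the result depends on which items happen to be sampled, so attributes that appear only in unsampled items will not be listed explicitly.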
Useful links:
- Azure Table connector code base: airbyte/reader.py at master · airbytehq/airbyte · GitHub
    class Reader:
        ...
        @property
        def get_typed_schema(self) -> object:
            """Static schema for tables"""
            return {
                "$schema": "http://json-schema.org/draft-07/schema#",
                "type": "object",
                "properties": {"data": {"type": "object"}, "additionalProperties": {"type": "boolean"}},
            }
- Neither Azure Table nor AWS DynamoDB can return a schema containing all fields: node.js - How to get all fields present in aws-dynamoDB? - Stack Overflow
DynamoDB is a NoSQL database. While creating the table, it doesn't expect all the attributes to be defined in the table. Also, each item can have any number of non-key attributes. Only key attributes are common or mandatory across all the items in the table.
For the above-mentioned reasons, the described table can't provide all the attributes present in the table.
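Since only the key attributes are guaranteed, one possible middle ground is a schema that types just the keys and leaves everything else open. The sketch below assumes its input has the shape of a `DescribeTable` response (`Table.KeySchema` and `Table.AttributeDefinitions`). The function name `key_only_schema` and the choice to emit binary keys as strings are assumptions, not something the connector already does:

```python
from typing import Any, Dict

# DynamoDB key attributes can only be String (S), Number (N) or Binary (B).
# Assumption: binary keys are emitted as (e.g. base64) strings.
_DYNAMO_TO_JSON = {"S": "string", "N": "number", "B": "string"}


def key_only_schema(table_description: Dict[str, Any]) -> Dict[str, Any]:
    """Build a schema that types only the partition/sort keys of a table."""
    table = table_description["Table"]
    attr_types = {
        d["AttributeName"]: d["AttributeType"]
        for d in table["AttributeDefinitions"]
    }
    # KeySchema lists the table's own keys; AttributeDefinitions may also
    # contain index keys, which are not required on every item.
    key_names = [k["AttributeName"] for k in table["KeySchema"]]
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            name: {"type": _DYNAMO_TO_JSON[attr_types[name]]} for name in key_names
        },
        "required": key_names,
        "additionalProperties": True,
    }
```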