Disabling Schema Inferrer in No-Code Connector Builder

Summary

A `KeyError: 'type'` is raised during schema inference in certain cases when testing an endpoint in the no-code connector builder. The error disappears when the number of records returned is limited.


Question

Is there a way to disable the schema inferrer when testing an endpoint on the no-code connector builder?

I’m getting this error when making a query, and it disappears if I limit the number of records being returned, so it sounds like there’s some internal Airbyte bug when attempting to infer the schema in certain cases:

```
  File "/home/airbyte/.pyenv/versions/3.9.11/lib/python3.9/site-packages/airbyte_cdk/connector_builder/connector_builder_handler.py", line 62, in read_stream
    stream_read = handler.get_message_groups(source, config, configured_catalog, limits.max_records)
  File "/home/airbyte/.pyenv/versions/3.9.11/lib/python3.9/site-packages/airbyte_cdk/connector_builder/message_grouper.py", line 122, in get_message_groups
    schema = schema_inferrer.get_stream_schema(configured_stream.stream.name)
  File "/home/airbyte/.pyenv/versions/3.9.11/lib/python3.9/site-packages/airbyte_cdk/utils/schema_inferrer.py", line 209, in get_stream_schema
    self._add_required_properties(self._clean(self.stream_to_builder[stream_name].to_schema()))
  File "/home/airbyte/.pyenv/versions/3.9.11/lib/python3.9/site-packages/airbyte_cdk/utils/schema_inferrer.py", line 107, in _clean
    self._clean(value)
  File "/home/airbyte/.pyenv/versions/3.9.11/lib/python3.9/site-packages/airbyte_cdk/utils/schema_inferrer.py", line 107, in _clean
    self._clean(value)
  File "/home/airbyte/.pyenv/versions/3.9.11/lib/python3.9/site-packages/airbyte_cdk/utils/schema_inferrer.py", line 107, in _clean
    self._clean(value)
  File "/home/airbyte/.pyenv/versions/3.9.11/lib/python3.9/site-packages/airbyte_cdk/utils/schema_inferrer.py", line 109, in _clean
    self._clean(node["items"])
  File "/home/airbyte/.pyenv/versions/3.9.11/lib/python3.9/site-packages/airbyte_cdk/utils/schema_inferrer.py", line 107, in _clean
    self._clean(value)
  File "/home/airbyte/.pyenv/versions/3.9.11/lib/python3.9/site-packages/airbyte_cdk/utils/schema_inferrer.py", line 112, in _clean
    if isinstance(node["type"], list):
KeyError: 'type'
```
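
The trace ends in a recursive walk over the inferred schema that looks up `node["type"]` on a nested node with no such key. As a rough sketch of that failure mode (this is not the CDK's code; `find_untyped_nodes` and the sample schema are made up for illustration, and the empty item schema is only an assumed trigger), a small walker can list the schema paths that would trip such a lookup:

```python
from typing import Any, List


def find_untyped_nodes(node: Any, path: str = "$") -> List[str]:
    """Return paths of schema nodes that have neither "type" nor "anyOf".

    Hypothetical helper for illustration. It only descends into
    "properties" and "items", the two recursion steps visible in the trace.
    """
    if not isinstance(node, dict):
        return []
    found: List[str] = []
    # Assumption: a node carrying "anyOf" may legitimately omit "type".
    if "type" not in node and "anyOf" not in node:
        found.append(path)
    props = node.get("properties")
    if isinstance(props, dict):
        for key, value in props.items():
            found.extend(find_untyped_nodes(value, f"{path}.properties.{key}"))
    if "items" in node:
        found.extend(find_untyped_nodes(node["items"], f"{path}.items"))
    return found


# Made-up schema: "tags" is an array whose item schema is empty.
sample_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "tags": {"type": "array", "items": {}},
    },
}

untyped = find_untyped_nodes(sample_schema)
print(untyped)
```

Running a walker like this over the schema produced for a failing response, if you can capture it, may help narrow down which field to include in a bug report.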


---

This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. [Click here](https://airbytehq.slack.com/archives/C027KKE4BCZ/p1712737744640839) if you want to access the original thread.


No, you can’t disable it, but it’s one of two things: either it’s a bug in the CDK and we should fix it, or there’s something seriously weird in one of your records, just not the first one. That is, some records pass, and then a later one breaks.
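
If the raw records are available outside the builder, one way to find the record that breaks is to grow the batch one record at a time until inference fails. The sketch below is only an illustration: `first_breaking_index` is a hypothetical helper, and `toy_infer` is a made-up stand-in for whatever actually runs schema inference on your side.

```python
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]


def first_breaking_index(
    records: List[Record],
    infer: Callable[[List[Record]], Any],
) -> Optional[int]:
    """Return the index of the first record whose inclusion makes `infer` raise.

    `infer` is a placeholder for real schema inference; it must raise an
    exception on failure. Returns None if every prefix succeeds.
    """
    for end in range(1, len(records) + 1):
        try:
            infer(records[:end])
        except Exception:
            return end - 1
    return None


def toy_infer(batch: List[Record]) -> None:
    # Made-up failure condition, purely for the demo: an empty object
    # inside a "tags" list is treated as the record that breaks inference.
    for record in batch:
        if {} in record.get("tags", []):
            raise KeyError("type")


records = [
    {"id": 1, "tags": ["a"]},
    {"id": 2, "tags": ["b"]},
    {"id": 3, "tags": [{}]},
]

breaking = first_breaking_index(records, toy_infer)
print(breaking)
```

This re-runs inference on every prefix, so it is quadratic in the number of records; for a test read capped at a small record limit that is usually acceptable. The record at the returned index is a good candidate for the snippet to attach to a bug report.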

Can you file an issue in https://github.com/airbytehq/airbyte, mark it with team/extensibility, add this stack trace, and also provide a snippet or HTTP response that triggers this error?

And link to it here, and I’ll take a look.