From a user experience perspective, are there any guidelines for schema discovery time? I'm writing a source connector for a source that has approximately 500 tables. Currently it takes about 3.5 to 4 minutes to discover all the schemas. I don't have much experience using Airbyte with other connectors that dynamically discover schemas for a large number of tables, so I can't tell whether this duration is acceptable or abnormal. Any input is appreciated. Thank you.
We want schema discovery to be as fast as possible, so if you have ways of optimizing discovery, that would be very much appreciated.
For example, if your source allows it, one optimization is to retrieve only the tables on which the user has read permissions. You could also let users specify in advance the set of schemas/tables they want to replicate, so that discovery only runs on those.
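To make those two optimizations concrete, here is a minimal sketch in plain Python. It is not Airbyte CDK code: `select_tables`, the `readable` set, and the `include_patterns` option are hypothetical names, and it assumes your source can cheaply list table names and tell you which ones the user can read. The idea is simply to shrink the list of tables before running the expensive per-table schema lookup.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional, Set


def select_tables(
    all_tables: Iterable[str],
    readable: Optional[Set[str]] = None,
    include_patterns: Optional[List[str]] = None,
) -> List[str]:
    """Return the tables worth running schema discovery on.

    readable: tables the user has read permission on (None = unknown, skip this check).
    include_patterns: user-supplied glob patterns such as "sales.*"
        (None or empty = no restriction). Hypothetical connector config option.
    """
    selected = []
    for name in all_tables:
        # Skip tables the user cannot read anyway.
        if readable is not None and name not in readable:
            continue
        # Skip tables outside the set the user asked to replicate.
        if include_patterns and not any(fnmatchcase(name, p) for p in include_patterns):
            continue
        selected.append(name)
    return selected


tables = select_tables(
    ["sales.orders", "hr.salaries", "sales.customers"],
    readable={"sales.orders", "hr.salaries"},
    include_patterns=["sales.*"],
)
# Only the tables in `tables` would then go through the per-table schema lookup.
```

With 500 tables, whatever you can filter out at this stage is time you don't spend on per-table schema queries.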
The discover schema call from the web app is synchronous, so if Airbyte users run a custom reverse proxy in front of the Airbyte API, they might hit timeout errors if your schema discovery takes too long. In general, though, long schema discoveries are not a technical problem; the main issue is poor UX.
Let me know if this helps.