Passing an array for one of the fields in the spec.yaml file

I have a requirement where I need to pull data for multiple forms from the API. Right now we have a custom source connector that pulls schemaless data, transforms it into JSON, and dumps it into Postgres. My spec file looks like this:

documentationUrl: https://docs.airbyte.io/integrations/sources/surveycto
connectionSpecification:
  $schema: http://json-schema.org/draft-07/schema#
  title: Surveycto Spec
  type: object
  required:
    - server_name
    - username
    - password
    - form_id
  properties:
    server_name:
      type: string
      title: Server Name
      description: The name of the SurveyCTO server
      order: 0
    username:
      type: string
      title: Username
      order: 1
    password:
      type: string
      title: Password
      airbyte_secret: true
      order: 2
    form_id:
      type:  string
      title: Form ID
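      order: 3

import base64
from abc import ABC
from typing import Any, Iterable, Mapping, MutableMapping, Optional

import requests
from airbyte_cdk.sources.streams.http import HttpStream


class SoStream(HttpStream, ABC):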
    primary_key = None

    def __init__(self, config: Mapping[str, Any], **kwargs):
        super().__init__()
        self.server_name = config['server_name']
        self.form_id = config['form_id']
        #base64 encode username and password as auth token
        user_name_password = f"{config['username']}:{config['password']}"
        self.auth_token = self._base64_encode(user_name_password)
        
        
    def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
        return None

    def _base64_encode(self,string:str) -> str:
        return base64.b64encode(string.encode("ascii")).decode("ascii")

    @property
    def url_base(self) -> str:
        return f"https://{self.server_name}.surveycto.com/api/v2/forms/data/wide/json/"

    def request_params(
        self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
    ) -> MutableMapping[str, Any]:
        return {'date': 0}

    def request_headers(
        self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None
    ) -> Mapping[str, Any]:
        return {'Authorization': 'Basic ' + self.auth_token }

    def parse_response(
        self,
        response: requests.Response,
        stream_state: Mapping[str, Any],
        stream_slice: Mapping[str, Any] = None,
        next_page_token: Mapping[str, Any] = None,
    ) -> Iterable[Mapping]:
        response_json = response.json()

        for data in response_json:
            try:
                yield data
            except Exception as e:
                msg = f"""Encountered an exception parsing schema"""
                self.logger.exception(msg)
                raise e
  
    def path(
        self, 
        stream_state: Mapping[str, Any] = None, 
        stream_slice: Mapping[str, Any] = None, 
        next_page_token: Mapping[str, Any] = None
    ) -> str:
        return self.form_id
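
The `Authorization` header that `__init__` and `request_headers` build above can be checked on its own. The sketch below is a standalone version of that logic. The function name `basic_auth_header` and the sample credentials are made up for illustration, and it encodes the credentials as UTF-8 rather than ASCII, which is an assumption: ASCII encoding raises `UnicodeEncodeError` if the password contains a non-ASCII character.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header (hypothetical helper)."""
    raw = f"{username}:{password}"
    # UTF-8 is assumed here; the connector code above uses ASCII.
    token = base64.b64encode(raw.encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Sample credentials, for illustration only.
header = basic_auth_header("user", "pass")
print(header)
```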

Right now it pulls data for a single form_id, which is a string. I need to pass multiple form IDs in the spec file, so form_id could be an array. But when I change it to an array and run the code again, it gives me this error:

Config validation error: 'test_data' is not of type 'array'

How can I pass multiple form IDs here, and what changes should I make so that I can pull data for each form_id?

Hey @siddhant3030! Could you give me a bit more info: what is ‘test_data’? It’s in a separate file, right? Could you provide that here as well? I think the bug might be there.

test_data is the form_id I pass when calling this API:
"https://{self.server_name}.surveycto.com/api/v2/forms/data/wide/json/"

This gives me a JSON response which is a list. To make that call, I was passing the form_id as a string:


    form_id:
      type: string
      title: Form ID
      order: 3

It pulls data when form_id is a string.

Now I want to pass multiple form IDs as an array, so that the connector calls the API once per form_id and returns the list of responses:

["test_data", "test_data1"]

Still looking into this, but I’ve found a thread that might be helpful to you where a user was trying to do a similar thing:
https://discuss.airbyte.io/t/using-an-array-instead-of-a-string-in-the-config/586

The solution implements stream slices:
https://docs.airbyte.com/connector-development/cdk-python/stream-slices/
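
A minimal sketch of the stream-slices idea, assuming `form_id` arrives in the config as a list of strings. In the CDK, `stream_slices` is a method on the stream, and each slice it yields is handed back to `path` (and the other request methods) as `stream_slice`. The class below is a standalone stand-in, not the connector's actual code: the name `FormSlicesSketch` is hypothetical, it does not import `airbyte_cdk`, and it uses `**kwargs` in place of the full method signatures.

```python
from typing import Any, Iterable, Mapping, Optional


class FormSlicesSketch:
    """Standalone stand-in for the stream class; it does not import the CDK."""

    def __init__(self, config: Mapping[str, Any]):
        # Assumes form_id is now a list of strings in the config.
        self.form_ids = list(config["form_id"])

    def stream_slices(self, **kwargs: Any) -> Iterable[Optional[Mapping[str, Any]]]:
        # One slice per form ID, so each form gets its own request.
        for form_id in self.form_ids:
            yield {"form_id": form_id}

    def path(self, stream_slice: Optional[Mapping[str, Any]] = None, **kwargs: Any) -> str:
        # The slice decides which form this request targets.
        return stream_slice["form_id"]


stream = FormSlicesSketch({"form_id": ["test_data", "test_data1"]})
paths = [stream.path(stream_slice=s) for s in stream.stream_slices()]
print(paths)
```

In the real connector the same two methods would be overridden on the `HttpStream` subclass, with `path` returning the slice's form ID instead of `self.form_id`.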

I did implement this, but somehow I’m still getting the same error:

2022-10-06 19:31:36 ERROR i.a.w.i.DefaultAirbyteStreamFactory(internalLog):99 - Config validation error: 'test_data' is not of type 'array'
Traceback (most recent call last):
  File "/airbyte/integration_code/main.py", line 13, in <module>
    launch(source, sys.argv[1:])
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/entrypoint.py", line 123, in launch
    for message in source_entrypoint.run(parsed_args):
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/entrypoint.py", line 96, in run
    check_config_against_spec_or_exit(connector_config, source_spec)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/utils/schema_helpers.py", line 160, in check_config_against_spec_or_exit
    raise Exception("Config validation error: " + validation_error.message) from None
Exception: Config validation error: 'test_data' is not of type 'array'
2022-10-06 19:31:36 ERROR i.a.w.g.DefaultCheckConnectionWorker(run):96 - Error checking connection, status: Optional.empty, exit code: 1

I don’t know why it’s not able to take the array type for the field. Is something wrong with the YAML file?

    form_id:
      type:  array
      title: Form ID
      description: Unique identifier for one of your forms
      order: 3
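
The traceback shows the error being raised in `check_config_against_spec_or_exit`, which validates the config against the spec. Read that way, `'test_data'` is the value found under `form_id` in the config, so one possible cause is that the spec was changed to `array` while the config being tested still holds a plain string. The sketch below is a simplified stand-in for that type check using only the standard library; it is not the CDK's or jsonschema's code. `matches_type`, the `items` entry, and both sample configs are hypothetical.

```python
import json
from typing import Any

# Spec fragment as a Python dict; `items` is an assumed addition, not in the spec above.
form_id_spec = {"type": "array", "items": {"type": "string"}}


def matches_type(value: Any, spec: dict) -> bool:
    """Simplified check covering only 'string' and 'array' types."""
    if spec["type"] == "string":
        return isinstance(value, str)
    if spec["type"] == "array":
        item_spec = spec.get("items")
        return isinstance(value, list) and all(
            item_spec is None or matches_type(v, item_spec) for v in value
        )
    return False


# A config written for the old string spec, and one written for the array spec.
old_config = json.loads('{"form_id": "test_data"}')
new_config = json.loads('{"form_id": ["test_data", "test_data1"]}')

old_ok = matches_type(old_config["form_id"], form_id_spec)
new_ok = matches_type(new_config["form_id"], form_id_spec)
print(old_ok, new_ok)
```

If this is the cause, the YAML is fine and it is the config value that needs to become a list.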

Could you please share the complete code for the connector? That way I could test it out locally!