Unable to fetch schema from an Excel file on an SFTP server

Hi,
I am trying to set up a connection using the following:
Source: File (Storage Provider - SFTP)
Destination: Redshift

My source and destination are configured correctly, but when I try to create the connection, it fails to fetch the schema for the specified Excel file.

Here is my source configuration:

Here is my SFTP server, which contains the same file:

Here is the sample file that I used:
housing.xls.txt (22.5 KB)
P.S. Please remove the .txt extension from the file above before using it. I added it only so that the file could be uploaded to this topic.

Hey, could you share the response for that API call?

You can go to the Network tab in your browser's developer tools and select the failed response for the discover API call.
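
If the copied response is long, a small script can pull out just the error lines. This is a minimal sketch, assuming the response is JSON with a `jobInfo.logs.logLines` array and ANSI colour codes in the log text. The function name `error_lines` and the sample payload are made up for illustration and are not part of Airbyte.

```python
import json
import re

# Matches ANSI colour sequences such as ESC[32m, ESC[1;31m and ESC[m.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def error_lines(response_text):
    """Return log lines that mention an error, with colour codes stripped."""
    payload = json.loads(response_text)
    # Each level may be missing or null, so fall back to an empty container.
    job_info = payload.get("jobInfo") or {}
    logs = job_info.get("logs") or {}
    lines = logs.get("logLines") or []
    cleaned = [ANSI_ESCAPE.sub("", line) for line in lines]
    return [line for line in cleaned if "ERROR" in line or "Error" in line]


# Hypothetical, heavily shortened payload (not a real server response).
sample = json.dumps({
    "catalog": None,
    "jobInfo": {"succeeded": False, "logs": {"logLines": [
        "2022-07-13 04:36:56 \u001b[32mINFO\u001b[m worker started",
        "2022-07-13 04:36:59 \u001b[1;31mERROR\u001b[m FileNotFoundError(2, 'No such file or directory')",
    ]}},
})

for line in error_lines(sample):
    print(line)
```

To use it on a real response, paste the body copied from the Network tab into a file and pass the file's contents to `error_lines`.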

I have the same issue. It appears to be a FileNotFoundError. The response from the Network tab is below.

{"catalog":null,"jobInfo":{"id":"326c46fb-97aa-4bb3-95e6-3362c0e55780","configType":"discover_schema","configId":"Optional[778daa7c-feaf-4db6-96f3-70fd645acc77]","createdAt":1657687016331,"endedAt":1657687019482,"succeeded":false,"logs":{"logLines":["2022-07-13 04:36:56 \u001B[32mINFO\u001B[m i.a.w.t.TemporalAttemptExecution(get):108 - Docker volume job log path: /tmp/workspace/2ce30a0d-4018-44d7-b9dd-74535c746a94/0/logs.log","2022-07-13 04:36:56 \u001B[32mINFO\u001B[m i.a.w.t.TemporalAttemptExecution(get):113 - Executing worker wrapper. Airbyte version: 0.38.2-alpha","2022-07-13 04:36:56 \u001B[32mINFO\u001B[m i.a.c.i.LineGobbler(voidCall):82 - Checking if airbyte/source-file:0.2.10 exists...","2022-07-13 04:36:56 \u001B[32mINFO\u001B[m i.a.c.i.LineGobbler(voidCall):82 - airbyte/source-file:0.2.10 was found locally.","2022-07-13 04:36:56 \u001B[32mINFO\u001B[m i.a.w.p.DockerProcessFactory(create):108 - Creating docker job ID: 2ce30a0d-4018-44d7-b9dd-74535c746a94","2022-07-13 04:36:56 \u001B[32mINFO\u001B[m i.a.w.p.DockerProcessFactory(create):163 - Preparing command: docker run --rm --init -i -w /data/2ce30a0d-4018-44d7-b9dd-74535c746a94/0 --log-driver none --name source-file-discover-2ce30a0d-4018-44d7-b9dd-74535c746a94-0-ftknw --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e WORKER_CONNECTOR_IMAGE=airbyte/source-file:0.2.10 -e WORKER_JOB_ATTEMPT=0 -e AIRBYTE_ROLE= -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_VERSION=0.38.2-alpha -e WORKER_JOB_ID=2ce30a0d-4018-44d7-b9dd-74535c746a94 airbyte/source-file:0.2.10 discover --config source_config.json","2022-07-13 04:36:58 \u001B[32mINFO\u001B[m i.a.w.p.a.DefaultAirbyteStreamFactory(internalLog):97 - Discovering schema of orders at sftp:///home/ftpuser/orders.xlsx...","2022-07-13 04:36:58 \u001B[33mWARN\u001B[m i.a.w.p.a.DefaultAirbyteStreamFactory(internalLog):96 - ignoring unsupported keyword arguments: ['connect_kwargs']","2022-07-13 04:36:59 \u001B[1;31mERROR\u001B[m 
i.a.w.p.a.DefaultAirbyteStreamFactory(internalLog):95 - Failed to discover schemas of orders at sftp:///home/ftpuser/orders.xlsx: FileNotFoundError(2, 'No such file or directory')","Traceback (most recent call last):","  File \"/airbyte/integration_code/source_file/source.py\", line 103, in discover","    streams = list(client.streams)","  File \"/airbyte/integration_code/source_file/client.py\", line 363, in streams","    \"properties\": self._stream_properties(),","  File \"/airbyte/integration_code/source_file/client.py\", line 351, in _stream_properties","    for df in df_list:","  File \"/airbyte/integration_code/source_file/client.py\", line 306, in load_dataframes","    yield reader(fp, **reader_options)","  File \"/usr/local/lib/python3.9/site-packages/pandas/util/_decorators.py\", line 299, in wrapper","    return func(*args, **kwargs)","  File \"/usr/local/lib/python3.9/site-packages/pandas/io/excel/_base.py\", line 336, in read_excel","    io = ExcelFile(io, storage_options=storage_options, engine=engine)","  File \"/usr/local/lib/python3.9/site-packages/pandas/io/excel/_base.py\", line 1062, in __init__","    ext = inspect_excel_format(","  File \"/usr/local/lib/python3.9/site-packages/pandas/io/excel/_base.py\", line 938, in inspect_excel_format","    with get_handle(","  File \"/usr/local/lib/python3.9/site-packages/pandas/io/common.py\", line 648, in get_handle","    handle = open(handle, ioargs.mode)","FileNotFoundError: [Errno 2] No such file or directory: '<paramiko.sftp_file.SFTPFile object at 0x7feaaee166d0>'","","2022-07-13 04:36:59 \u001B[1;31mERROR\u001B[m i.a.w.p.a.DefaultAirbyteStreamFactory(internalLog):95 - [Errno 2] No such file or directory: '<paramiko.sftp_file.SFTPFile object at 0x7feaaee166d0>'","Traceback (most recent call last):","  File \"/airbyte/integration_code/main.py\", line 13, in <module>","    launch(source, sys.argv[1:])","  File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/entrypoint.py\", line 127, in launch","  
  for message in source_entrypoint.run(parsed_args):","  File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/entrypoint.py\", line 112, in run","    catalog = self.source.discover(self.logger, config)","  File \"/airbyte/integration_code/source_file/source.py\", line 107, in discover","    raise err","  File \"/airbyte/integration_code/source_file/source.py\", line 103, in discover","    streams = list(client.streams)","  File \"/airbyte/integration_code/source_file/client.py\", line 363, in streams","    \"properties\": self._stream_properties(),","  File \"/airbyte/integration_code/source_file/client.py\", line 351, in _stream_properties","    for df in df_list:","  File \"/airbyte/integration_code/source_file/client.py\", line 306, in load_dataframes","    yield reader(fp, **reader_options)","  File \"/usr/local/lib/python3.9/site-packages/pandas/util/_decorators.py\", line 299, in wrapper","    return func(*args, **kwargs)","  File \"/usr/local/lib/python3.9/site-packages/pandas/io/excel/_base.py\", line 336, in read_excel","    io = ExcelFile(io, storage_options=storage_options, engine=engine)","  File \"/usr/local/lib/python3.9/site-packages/pandas/io/excel/_base.py\", line 1062, in __init__","    ext = inspect_excel_format(","  File \"/usr/local/lib/python3.9/site-packages/pandas/io/excel/_base.py\", line 938, in inspect_excel_format","    with get_handle(","  File \"/usr/local/lib/python3.9/site-packages/pandas/io/common.py\", line 648, in get_handle","    handle = open(handle, ioargs.mode)","FileNotFoundError: [Errno 2] No such file or directory: '<paramiko.sftp_file.SFTPFile object at 0x7feaaee166d0>'","2022-07-13 04:36:59 \u001B[32mINFO\u001B[m i.a.w.t.TemporalAttemptExecution(lambda$getWorkerThread$2):161 - Completing future exceptionally...","io.airbyte.workers.WorkerException: Discover job subprocess finished with exit code 1","\tat io.airbyte.workers.DefaultDiscoverCatalogWorker.run(DefaultDiscoverCatalogWorker.java:74) 
~[io.airbyte-airbyte-workers-0.38.2-alpha.jar:?]","\tat io.airbyte.workers.DefaultDiscoverCatalogWorker.run(DefaultDiscoverCatalogWorker.java:24) ~[io.airbyte-airbyte-workers-0.38.2-alpha.jar:?]","\tat io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:158) ~[io.airbyte-airbyte-workers-0.38.2-alpha.jar:?]","\tat java.lang.Thread.run(Thread.java:833) [?:?]","2022-07-13 04:36:59 \u001B[32mINFO\u001B[m i.a.w.t.TemporalAttemptExecution(get):134 - Stopping cancellation check scheduling...","2022-07-13 04:36:59 \u001B[33mWARN\u001B[m i.t.i.a.POJOActivityTaskHandler(activityFailureToResult):307 - Activity failure. ActivityId=c75c10a5-d964-356b-b320-7bf3145eb058, activityType=Run, attempt=1","java.util.concurrent.ExecutionException: io.airbyte.workers.WorkerException: Discover job subprocess finished with exit code 1","\tat java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) ~[?:?]","\tat java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) ~[?:?]","\tat io.airbyte.workers.temporal.TemporalAttemptExecution.get(TemporalAttemptExecution.java:132) ~[io.airbyte-airbyte-workers-0.38.2-alpha.jar:?]","\tat io.airbyte.workers.temporal.discover.catalog.DiscoverCatalogActivityImpl.run(DiscoverCatalogActivityImpl.java:84) ~[io.airbyte-airbyte-workers-0.38.2-alpha.jar:?]","\tat jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]","\tat jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[?:?]","\tat jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]","\tat java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]","\tat io.temporal.internal.activity.POJOActivityTaskHandler$POJOActivityInboundCallsInterceptor.execute(POJOActivityTaskHandler.java:214) ~[temporal-sdk-1.8.1.jar:?]","\tat 
io.temporal.internal.activity.POJOActivityTaskHandler$POJOActivityImplementation.execute(POJOActivityTaskHandler.java:180) ~[temporal-sdk-1.8.1.jar:?]","\tat io.temporal.internal.activity.POJOActivityTaskHandler.handle(POJOActivityTaskHandler.java:120) ~[temporal-sdk-1.8.1.jar:?]","\tat io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:204) ~[temporal-sdk-1.8.1.jar:?]","\tat io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:164) ~[temporal-sdk-1.8.1.jar:?]","\tat io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:93) ~[temporal-sdk-1.8.1.jar:?]","\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]","\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]","\tat java.lang.Thread.run(Thread.java:833) [?:?]","Caused by: io.airbyte.workers.WorkerException: Discover job subprocess finished with exit code 1","\tat io.airbyte.workers.DefaultDiscoverCatalogWorker.run(DefaultDiscoverCatalogWorker.java:74) ~[io.airbyte-airbyte-workers-0.38.2-alpha.jar:?]","\tat io.airbyte.workers.DefaultDiscoverCatalogWorker.run(DefaultDiscoverCatalogWorker.java:24) ~[io.airbyte-airbyte-workers-0.38.2-alpha.jar:?]","\tat io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:158) ~[io.airbyte-airbyte-workers-0.38.2-alpha.jar:?]","\t... 1 more"]}},"catalogId":null}
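
The traceback ends with the built-in `open()` being called on the text `'<paramiko.sftp_file.SFTPFile object at 0x7feaaee166d0>'`. In other words, the text form of the remote file object was used as a local path. The sketch below reproduces that failure mode with the standard library only. `FakeRemoteFile`, `open_as_path` and `to_local_buffer` are hypothetical names, not connector or paramiko code. The in-memory copy is one common workaround when a reader accepts file-like objects (as `pandas.read_excel` is documented to do). The thread does not say whether the connector was fixed this way.

```python
import io


class FakeRemoteFile:
    """Hypothetical stand-in for a remote file object such as an SFTP file."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, size=-1):
        return self._buffer.read(size)


def open_as_path(obj):
    """Mimic the failing step: the object's text form is treated as a path."""
    return open(str(obj), "rb")


def to_local_buffer(remote_file):
    """Copy the remote bytes into a seekable in-memory buffer."""
    return io.BytesIO(remote_file.read())


remote = FakeRemoteFile(b"fake spreadsheet bytes")

failure = None
try:
    open_as_path(remote)
except OSError as exc:  # FileNotFoundError on Linux and macOS
    failure = exc
print(type(failure).__name__, failure)

# The buffer holds the same bytes and can be handed to a reader
# that accepts file-like objects.
buffer = to_local_buffer(remote)
print(buffer.seekable())
```

Copying the whole file into memory is simple but only suits files that fit comfortably in RAM. For larger files, downloading to a temporary local file first would be the usual alternative.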
