Summary
The user encounters an error when using the S3 connector as a source with AWS credentials. Although the attached IAM policy grants `s3:List*` and `s3:Get*`, the connection check fails with a message saying the file cannot be read due to a permission problem. The underlying traceback, however, shows an `AttributeError` raised while parsing a Parquet file.
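For reference, a minimal IAM policy carrying those two grants might look like the following sketch (the bucket name is a placeholder, not taken from the original post):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:Get*", "s3:List*"],
      "Resource": [
        "arn:aws:s3:::YOUR_BUCKET_NAME",
        "arn:aws:s3:::YOUR_BUCKET_NAME/*"
      ]
    }
  ]
}
```

Note that `s3:List*` must be granted on the bucket ARN itself, while `s3:Get*` applies to the object ARNs (`/*`).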
Question
Hi Team, I am trying to use the S3 connector as a source. I have granted the AWS permissions `s3:List*` and `s3:Get*`, but I still get an error saying it cannot read the file.

```
2023-12-28 13:50:42 platform > Executing worker wrapper. Airbyte version: 0.50.39
2023-12-28 13:50:42 platform > Attempt 0 to save workflow id for cancellation
2023-12-28 13:50:42 platform > Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2023-12-28 13:50:42 platform >
2023-12-28 13:50:42 platform > Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2023-12-28 13:50:42 platform > Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2023-12-28 13:50:42 platform > Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2023-12-28 13:50:42 platform > Using default value for environment variable LAUNCHDARKLY_KEY: ''
2023-12-28 13:50:42 platform > ----- START CHECK -----
2023-12-28 13:50:42 platform >
2023-12-28 13:50:42 platform > Checking if airbyte/source-s3:4.3.0 exists...
2023-12-28 13:50:42 platform > airbyte/source-s3:4.3.0 was found locally.
2023-12-28 13:50:42 platform > Creating docker container = source-s3-check-57b3f300-cfd9-4009-acf2-dbea2acdf249-0-cesks with resources io.airbyte.config.ResourceRequirements@61d64a51[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=,additionalProperties={}] and allowedHosts null
2023-12-28 13:50:42 platform > Preparing command: docker run --rm --init -i -w /data/57b3f300-cfd9-4009-acf2-dbea2acdf249/0 --log-driver none --name source-s3-check-57b3f300-cfd9-4009-acf2-dbea2acdf249-0-cesks --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e DEPLOYMENT_MODE=OSS -e WORKER_CONNECTOR_IMAGE=airbyte/source-s3:4.3.0 -e AUTO_DETECT_SCHEMA=true -e LAUNCHDARKLY_KEY= -e SOCAT_KUBE_CPU_REQUEST=0.1 -e SOCAT_KUBE_CPU_LIMIT=2.0 -e FIELD_SELECTION_WORKSPACES= -e USE_STREAM_CAPABLE_STATE=true -e AIRBYTE_ROLE= -e WORKER_ENVIRONMENT=DOCKER -e APPLY_FIELD_SELECTION=false -e WORKER_JOB_ATTEMPT=0 -e OTEL_COLLECTOR_ENDPOINT=http://host.docker.internal:4317 -e FEATURE_FLAG_CLIENT=config -e AIRBYTE_VERSION=0.50.39 -e WORKER_JOB_ID=57b3f300-cfd9-4009-acf2-dbea2acdf249 airbyte/source-s3:4.3.0 check --config source_config.json
2023-12-28 13:50:42 platform > Reading messages from protocol version 0.2.0
2023-12-28 13:50:44 platform > Received 66 objects from S3 for prefix 'oneview/powerhub/Account/year=2012/month=10/'.
2023-12-28 13:51:25 platform > Check failed
2023-12-28 13:51:26 platform > Check connection job received output: io.airbyte.config.StandardCheckConnectionOutput@4ce8ba19[status=failed,message=[
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/file_based/availability_strategy/default_file_based_availability_strategy.py", line 95, in _check_parse_record
    record = next(iter(parser.parse_records(stream.config, file, self.stream_reader, logger, discovered_schema=None)))
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/file_based/file_types/parquet_parser.py", line 74, in parse_records
    **{
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/file_based/file_types/parquet_parser.py", line 75, in <dictcomp>
    column: ParquetParser._to_output_value(batch.column(column)[row], parquet_format)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/file_based/file_types/parquet_parser.py", line 96, in _to_output_value
    return parquet_value.as_py().isoformat()
AttributeError: 'NoneType' object has no attribute 'isoformat'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/file_based/availability_strategy/default_file_based_availability_strategy.py", line 64, in check_availability_and_parsability
    self._check_parse_record(stream, file, logger)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/file_based/availability_strategy/default_file_based_availability_strategy.py", line 102, in _check_parse_record
    raise CheckAvailabilityError(FileBasedSourceError.ERROR_READING_FILE, stream=stream.name, file=file.uri) from exc
airbyte_cdk.sources.file_based.exceptions.CheckAvailabilityError: Error opening file. Please check the credentials provided in the config and verify that they provide permission to read files. Contact Support if you need assistance.
stream=adobe file=oneview/powerhub/Account/year=2012/month=10/part-00000-0b731321-d51a-462d-8644-2e505c58bd36.c000.snappy.parquet
],additionalProperties={}]
2023-12-28 13:51:26 platform >
2023-12-28 13:51:26 platform > ----- END CHECK -----
```
Am I missing something?
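Despite the wording of the final error message, the traceback in the log points at a parsing failure rather than a permissions one: `ParquetParser._to_output_value` calls `.isoformat()` on a cell's Python value, and a null date/timestamp cell comes back as `None`. A minimal sketch of that failure mode (the helper below mirrors the CDK method's last line for illustration; it is not the actual CDK code):

```python
from datetime import date

def to_output_value(parquet_value):
    # Mirrors what airbyte_cdk's ParquetParser does for date/timestamp
    # columns: call .isoformat() on the cell's Python value without a
    # null check. (Illustrative re-implementation, not the CDK source.)
    return parquet_value.isoformat()

print(to_output_value(date(2012, 10, 1)))  # 2012-10-01

# A null Parquet cell surfaces in Python as None, so the same call
# raises the AttributeError seen in the log:
try:
    to_output_value(None)
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'isoformat'
```

If that is indeed the cause here, broadening the IAM permissions will not help; the null date cell in the flagged Parquet file would need handling, or a connector version that null-checks before formatting.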
---
This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. [Click here](https://airbytehq.slack.com/archives/C021JANJ6TY/p1703772865785869) if you want to access the original thread.
[Join the conversation on Slack](https://slack.airbyte.com)
<sub>
["s3-connector", "aws-credentials", "error", "file-reading", "permission-issues"]
</sub>