Destination S3 key error when moving a large amount of data

  • Is this your first time deploying Airbyte?: Yes

  • OS Version / Instance: Ubuntu

  • Memory / Disk: 16 GB / 200 GB

  • Deployment: Docker

  • Airbyte Version: 0.39.20-alpha

  • Source name/version: source-postgres/0.4.25

  • Destination name/version: destination-redshift/0.3.39

  • Step: The issue happens during sync, when loading data from S3 to Redshift

  • Description: This does not happen when I move a small amount of data. However, when moving a big chunk of data (50 GB and up), I find something like this in the log:

2022-06-19 05:50:54 destination > Details: -----------------------------------------------
2022-06-19 05:50:54 destination >   error:  Mandatory url is not present in manifest file.
2022-06-19 05:50:54 destination >   code:      8001
2022-06-19 05:50:54 destination >   context:   Manifest file location=s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/0e670848-f88f-491d-b31c-c3b23b893c25.manifest url=s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_
2022-06-19 05:50:54 destination >   query:     84680
2022-06-19 05:50:54 destination >   location:  s3_utility.cpp:400
2022-06-19 05:50:54 destination >   process:   padbmaster [pid=16418]
2022-06-19 05:50:54 destination >   -----------------------------------------------;
2022-06-19 05:50:54 destination > 	at com.amazon.redshift.client.messages.inbound.ErrorResponse.toErrorException(Unknown Source) ~[redshift-jdbc42-no-awssdk-1.2.51.1078.jar:RedshiftJDBC_1.2.51.1078]
2022-06-19 05:50:54 destination > 	at com.amazon.redshift.client.PGMessagingContext.handleErrorResponse(Unknown Source) ~[redshift-jdbc42-no-awssdk-1.2.51.1078.jar:RedshiftJDBC_1.2.51.1078]
2022-06-19 05:50:54 destination > 	at com.amazon.redshift.client.PGMessagingContext.handleMessage(Unknown Source) ~[redshift-jdbc42-no-awssdk-1.2.51.1078.jar:RedshiftJDBC_1.2.51.1078]
2022-06-19 05:50:54 destination > 	at com.amazon.jdbc.communications.InboundMessagesPipeline.getNextMessageOfClass(Unknown Source) ~[redshift-jdbc42-no-awssdk-1.2.51.1078.jar:RedshiftJDBC_1.2.51.1078]
2022-06-19 05:50:54 destination > 	at com.amazon.redshift.client.PGMessagingContext.doMoveToNextClass(Unknown Source) ~[redshift-jdbc42-no-awssdk-1.2.51.1078.jar:RedshiftJDBC_1.2.51.1078]
2022-06-19 05:50:54 destination > 	at com.amazon.redshift.client.PGMessagingContext.getErrorResponse(Unknown Source) ~[redshift-jdbc42-no-awssdk-1.2.51.1078.jar:RedshiftJDBC_1.2.51.1078]
2022-06-19 05:50:54 destination > 	at com.amazon.redshift.client.PGClient.handleErrorsScenario3(Unknown Source) ~[redshift-jdbc42-no-awssdk-1.2.51.1078.jar:RedshiftJDBC_1.2.51.1078]
2022-06-19 05:50:54 destination > 	at com.amazon.redshift.client.PGClient.handleErrors(Unknown Source) ~[redshift-jdbc42-no-awssdk-1.2.51.1078.jar:RedshiftJDBC_1.2.51.1078]
2022-06-19 05:50:54 destination > 	at com.amazon.redshift.dataengine.PGQueryExecutor$CallableExecuteTask.call(Unknown Source) ~[redshift-jdbc42-no-awssdk-1.2.51.1078.jar:RedshiftJDBC_1.2.51.1078]
2022-06-19 05:50:54 destination > 	at com.amazon.redshift.dataengine.PGQueryExecutor$CallableExecuteTask.call(Unknown Source) ~[redshift-jdbc42-no-awssdk-1.2.51.1078.jar:RedshiftJDBC_1.2.51.1078]
2022-06-19 05:50:54 destination > 	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
2022-06-19 05:50:54 destination > 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
2022-06-19 05:50:54 destination > 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
2022-06-19 05:50:54 destination > 2022-06-19 05:50:54 ERROR i.a.i.b.AirbyteExceptionHandler(uncaughtException):26 - Something went wrong in the connector. See the logs for more details.
2022-06-19 05:50:54 destination > java.lang.RuntimeException: Failed to upload data from stage source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/
2022-06-19 05:50:54 destination > 	at io.airbyte.integrations.destination.staging.StagingConsumerFactory.lambda$onCloseFunction$4(StagingConsumerFactory.java:204) ~[io.airbyte.airbyte-integrations.connectors-destination-jdbc-0.39.7-alpha.jar:?]
2022-06-19 05:50:54 destination > 	at io.airbyte.integrations.destination.buffered_stream_consumer.OnCloseFunction.accept(OnCloseFunction.java:9) ~[io.airbyte.airbyte-integrations.bases-base-java-0.39.7-alpha.jar:?]
2022-06-19 05:50:54 destination > 	at io.airbyte.integrations.destination.buffered_stream_consumer.BufferedStreamConsumer.close(BufferedStreamConsumer.java:179) ~[io.airbyte.airbyte-integrations.bases-base-java-0.39.7-alpha.jar:?]
2022-06-19 05:50:54 destination > 	at io.airbyte.integrations.base.FailureTrackingAirbyteMessageConsumer.lambda$close$0(FailureTrackingAirbyteMessageConsumer.java:67) ~[io.airbyte.airbyte-integrations.bases-base-java-0.39.7-alpha.jar:?]
2022-06-19 05:50:54 destination > 	at io.airbyte.integrations.base.sentry.AirbyteSentry.executeWithTracing(AirbyteSentry.java:54) ~[io.airbyte.airbyte-integrations.bases-base-java-0.39.7-alpha.jar:?]
2022-06-19 05:50:54 destination > 	at io.airbyte.integrations.base.FailureTrackingAirbyteMessageConsumer.close(FailureTrackingAirbyteMessageConsumer.java:67) ~[io.airbyte.airbyte-integrations.bases-base-java-0.39.7-alpha.jar:?]
2022-06-19 05:50:54 destination > 	at io.airbyte.integrations.base.IntegrationRunner.runInternal(IntegrationRunner.java:166) ~[io.airbyte.airbyte-integrations.bases-base-java-0.39.7-alpha.jar:?]
2022-06-19 05:50:54 destination > 	at io.airbyte.integrations.base.IntegrationRunner.run(IntegrationRunner.java:107) ~[io.airbyte.airbyte-integrations.bases-base-java-0.39.7-alpha.jar:?]
2022-06-19 05:50:54 destination > 	at io.airbyte.integrations.destination.redshift.RedshiftDestination.main(RedshiftDestination.java:62) ~[io.airbyte.airbyte-integrations.connectors-destination-redshift-0.39.7-alpha.jar:?]
2022-06-19 05:50:54 destination > Caused by: java.lang.RuntimeException: java.sql.SQLException: [Amazon](500310) Invalid operation: Mandatory url is not present in manifest file

So it seems like Airbyte could not find the files in S3. I checked the manifest file in the bucket and printed out the URLs of all the files listed in it:

s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/0.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/1.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/2.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/3.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/4.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/5.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/6.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/7.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/8.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/9.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/10.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/11.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/12.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/13.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/14.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/15.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/16.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/17.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/18.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/19.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/20.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/21.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/22.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/23.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/24.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/25.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/26.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/27.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/28.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/29.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/30.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/31.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/32.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/33.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/34.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/35.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/0.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/1.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/2.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/3.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/4.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/5.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/6.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/7.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/8.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/9.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/10.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/11.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/12.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/13.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/14.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/15.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/16.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/17.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/18.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/19.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/20.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/21.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/22.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/23.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/24.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/25.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/26.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/27.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/28.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/29.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/30.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/31.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/32.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/33.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/34.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/35.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/36.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/37.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/38.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/39.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/40.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/41.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/0.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/1.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/2.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/3.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/4.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/5.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/6.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/7.csv.gz
s3://airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/8.csv.gz
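
The repetition in a list like this can be checked without reading it by eye. A minimal Python sketch, assuming the URLs printed from the manifest are available as a list of strings and that every data file is named N.csv.gz as above; the helper names part_number, split_runs and duplicated are hypothetical:

```python
import re
from collections import Counter


def part_number(url: str) -> int:
    """Extract N from a URL that ends in /N.csv.gz (naming assumed from the list above)."""
    match = re.search(r"/(\d+)\.csv\.gz$", url)
    if match is None:
        raise ValueError(f"unexpected URL: {url}")
    return int(match.group(1))


def split_runs(urls):
    """Group part numbers into runs; a new run starts when the number stops increasing."""
    runs = []
    for url in urls:
        n = part_number(url)
        if not runs or n <= runs[-1][-1]:
            runs.append([])
        runs[-1].append(n)
    return runs


def duplicated(urls):
    """Return each URL that appears more than once, with its count."""
    counts = Counter(urls)
    return {url: count for url, count in counts.items() if count > 1}
```

For the list above, the run lengths come out as 36, 42 and 9, which matches the 0 ~ 35, 0 ~ 41 and 0 ~ 8 pattern.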

It seems that while Airbyte was writing the data into files, at certain points the numbering restarted from 0 (e.g. 0 ~ 35, then all of a sudden 0 ~ 41, and finally 0 ~ 8).
I also checked which files are in the S3 bucket to be written into Redshift:

xxx-ooo@ip-172-29-0-35:~$ aws s3 ls airbyte-dev-data/source_dp_indicator_daily_test_20220517/2022_06_18_13_b8dc2c44-6686-4386-87ff-84b82b111ce4/
2022-06-19 06:35:23  209727805 0.csv.gz
2022-06-19 07:50:54      13034 0e670848-f88f-491d-b31c-c3b23b893c25.manifest
2022-06-19 06:45:49  209720864 1.csv.gz
2022-06-19 06:57:04  209729973 2.csv.gz
2022-06-19 07:07:06  209718819 3.csv.gz
2022-06-19 07:15:58  209726267 4.csv.gz
2022-06-19 07:24:52  209724788 5.csv.gz
2022-06-19 07:34:14  209735005 6.csv.gz
2022-06-19 07:43:20  209717309 7.csv.gz
2022-06-19 07:50:53  157915457 8.csv.gz

It seems only files 0 ~ 8 (nine data files) were written to the S3 bucket; in other words, the files behind the manifest records 0 ~ 35 and 0 ~ 41 are gone. Hence I think this is what raises the error.
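
The comparison described here can also be scripted. A rough sketch in Python, assuming the manifest follows the Redshift COPY manifest layout (a JSON object with an "entries" list whose items carry a "url") and that the bucket listing is the text output of aws s3 ls as shown above; the function names are made up for illustration:

```python
import json


def manifest_urls(manifest_text):
    """Read the URLs out of a Redshift COPY manifest (JSON with an "entries" list)."""
    return [entry["url"] for entry in json.loads(manifest_text)["entries"]]


def listed_names(ls_output):
    """Collect object names from `aws s3 ls` lines of the form: date time size name."""
    names = set()
    for line in ls_output.splitlines():
        parts = line.split(None, 3)
        if len(parts) == 4:  # skips blank lines and "PRE prefix/" lines
            names.add(parts[3])
    return names


def missing_urls(manifest_text, ls_output):
    """Return manifest URLs whose file name is absent from the listing, each reported once.

    Assumption: the listing was taken for the same prefix the manifest points at,
    so comparing the last path segment is enough.
    """
    present = listed_names(ls_output)
    seen = set()
    missing = []
    for url in manifest_urls(manifest_text):
        name = url.rsplit("/", 1)[-1]
        if name not in present and url not in seen:
            seen.add(url)
            missing.append(url)
    return missing
```

Note that this only finds names that are absent from the bucket. A name that is present but was rewritten by a later run (such as 0.csv.gz here) would not show up as missing.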

Not sure if I missed anything or if it is a bug. Thank you in advance for your help.

Hey @khungCU,
Thank you for your investigation! I have a few questions to help find the root cause:

  • Could you please try to upgrade your destination redshift connector to 0.3.40?
  • Did you try to change the stream part size and observe if you have a different behavior?
  • Do you know if the duplicate 0.csv.gz entries in the list refer to the same file? I’m thinking that another process could have written a different .gz archive to S3 with the same name. Are you syncing multiple tables? You should try to load a single table and check if you get the same problem. I would also suggest tracking the data loading on S3 and checking whether 0.csv.gz keeps the same size.
  • Could you please try to upgrade your destination redshift connector to 0.3.40?
    Will do
  • Did you try to change the stream part size and observe if you have a different behavior?
    Could you tell me more about the expected behavior here? For a large table, should I increase or decrease this argument? Once I update my connector to 0.3.40, the stream part size argument no longer exists.
  • Do you know if the duplicate 0.csv.gz file in the list refer to the same file?
    It’s a bit hard to monitor that. Is there any way I could identify whether they are the same file?
  • Are you syncing multiple tables?
    Since it’s a big table, I created its own connection.
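
One way to tell whether a file such as 0.csv.gz is being rewritten, without watching the bucket continuously, is to save the output of aws s3 ls at two points during the sync and compare the timestamp and size recorded for each name. A small Python sketch, assuming the listing format shown earlier in the thread; parse_listing and overwritten are hypothetical helpers, and the check only catches rewrites that change the timestamp or the size:

```python
from datetime import datetime


def parse_listing(ls_output):
    """Map each object name to (last-modified, size) from `aws s3 ls` text output."""
    objects = {}
    for line in ls_output.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4:  # skips blank lines and "PRE prefix/" lines
            continue
        date, time, size, name = parts
        modified = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
        objects[name] = (modified, int(size))
    return objects


def overwritten(before, after):
    """Names present in both snapshots whose timestamp or size differs."""
    old, new = parse_listing(before), parse_listing(after)
    return sorted(name for name in old.keys() & new.keys() if old[name] != new[name])
```

If 0.csv.gz shows up in the result, something wrote a new object under the same key between the two snapshots.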

Thank you for the details. Let me know if the error persists after the upgrade.

  1. It works well after upgrading the Redshift connector to 0.3.40.
  2. Could you tell me how the stream part size setting affects the way Airbyte writes files into S3?

Thanks a lot for the help!!

It’s great that the upgrade solved your problem! The latest version does not expose the stream part size anymore and it’s managed internally by the connector. More detail here and there