Why does destination S3 generate one or multiple files for one Postgres table?

marc-mark · June 20, 2023, 7:51am

Is this your first time deploying Airbyte?: No
OS Version / Instance: Helm deployment on K8S
Memory / Disk: Not limited (auto scaling of K8S nodes)
Deployment: Kubernetes
Airbyte Version: 0.50.3
Source name/version: Postgres 2.0.33
Destination name/version: S3 0.4.1
Step: The issue is happening during sync
Description:

We sync Postgres tables into Snowflake via an external Amazon S3 stage (the files are then ingested into Snowflake automatically via Snowpipe).

So far we have had a one-to-one relationship: each table corresponds to exactly one S3 file.

Recently, we added a new table stream to an existing connection that already contained 9 streams.
This table contains only 3 rows and 4 columns (2 int, 2 datetime).

The issue is that the sync of this new table produces 2 files. The two files seem to contain the data partitioned by one column.

We do not understand why this table is split into two files, whereas the other tables are each synced into a single file.
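To make the symptom concrete, here is a minimal sketch of how the number of files per stream could be checked from a listing of object keys. It assumes each stream writes under its own prefix and that file names end in a part number (something like `<date>_<millis>_<part>.csv`). The keys and the helper names `files_per_stream` and `streams_with_multiple_files` are made up for illustration and are not taken from the bucket or from Airbyte.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def files_per_stream(keys):
    """Group object keys by parent prefix (assumed to be one prefix per stream)."""
    grouped = defaultdict(list)
    for key in keys:
        path = PurePosixPath(key)
        grouped[str(path.parent)].append(path.name)
    return dict(grouped)


def streams_with_multiple_files(keys):
    """Return only the prefixes that hold more than one file."""
    return {
        prefix: sorted(names)
        for prefix, names in files_per_stream(keys).items()
        if len(names) > 1
    }


# Hypothetical keys, invented for this example.
example_keys = [
    "airbyte/public/orders/2023_06_20_1687247460000_0.csv",
    "airbyte/public/new_table/2023_06_20_1687247460000_0.csv",
    "airbyte/public/new_table/2023_06_20_1687247460000_1.csv",
]

print(streams_with_multiple_files(example_keys))
```

With the made-up keys above, only the `new_table` prefix is reported, which is the situation described here: one stream with two files while the others have one.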

What is strange is that if we add the new table stream to another Postgres connection (with only two streams), it produces only one file, as we expect.

Syncing into a single file is mandatory for us: we do a full refresh of the table, and we want all of the table's data to be ingested into Snowflake at the same time (within the same file) so that we can detect deleted records without any delay.
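A minimal sketch of why the split matters for delete detection, assuming deletes are found by comparing the ids already loaded with the ids present in the full-refresh snapshot. The function `deleted_ids` and the sample ids are hypothetical and only illustrate the reasoning. This is not our actual pipeline code.

```python
def deleted_ids(known_ids, snapshot_files):
    """Ids that are already loaded but absent from the full-refresh snapshot.

    snapshot_files is a list of files, each given as a list of record ids.
    """
    snapshot_ids = set()
    for file_ids in snapshot_files:
        snapshot_ids.update(file_ids)
    return set(known_ids) - snapshot_ids


# Hypothetical data: a 3-row table split across two files.
known = {1, 2, 3}
part_0 = [1, 2]
part_1 = [3]

# Comparing against the whole snapshot reports no deletions.
print(deleted_ids(known, [part_0, part_1]))

# Comparing against the first file alone wrongly flags id 3 as deleted.
print(deleted_ids(known, [part_0]))
```

If each file is ingested and compared on its own, rows that simply live in the other file look deleted until that file arrives, which is why we need the whole table in one file.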

We did not find anything about this in the connector documentation, and we need to understand how to set up Airbyte to guarantee a one-to-one relationship (one table must generate exactly one file).

Can anyone explain this behaviour or point us to the right documentation?

Best regards
