Incremental sync from Postgres to S3 Data Lake with partition by date


The user is asking if it is possible to perform an incremental sync from a Postgres database to an S3 Data Lake with partitioning by date. They provided an example of a source table partitioned by date and mentioned the desired format for storing data in S3.


Is it possible to do an incremental sync from Postgres to an S3 Data Lake with partitioning by date? For example:
I have a source table:

```
create table partitioned_table_cdc (
    dt    timestamp not null,
    key   integer   not null,
    value integer,
    constraint pk_partitioned_table_cdc
        primary key (dt, key)
) partition by RANGE (dt);
```
And two partitions with a few records:

And I want to put the data in S3 in Hive-style partition format:
```
partitioned_table_cdc/year=2024/month=04/day=01/*.parquet
```
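To make the target layout concrete, here is a minimal Python sketch of how a row's `dt` value would map to that prefix. The helper name `hive_partition_prefix` is hypothetical and only illustrates the layout I'm after; it is not part of any connector.

```python
from datetime import datetime


def hive_partition_prefix(table: str, dt: datetime) -> str:
    """Build a Hive-style year=/month=/day= prefix from one row's timestamp.

    Hypothetical helper for illustration only.
    """
    # Zero-padded month and day, matching year=2024/month=04/day=01.
    return f"{table}/year={dt:%Y}/month={dt:%m}/day={dt:%d}/"


print(hive_partition_prefix("partitioned_table_cdc", datetime(2024, 4, 1, 13, 30)))
```

Every row whose `dt` falls on the same calendar day would end up under the same prefix, regardless of the time of day.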



This topic has been created from a Slack thread to give it more visibility.

["incremental-sync", "postgres", "s3-data-lake", "partition-by-date", "hive-format"]

I tried the AWS Datalake connector v0.1.6 but didn’t get the results I wanted. On the AWS side I have Glue Data Catalog and AWS Lake Formation. After an Incremental Sync I get a non-partitioned table in .parquet with duplicated data. If I select the Lake Formation Governed Tables option, an error occurs:

Could not create table airbyte_test_mwIXw in database data_quality: InvalidInputException('An error occurred (InvalidInputException) when calling the CreateTable operation: Location for GOVERNED table is not registered.')
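
On the duplication mentioned above: if the sync appends rows instead of merging them, the same `(dt, key)` pair can be written more than once. Here is a minimal Python sketch of the result I expected, keeping one row per primary key. The `emitted_at` ordering field and the `dedupe_latest` helper are assumptions for illustration; they are not columns in my table and not necessarily what the connector writes.

```python
from datetime import datetime


def dedupe_latest(rows, key_fields=("dt", "key"), order_field="emitted_at"):
    """Keep one row per primary key, preferring the highest order_field value.

    Hypothetical helper; order_field is an assumed sync-ordering column.
    """
    latest = {}
    for row in rows:
        pk = tuple(row[f] for f in key_fields)
        # A later (or equal) order value replaces the row seen earlier.
        if pk not in latest or row[order_field] >= latest[pk][order_field]:
            latest[pk] = row
    return list(latest.values())


rows = [
    {"dt": datetime(2024, 4, 1), "key": 1, "value": 10, "emitted_at": 1},
    {"dt": datetime(2024, 4, 1), "key": 1, "value": 11, "emitted_at": 2},
    {"dt": datetime(2024, 4, 2), "key": 2, "value": 20, "emitted_at": 1},
]
deduped = dedupe_latest(rows)
```

With the sample rows above, the two versions of key `1` collapse into the later one, which is the behavior I expected from the incremental sync.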