Summary
The user is asking whether it is possible to perform an incremental sync from a Postgres database to an S3 Data Lake with the output partitioned by date. They provide an example of a source table that is range-partitioned on a timestamp column and show the Hive-style layout they want for the data in S3.
Question
Is it possible to do an incremental sync from Postgres to an S3 Data Lake with partitioning by date? For example:
I have a source table:
```
create table partitioned_table_cdc
(
    dt timestamp not null,
    key integer not null,
    value integer,
    constraint pk_partitioned_table_cdc
        primary key (dt, key)
)
partition by RANGE (dt);
```
And two partitions with a few records each:
`partitioned_table_cdc_y2024m04d01`
`partitioned_table_cdc_y2024m04d02`
And I want to put the data in S3 in Hive format:
```
partitioned_table_cdc/year=2024/month=04/day=01/*.parquet
partitioned_table_cdc/year=2024/month=04/day=02/*.parquet
```
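To make the target layout concrete, here is a minimal sketch of how a row's `dt` value would map to the Hive-style prefix shown above. The helper name `hive_partition_prefix` is hypothetical, and the sketch only illustrates the desired path layout; it does not describe how any Airbyte connector actually writes files.

```python
from datetime import datetime


def hive_partition_prefix(table: str, dt: datetime) -> str:
    """Build a Hive-style prefix: <table>/year=YYYY/month=MM/day=DD/.

    Hypothetical helper for illustration only, not part of any Airbyte API.
    """
    # Zero-padded month and day match the layout in the question (month=04, day=01).
    return f"{table}/year={dt:%Y}/month={dt:%m}/day={dt:%d}/"


if __name__ == "__main__":
    # A row from the first partition (2024-04-01) lands under day=01.
    print(hive_partition_prefix("partitioned_table_cdc", datetime(2024, 4, 1, 12, 30)))
    # prints "partitioned_table_cdc/year=2024/month=04/day=01/"
```

Under this mapping, every row in `partitioned_table_cdc_y2024m04d01` would fall under the `day=01` prefix and every row in `partitioned_table_cdc_y2024m04d02` under `day=02`.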
---
This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. [Click here](https://airbytehq.slack.com/archives/C021JANJ6TY/p1712125331600659) if you want to access the original thread.
<sub>
["incremental-sync", "postgres", "s3-data-lake", "partition-by-date", "hive-format"]
</sub>