Syncing AWS RDS Postgres to Self-Managed Postgres on EC2

Summary

The user is experiencing a long-running sync process from an AWS RDS Postgres source to a self-managed Postgres destination on EC2. The user is concerned about a specific update query running for 15 hours and the growing size of the destination table. They are seeking help to determine the progress and duration of the sync.


Question

Hi all,
I am using Airbyte to sync a table from an AWS RDS Postgres source to a self-managed Postgres destination on EC2. The table size is around 18 GB.
The sync has been running for 24 hours.
From what I have gathered from the logs and from querying the destination Postgres, the query below has been running for the past 15 hours:

```
update
	"airbyte_internal"."event_hub_raw__stream_wh_events"
set
	"_airbyte_loaded_at" = current_timestamp
where
	("_airbyte_loaded_at" is null
		and "_airbyte_extracted_at" > '2024-05-16T14:20:06.335Z');
```
Here, wh_events is the name of the table I am trying to sync.
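
One way to tell whether a long-running statement like this is actively working or is stuck waiting on a lock is to look at `pg_stat_activity` on the destination. The query below is a minimal sketch, assuming PostgreSQL 9.6 or later (needed for `wait_event_type` and `pg_blocking_pids`); the aliases `running_for`, `blocked_by`, and `query_preview` are illustrative names, not anything Airbyte defines.

```sql
-- Show sessions whose current query touches the raw table,
-- excluding this monitoring session itself.
select
	pid,
	state,                                   -- 'active' means the statement is executing
	wait_event_type,                         -- 'Lock' here suggests it is blocked, not working
	wait_event,
	now() - query_start as running_for,      -- elapsed time of the current statement
	pg_blocking_pids(pid) as blocked_by,     -- empty array when nothing is blocking it
	left(query, 80) as query_preview
from
	pg_catalog.pg_stat_activity
where
	query ilike '%event_hub_raw__stream_wh_events%'
	and pid <> pg_backend_pid();
```

If `state` is `active` and `blocked_by` is empty, the update is most likely still making progress, only slowly.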

The only thing I can currently observe is the growing size of the table event_hub_raw__stream_wh_events, using the query below:
```
select
	relname as table_name,
	pg_total_relation_size(relid) as total_size
from
	pg_catalog.pg_statio_user_tables
where
	relname in ('event_hub_raw__stream_wh_events', 'wh_events')
order by
	pg_total_relation_size(relid) desc;
```
Output:

```
table_name                       total_size (bytes)
event_hub_raw__stream_wh_events  64,780,517,376
wh_events                        21,823,627,264
```
The size of wh_events has settled, but the size of event_hub_raw__stream_wh_events has kept climbing for the past 15 hours.
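
A plausible reason for the growth is that in Postgres an UPDATE writes a new version of every row it touches, and the old versions stay on disk until vacuum reclaims them. A table-wide update can therefore enlarge the table well beyond the size of the data itself. The sketch below splits the total size into table and index portions and shows the tuple counters. It assumes the standard `pg_stat_user_tables` view; the alias names are illustrative. Note that the tuple counters may lag until the running transaction finishes, so the size columns are the more reliable live signal.

```sql
-- Break down where the space is going for the raw and final tables.
select
	relname as table_name,
	pg_size_pretty(pg_table_size(relid)) as table_size,      -- heap plus TOAST, FSM and VM
	pg_size_pretty(pg_indexes_size(relid)) as index_size,    -- all indexes on the table
	pg_size_pretty(pg_total_relation_size(relid)) as total_size,
	n_live_tup,                                              -- estimated live rows
	n_dead_tup,                                              -- estimated dead row versions
	last_autovacuum
from
	pg_catalog.pg_stat_user_tables
where
	relname in ('event_hub_raw__stream_wh_events', 'wh_events')
order by
	pg_total_relation_size(relid) desc;
```

Running this a few minutes apart and comparing `table_size` gives a rough sense of how fast the update is writing.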

Can someone help me figure out how long this will take, or whether the sync is even proceeding as it is supposed to?


---

This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. [Click here](https://airbytehq.slack.com/archives/C021JANJ6TY/p1716026938810419) if you want to access the original thread.



Just an update in case anyone is looking into something similar: the update query finally completed, and the sync finished successfully after about 24 hours.