Postgres Source - Slow initial Load

anthonyrlx · June 22, 2022, 2:10pm

We have a GKE deployment with 6 e2-standard-16 nodes, set up just to run some tests.

But performance in these tests has been very slow.

We have a very small table with 10 million records (± 2 GB) in Postgres (latest source version and 0.39.21-alpha), and the load currently takes around 17 minutes.

Since our production environment has tables with 600 million records, at this rate we would have to run the job for days.
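For reference, here is a rough back-of-the-envelope sketch of what those figures imply. It assumes throughput scales linearly with row count, which nothing in this thread confirms, and the function names are only illustrative.

```python
# Rough projection from the numbers in the post above.
# Assumption: load time scales linearly with row count.

def rows_per_second(rows: int, minutes: float) -> float:
    """Observed throughput for a load of `rows` that took `minutes`."""
    return rows / (minutes * 60)


def projected_hours(rows: int, rate: float) -> float:
    """Hours needed to load `rows` at a constant `rate` (rows per second)."""
    return rows / rate / 3600


rate = rows_per_second(10_000_000, 17)      # test table: 10M rows in 17 min
hours = projected_hours(600_000_000, rate)  # one 600M-row production table

print(f"{rate:,.0f} rows/s, about {hours:.1f} h per 600M-row table")
```

Under that assumption, one 600-million-row table comes out at roughly 17 hours, so several such tables synced one after another would add up to days.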

Any idea what could be changed in the settings?

In the current tests we increased the number of workers and the maximum number of simultaneous workers per kind of job (sync, discover, etc.), but I believe this changes how many jobs run in parallel, not the performance of a specific job?

alafanechere · June 23, 2022, 1:08pm

Hi @anthonyrlx,
You are not the first one to complain about the throughput of our Postgres source connector.
The bottleneck is currently on the connector side, so increasing parallelization will have little effect if your cluster is already sufficiently provisioned.
Improving our Postgres connector, and database connectors in general, is at the top of our to-do list. You can check our public roadmap for details.
I suggest you subscribe to this GitHub issue to receive updates on the topic.
You can also find a related discussion on the forum (it’s for MySQL, but exactly the same logic applies to Postgres).

marcosmarxm · July 13, 2022, 12:00am

Hi there from the Community Assistance team.
We’re letting you know about an issue we discovered with the back-end process we use to handle topics and responses on the forum. If you experienced a situation where you posted the last message in a topic that did not receive any further replies, please open a new topic to continue the discussion. In addition, if you’re having a problem and find a closed topic on the subject, go ahead and open a new topic on it and we’ll follow up with you. We apologize for the inconvenience, and appreciate your willingness to work with us to provide a supportive community.
