- Is this your first time deploying Airbyte?: No
- OS Version / Instance: Digital Ocean 16 GB Memory / 320 GB Disk / LON1 - Ubuntu 20.04 (LTS) x64
- Memory / Disk: 16 GB Memory / 320 GB Disk
- Deployment: Docker
- Airbyte Version: 0.46.26
- Source name/version: Posthog 0.1.8
- Destination name/version: Snowflake
- Step: Sync
- Description: I’ve tested different ways to scale up my instance, such as changing the instance type and increasing the number of workers, but nothing seems to affect the speed at which records are fetched from the source. At the current speed of around 120 records per second, we cannot reasonably keep up with how many records are created. Is there something we could change in the connector setup to increase the pagination or batch size? It currently fetches around 1k records every 10 seconds, or about 200 KB/s, which seems very low to me.
Here is a summary of my attempts:
"totalStats" : {
"recordsEmitted" : 10423,
"bytesEmitted" : 20604648,
"sourceStateMessagesEmitted" : 6,
"destinationStateMessagesEmitted" : 1,
"recordsCommitted" : 10423,
"meanSecondsBeforeSourceStateMessageEmitted" : 50,
"maxSecondsBeforeSourceStateMessageEmitted" : 101,
"maxSecondsBetweenStateMessageEmittedandCommitted" : 112,
"meanSecondsBetweenStateMessageEmittedandCommitted" : 112,
"replicationStartTime" : 1671557994215,
"replicationEndTime" : 1671558111458,
"sourceReadStartTime" : 1671557994280,
"sourceReadEndTime" : 1671558103216,
"destinationWriteStartTime" : 1671557994336,
"destinationWriteEndTime" : 1671558111457
total = 108936 ms source read duration
Speed = 0.0956 rec/ms -> 95 rec/sec
Size Speed = 0.000189 MB/ms -> 0.19 MB/s
}
10 Workers - 16 GB Memory / 320 GB Disk / LON1 - Ubuntu 20.04 (LTS) x64
"totalStats" : {
"recordsEmitted" : 27876,
"bytesEmitted" : 50633892,
"sourceStateMessagesEmitted" : 6,
"destinationStateMessagesEmitted" : 1,
"recordsCommitted" : 27876,
"meanSecondsBeforeSourceStateMessageEmitted" : 125,
"maxSecondsBeforeSourceStateMessageEmitted" : 250,
"maxSecondsBetweenStateMessageEmittedandCommitted" : 265,
"meanSecondsBetweenStateMessageEmittedandCommitted" : 265,
"replicationStartTime" : 1671558522403,
"replicationEndTime" : 1671558794175,
"sourceReadStartTime" : 1671558522450,
"sourceReadEndTime" : 1671558783591,
"destinationWriteStartTime" : 1671558522507,
"destinationWriteEndTime" : 1671558794175
total = 261141 ms source read duration
Speed = 0.106 rec/ms -> 106 rec/sec
Size Speed = 0.000194 MB/ms -> 0.19 MB/s
}
10 Workers - 64 GB Memory / 320 GB Disk / LON1 - Ubuntu 20.04 (LTS) x64
"totalStats" : {
"recordsEmitted" : 54378,
"bytesEmitted" : 102999115,
"sourceStateMessagesEmitted" : 6,
"destinationStateMessagesEmitted" : 1,
"recordsCommitted" : 54378,
"meanSecondsBeforeSourceStateMessageEmitted" : 70,
"maxSecondsBeforeSourceStateMessageEmitted" : 419,
"maxSecondsBetweenStateMessageEmittedandCommitted" : 441,
"meanSecondsBetweenStateMessageEmittedandCommitted" : 441,
"replicationStartTime" : 1671560005379,
"replicationEndTime" : 1671560454173,
"sourceReadStartTime" : 1671560005412,
"sourceReadEndTime" : 1671560440014,
"destinationWriteStartTime" : 1671560005454,
"destinationWriteEndTime" : 1671560454171
total = 434602 ms source read duration
Speed = 0.125 rec/ms -> 125 rec/sec
Size Speed = 0.000237 MB/ms -> 0.24 MB/s
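
For anyone who wants to double-check the speed figures, here is a minimal sketch of how they can be recomputed from the `totalStats` fields. It assumes the source read duration is `sourceReadEndTime - sourceReadStartTime` in milliseconds and that 1 MB means 1,000,000 bytes. The run labels and the `throughput` helper are my own naming for illustration, not anything Airbyte provides.

```python
# Fields copied from the three totalStats blocks above (only those needed here).
# The labels "run 1/2/3" are illustrative names, not Airbyte identifiers.
RUNS = {
    "run 1": {
        "recordsEmitted": 10423,
        "bytesEmitted": 20604648,
        "sourceReadStartTime": 1671557994280,
        "sourceReadEndTime": 1671558103216,
    },
    "run 2 (10 workers, 16 GB)": {
        "recordsEmitted": 27876,
        "bytesEmitted": 50633892,
        "sourceReadStartTime": 1671558522450,
        "sourceReadEndTime": 1671558783591,
    },
    "run 3 (10 workers, 64 GB)": {
        "recordsEmitted": 54378,
        "bytesEmitted": 102999115,
        "sourceReadStartTime": 1671560005412,
        "sourceReadEndTime": 1671560440014,
    },
}


def throughput(stats):
    """Return source read duration and throughput for one totalStats block."""
    # Timestamps are epoch milliseconds, so the difference is a duration in ms.
    duration_ms = stats["sourceReadEndTime"] - stats["sourceReadStartTime"]
    seconds = duration_ms / 1000
    return {
        "duration_ms": duration_ms,
        "records_per_sec": stats["recordsEmitted"] / seconds,
        # Assumption: decimal megabytes (1 MB = 1,000,000 bytes).
        "mb_per_sec": stats["bytesEmitted"] / 1_000_000 / seconds,
    }


for label, stats in RUNS.items():
    t = throughput(stats)
    print(
        f"{label}: {t['duration_ms']} ms, "
        f"{t['records_per_sec']:.1f} rec/s, {t['mb_per_sec']:.3f} MB/s"
    )
```

Across the three runs this gives roughly 95 to 125 records per second and about 0.19 to 0.24 MB/s, so the throughput barely moves with worker count or memory.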