Summary
The user is troubleshooting a paginated stream, configured with PageNumber and PageSize, that is not syncing all expected records to a Snowflake destination. The sync completes successfully without errors, but only about half of the expected records appear in the resulting table.
Question
Hi all, I’m troubleshooting a situation where I’ve configured a paginated stream that uses both PageNumber and PageSize. I’ve set the page size to 1000 records and am starting with page 0. The sync runs and completes “successfully” without errors. The destination is Snowflake, but when I check the resulting table I only find about half the records I’m expecting (~300k of the ~600k). I am using the Full Refresh | Overwrite sync mode.
I’ve verified the expected number of pages using Postman. There are warnings in the job log, but no errors. What am I missing here?
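One way to rule out both Postman and Airbyte at the same time is to page through the endpoint yourself and count records. A minimal sketch in Python, assuming a hypothetical endpoint that takes PageNumber/PageSize query parameters and returns a JSON array of records (adjust the URL, parameter names, and response shape to your API):

```python
import requests

BASE_URL = "https://api.example.com/records"  # hypothetical endpoint
PAGE_SIZE = 1000  # matches the stream's configured page size

total = 0
page = 0  # the stream is configured to start at page 0
while True:
    resp = requests.get(BASE_URL, params={"PageNumber": page, "PageSize": PAGE_SIZE})
    resp.raise_for_status()
    batch = resp.json()  # assumes the body is a bare JSON array of records
    if not batch:
        break  # an empty page signals the end of the data
    total += len(batch)
    page += 1

print(f"Read {page} pages, {total} records")
```

If this script also reports ~600k records, the gap is on the Airbyte side; if it reports ~300k, the source itself is cutting the stream short.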
["paginated-stream", "snowflake-destination", "sync-issue", "page-number", "page-size", "full-sync", "override-sync"]
Hmm, I don’t know enough to give you a confident answer yet. Based on the information provided, there are several possible reasons why not all records are being synced: an error in the source connector, an issue with the destination connector, or a problem with the sync mode configuration.

Users have reported similar issues. In one forum post, a user found that only part of their data was synced from Postgres to Snowflake: the sync aborted partway through with an error message but still reported success, and trying different sync modes produced the same result each time. In one GitHub issue, a user reported that a full-replace sync replaced the final tables with empty temp tables after a source error: the source stopped fetching new data after the error, the destination assumed everything that should be loaded had been fetched, and it then replaced all destination tables with their temp counterparts, whether empty or not.

Without more specific information about your situation, it’s hard to give a definitive answer. I would recommend checking the job logs for any warnings or errors that might indicate what’s going wrong, and trying different sync modes to see if that makes a difference. For more information, refer to the Airbyte documentation or ask for help in the Airbyte forum or Slack channel.
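One concrete way to localize the loss is to compare row counts between Airbyte’s raw table and the final table in Snowflake: Airbyte lands raw records before building the typed table, so if the raw table already holds only ~300k rows the records never arrived from the source, while ~600k raw rows would point at the normalization/typing step. A minimal sketch using snowflake-connector-python; the connection values and table names are placeholders, and raw-table naming varies with the destination version:

```python
import snowflake.connector  # pip install snowflake-connector-python

# All connection values below are placeholders.
conn = snowflake.connector.connect(
    account="YOUR_ACCOUNT",
    user="YOUR_USER",
    password="YOUR_PASSWORD",
    warehouse="YOUR_WAREHOUSE",
    database="YOUR_DATABASE",
)
cur = conn.cursor()

# Table names are assumptions: older Snowflake destinations write
# _AIRBYTE_RAW_<stream> next to the final table, while newer ones keep
# raw records in a separate internal schema. Adjust to your setup.
for label, table in [
    ("raw", 'MY_SCHEMA."_AIRBYTE_RAW_MY_STREAM"'),
    ("final", "MY_SCHEMA.MY_STREAM"),
]:
    cur.execute(f"SELECT COUNT(*) FROM {table}")
    print(label, cur.fetchone()[0])

conn.close()
```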