Salesforce limits records to 30k per stream

  • Is this your first time deploying Airbyte?: Yes
  • OS Version / Instance: Windows
  • Memory / Disk: 4 GB / 100 GB
  • Deployment: Docker
  • Airbyte Version: 0.36.2
  • Source name/version: Salesforce
  • Destination name/version: Postgres
  • Step: During Sync
  • Description: I set up the OpportunityContactRole and OpportunityLineItem objects to be pulled with Full Refresh | Overwrite, and the sync pulled exactly 30,000 records for each. I then reset them to Incremental and ran the sync twice: the first sync pulled the same number of records, and the next sync pulled the remainder. It seems to me the issue may be in the next_page_token method of the BulkSalesforceStream class, but I am not familiar with the Salesforce API.
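To illustrate the failure mode described above, here is a minimal sketch of page-token pagination in the style of an Airbyte stream's next_page_token. All names are hypothetical and simplified, not the actual connector code; the page size stands in for Salesforce's 30,000-record page. If next_page_token returns None after the first page (as the bug apparently caused), the sync is capped at exactly one page.

```python
PAGE_SIZE = 3  # stand-in for the Salesforce bulk page size of 30,000


def fetch_page(records, offset):
    """Return one page of results starting at `offset` (fake API call)."""
    return records[offset:offset + PAGE_SIZE]


def next_page_token(page, offset):
    """A full page means more data may follow; a short page ends the sync.

    A bug that returns None unconditionally here would cap every
    stream at exactly PAGE_SIZE records, matching the symptom above.
    """
    if len(page) == PAGE_SIZE:
        return offset + PAGE_SIZE
    return None


def read_all(records):
    """Drive the pagination loop until next_page_token signals the end."""
    out, offset = [], 0
    while True:
        page = fetch_page(records, offset)
        out.extend(page)
        token = next_page_token(page, offset)
        if token is None:
            return out
        offset = token


data = list(range(10))
assert read_all(data) == data  # all 10 records, not just the first 3
```

With a correct next_page_token, all records are returned; replacing its body with `return None` reproduces the "exactly one page" cap reported in this issue.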

When you changed to incremental, was the number of records wrong? Do you have some numbers?

After the second incremental sync ran, the number of records in the Postgres destination matched the number I expected for the specified Salesforce objects. I added some additional objects to pull from Salesforce with Full Refresh | Overwrite as the sync mode; some (Contacts, for example) were able to pull more than the default page size, while others were capped at that page size. I believe setting the sync mode to Incremental for all of the objects I want to pull should work around this, but I wanted to bring attention to the issue I had with the sync mode for some of the objects.

Thanks Kris, I’ll open an issue for further investigation. Please let me know if you see any discrepancy in the data you’re syncing.

Hi @marcosmarxm, did you find anything in your further investigation?

This issue is affecting me for both Incremental and Full Refresh | Overwrite. I’ve yet to find a method that will enable me to pull all records. I’m continuing to troubleshoot so I can provide additional details, but wanted to see if you’d discovered anything?

What version of Salesforce are you using Chance Barkley?

@marcosmarxm Sales Cloud Enterprise API v54

Sorry, Airbyte connector version

My bad @marcosmarxm, 1.0.6

I just realized how far behind I was on the platform version. Trying an upgrade of platform and connector to the latest now to see if that makes a difference. Will report back.

[Update]

  • Is this your first time deploying Airbyte?: No
  • OS Version / Instance: Amazon Linux 2 on EC2
  • Memory / Disk: r5.large / 100 GiB SSD
  • Deployment: Docker
  • Airbyte Version: 0.39.7
  • Source name/version: Salesforce 1.0.9
  • Destination name/version: Snowflake 4.28
  • Step: Sync Outcome

Issue remains after upgrade. The sync log shows it reading >56k rows, but the final record count is capped at 30k.

Could you try the latest version, 1.0.9? I know a few versions ago there was a bug in pagination.

@marcosmarxm , I tried the latest without luck. See edited comment above for details. Please let me know if you have any questions or other suggestions.

Thanks @johnlafleur for the assist in unlocking comments.

Hello Chance, this issue was solved in Source Salesforce: fix sync capped streams with more records than page size by marcosmarxm · Pull Request #13658 · airbytehq/airbyte · GitHub.
You must update to Salesforce 1.0.10. I was able to replicate the issue and submit a fix.