Salesforce connector - missing data after update to version 2.0.10

  • Is this your first time deploying Airbyte?: No
  • OS Version / Instance: Ubuntu
  • Memory / Disk: 12 GBs / 50 GBs
  • Deployment: Docker
  • Airbyte Version: 0.44.2
  • Source name/version: Salesforce 2.0.10
  • Destination name/version: BigQuery 1.2.20
  • Step: During the sync
  • Description: After updating the Salesforce source connector to version 2.0.10, we noticed that some data was missing, even though the syncs reported no errors. We checked, and we weren’t reaching any API limits.
    We ran the following tests on the Case table, for which the API returns 184848 rows:
  1. Incremental + dedup history:
    1.1. First run
    Case_incremental_1.txt (69.6 KB)
    1.2. Second run
    Case_incremental_2.txt (58.2 KB)
    1.3. Third run
    Case_incremental_3.txt (57.4 KB)

In total, we got only ~29k records.

  2. Then we reset the table and tried Full-refresh - Overwrite:
    Case_full_refresh_overwrite.txt (62.6 KB)

Same result: ~29k records.

  3. Then we downgraded the source connector to version 2.0.9, again with Full-refresh - Overwrite:
    Case_version_2.0.9.txt (137.3 KB)

Now it looks correct: ~193.8k rows

And yes, I’ve checked the start_date in all of these trials and versions.
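
To put the gap in numbers, here is a minimal Python sketch that compares the row count reported by the API with the approximate counts from the runs above. The function name `shortfall` and the run labels are only illustrative, and the synced counts are the rounded figures quoted in this post, not exact values.

```python
def shortfall(expected: int, synced: int) -> float:
    """Return the fraction of expected records missing from the destination.

    A synced count above the expected count is treated as 0 missing;
    this helper does not try to detect surplus or duplicate rows.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    return max(expected - synced, 0) / expected


# Row count for Case as reported by the Salesforce API (from this post).
EXPECTED_CASE_ROWS = 184_848

# Approximate synced counts from the runs described above (rounded).
runs = {
    "2.0.10, Incremental + dedup history": 29_000,
    "2.0.10, Full-refresh - Overwrite": 29_000,
    "2.0.9, Full-refresh - Overwrite": 193_800,
}

for name, synced in runs.items():
    missing = shortfall(EXPECTED_CASE_ROWS, synced)
    print(f"{name}: {synced} synced, {missing:.1%} missing")
```

Running the same comparison after each connector upgrade or downgrade makes it easy to see whether a version change affected the number of records synced.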

I am also seeing a lot of issues with recent Salesforce connectors.

With 2.0.9 I’m seeing around 2% of records being duplicated once copied to our Postgres database. This seems like it could be related to GitHub issue #20471 in airbytehq/airbyte, "Source Salesforce: creates duplicates after update".

With the most recent connector, 2.0.12, we’re seeing many records go missing. Tables which should have 1M+ records end up with ~200,000 records instead.