- Is this your first time deploying Airbyte?: No
- OS Version / Instance: Ubuntu
- Memory / Disk: you can use something like 4Gb / 1 Tb
- Deployment: Kubernetes
- Airbyte Version: 0.39.25
- Source name/version: Mailchimp
- Destination name/version: JSON destination
- Step: Run the Mailchimp source with the JSON destination with more than 700 email events
When I run the Mailchimp source to the JSON destination and compare the output with what the Mailchimp API returns, the output JSON is missing a lot of data. I spot-checked one campaign and found 61 events in the JSON output, while the Mailchimp email_activity endpoint returns around 707.
Hello, @murph! Could you please show me the Airbyte/server logs so I can see if you are getting any errors?
We aren’t getting any errors, but we were able to debug this a bit. It looks like every time you paginate the email activity endpoint, you’re also incrementing the since param (airbyte/streams.py at bfa54aca50115770530ca6fdff24d4125541d23b · airbytehq/airbyte · GitHub). This happens via the cursor_field (airbyte/streams.py at bfa54aca50115770530ca6fdff24d4125541d23b · airbytehq/airbyte · GitHub), which is the timestamp of the newest record (airbyte/streams.py at bfa54aca50115770530ca6fdff24d4125541d23b · airbytehq/airbyte · GitHub).
That means that when we do an incremental sync we lose a lot of records. The records returned from the Mailchimp API are NOT sorted by timestamp, so the timestamp selection is completely arbitrary. I don’t think this is the intended behavior?
It looks like you cannot sort what’s returned from the email activity endpoint, so this kind of checkpointing won’t work: https://mailchimp.com/developer/marketing/api/email-activity-reports/list-email-activity/
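To illustrate the problem described above, here is a minimal toy sketch in Python. It is not the connector's actual code: `EVENTS`, `fetch_page`, `sync_buggy`, and `sync_fixed` are hypothetical names, timestamps are plain integers, and the fake API simply filters by `since` and then paginates by offset over unsorted data. Under those assumptions, it shows how advancing `since` after every page can skip records, and how holding `since` fixed for the whole sync and only saving the newest timestamp at the end might avoid that.

```python
from typing import List, Tuple

# Hypothetical event timestamps, deliberately NOT sorted,
# mimicking an endpoint that returns records in arbitrary order.
EVENTS = [5, 1, 9, 3, 7, 2, 8, 4, 6, 10]


def fetch_page(since: int, offset: int, count: int) -> List[int]:
    """Toy stand-in for the API: filter by `since`, then paginate by offset."""
    matching = [t for t in EVENTS if t > since]
    return matching[offset:offset + count]


def sync_buggy(page_size: int = 3) -> List[int]:
    """Advance `since` to the newest timestamp seen after every page."""
    since, offset, records = 0, 0, []
    while True:
        page = fetch_page(since, offset, page_size)
        if not page:
            break
        records.extend(page)
        since = max(since, max(page))  # cursor moves mid-sync
        offset += page_size
    return records


def sync_fixed(start: int = 0, page_size: int = 3) -> Tuple[List[int], int]:
    """Keep `since` fixed while paginating; checkpoint only at the end."""
    offset, records = 0, []
    while True:
        page = fetch_page(start, offset, page_size)
        if not page:
            break
        records.extend(page)
        offset += page_size
    new_cursor = max(records, default=start)  # saved once, after the last page
    return records, new_cursor


if __name__ == "__main__":
    print(len(sync_buggy()), "records with the per-page cursor update")
    print(len(sync_fixed()[0]), "records with a fixed cursor")
```

In this toy model the first page happens to contain the timestamp 9, so the per-page update filters out almost everything that follows, while the fixed-cursor version reads all ten records.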
Thanks for digging into this. You are right: this is definitely not the intended behavior. I’ve opened an issue on GitHub, and I or another team member will start work on it soon!
Thank you for creating that issue! I just wanted to check in to see when the issue will be prioritized?
@murph sorry for the wait, we have a few team members out this week. I asked one of my colleagues to set aside some time for this issue, so you’ll be hearing something soon!