Summary
When updating the S3 Destination’s S3Path for each sync job, changes are not taking effect immediately on the server under load, even though the update API returns 200 OK.
Question
I’m running into an interesting issue with the S3 Destination configuration. My workflow is to update the S3 Destination’s S3Path to some folder, and then sync the connection. This way, each manual sync job gets its own predictable path in S3. Locally this seems to work fine, but when I deploy to a real server and put it under load, the update appears to be missed and a different path is used during the sync, even though the destination update API returns 200 OK as if it accepted the write. Is there some disconnect or delay between updating the S3 Destination and those changes taking effect for subsequent sync jobs?
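One way to narrow this down is to read the destination back after the update and only trigger the sync once the stored path matches what was written. The sketch below is a rough illustration under assumptions, not a confirmed fix: the endpoint paths (`/destinations/get`, `/destinations/update`, `/connections/sync`) and the config key `s3_bucket_path` are assumed to match the Airbyte Config API and S3 destination spec on your version, and `BASE_URL`, `http_post` and `update_path_then_sync` are hypothetical names. Authentication and the handling of masked secrets in the returned config are left out.

```python
import json
import urllib.request
from typing import Any, Callable, Dict

# Assumption: base URL of the Airbyte Config API for your deployment.
BASE_URL = "http://localhost:8000/api/v1"

PostFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def http_post(path: str, body: Dict[str, Any]) -> Dict[str, Any]:
    """POST a JSON body to the API and return the decoded JSON response."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def update_path_then_sync(
    destination_id: str,
    connection_id: str,
    new_path: str,
    post: PostFn = http_post,
    path_key: str = "s3_bucket_path",  # assumed name of the path field
) -> Dict[str, Any]:
    """Update the destination path, verify it was stored, then start a sync."""
    current = post("/destinations/get", {"destinationId": destination_id})
    config = dict(current["connectionConfiguration"])
    config[path_key] = new_path

    post(
        "/destinations/update",
        {
            "destinationId": destination_id,
            "name": current["name"],
            "connectionConfiguration": config,
        },
    )

    # Read the config back instead of trusting the 200 OK from the update.
    stored = post("/destinations/get", {"destinationId": destination_id})
    actual = stored["connectionConfiguration"].get(path_key)
    if actual != new_path:
        raise RuntimeError(f"stored path {actual!r} does not match {new_path!r}")

    return post("/connections/sync", {"connectionId": connection_id})
```

If the read-back check passes and the sync still writes to a different path, the stale value is more likely coming from somewhere other than the destination record itself. Note that this check does not protect against two concurrent callers updating the same destination between the verification and the sync.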
This topic has been created from a Slack thread to give it more visibility.
["s3-destination", "sync", "update", "path", "server-load"]
This keeps happening, which is very strange, and I’m not sure how to troubleshoot it. I found a connection that pairs a Google Ads source connector with an S3 destination connector. The source connector is pulling data from 2018-12-25 to 2018-12-30 and the destination S3 path has been set to ${NAMESPACE}/01HKAZ2XDJMPTS7B2BZ64NE6YW/20181225T040000Z-20181230T040000Z, and yet when I run this sync manually, I see from the logs that it uploads to:
2023-12-07 23:08:02 destination > INFO i.a.i.d.s.S3StorageOperations(uploadRecordsToBucket):131 Successfully loaded records to stage airbyte-raw/e08bdcbe-93c0-451d-be37-10b92c78dc48/01HGV4RG6GSNRDK5B3DKY3JPT8/managed_connection/historic/01HH37CZV00Z3D5T9TVRV7P014/20211204T160000Z-20211209T160000Z/Orders/2023__12__07__1701989944622__ with 0 re-attempt(s)
I’m not even sure where it got that 20211204T160000Z-20211209T160000Z date from in the target S3 path. Maybe an older version of the destination’s config?
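To check many jobs for this mismatch, the staged path can be pulled out of the destination log line and compared with the path that was configured. This is a small sketch that assumes the log message keeps the same wording as the line quoted above; `staged_path` and `path_contains` are hypothetical helper names.

```python
import re
from typing import Optional

# Pattern based on the "Successfully loaded records to stage ..." log line
# quoted above; the wording may differ in other connector versions.
STAGE_RE = re.compile(
    r"Successfully loaded records to stage (\S+) with \d+ re-attempt"
)


def staged_path(log_line: str) -> Optional[str]:
    """Return the S3 path a log line reports uploading to, or None."""
    match = STAGE_RE.search(log_line)
    return match.group(1) if match else None


def path_contains(log_line: str, expected_fragment: str) -> bool:
    """True if the staged path in the log line contains the expected fragment."""
    path = staged_path(log_line)
    return path is not None and expected_fragment in path
```

Running `path_contains` over a job's log lines with the date range that was configured for that job shows quickly whether the sync used the intended path or a stale one.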
Actually, even stranger perhaps is that when I run this manually it seems to sync a random source. The first job was syncing Google Ads, then the next job’s logs have Bing Ads and SourceAmazonSellerPartner … I’m not sure what’s happening.
The logs don’t make sense; it’s as if the job is syncing the wrong source->destination pair.
Hmm, I think I have a theory about what is going on: I have some state in S3 from a previous install that is confusing the jobs, since the job IDs are just sequential. I will try to clear out all this state, start fresh, and see whether I continue to get this kind of result.