Intercom is slow

Why is Intercom very slow to get data after I upgraded to 0.35.2-alpha?
logs-242-0.txt (19.7 KB)

Hey @ayyoub,
Could you please try to downgrade the Intercom connector version to 0.1.16 and check if the read speed is different?

Thank you!

I already upgraded Airbyte again to v0.36.4-alpha and use Intercom connector 0.1.16, but it is still slow, even though I only pull 2 days from the start date on conversations.

I don’t think this is related to your Airbyte version. It’s more likely a delay introduced by Intercom to throttle requests.
Could you please share the logs of a sync with 0.35.2 to check how fast it was before your upgrade?

We have two open issues to improve the Intercom connector’s performance; I suggest you subscribe to them to receive updates once we work on this.


After running for 3 days, the worker was canceled unexpectedly @alafanechere
logs-540.txt (2.7 MB)

Hey @ayyoub,
I had the chance to chat with other Intercom users and they see similar loading speeds for the conversations stream.
Feel free to reach out to Intercom support to understand if some throttling is happening on their side.
Airbyte’s syncs have a maximum duration of three days; this is why your sync stopped.
You can change the value of the SYNC_JOB_MAX_TIMEOUT_DAYS environment variable in your .env file. The default value is 3, so please increase it to a higher value.
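For illustration, a minimal sketch of what that change could look like (assuming the default docker deployment; 7 days is just an example value, not a recommendation):

    # .env at the root of your Airbyte deployment
    # Default is 3; raise it so long Intercom syncs are not killed after three days
    SYNC_JOB_MAX_TIMEOUT_DAYS=7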

OK, got it, thanks @alafanechere

Should I restart Airbyte after changing the variable? @alafanechere

Yes, the worker and the scheduler at least.
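Assuming a standard docker-compose deployment, recreating the containers is the simplest way to pick up the new .env value (it restarts the worker and scheduler along with everything else):

    docker-compose down
    docker-compose up -d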

What endpoint does Airbyte use to get Intercom conversations? @alafanechere

source-intercom uses this endpoint for the Conversations stream: https://api.intercom.io/conversations
You can find more details about how the connector is implemented here
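If you want to verify on your side whether Intercom itself is slow or throttling, here is a minimal Python sketch of a direct call to that endpoint (the access token is a placeholder, and the per_page value and rate-limit header name are assumptions based on Intercom's public API, not taken from the connector):

    import requests

    # Placeholder: replace with your own Intercom access token
    ACCESS_TOKEN = "your-intercom-access-token"

    response = requests.get(
        "https://api.intercom.io/conversations",
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Accept": "application/json",
        },
        params={"per_page": 60},  # assumed page size within Intercom's allowed range
        timeout=30,
    )
    response.raise_for_status()
    data = response.json()

    # Intercom returns rate-limit headers; a low remaining count suggests throttling
    print("Rate limit remaining:", response.headers.get("X-RateLimit-Remaining"))
    print("Conversations in this page:", len(data.get("conversations", [])))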


This is a new case, I think: after 15 hours of running, the Intercom contacts sync had its worker force-canceled @alafanechere
logs-1679 (1).txt (653.8 KB)

This behavior is interesting:
The job is considered successful according to the logs:

2022-05-06 06:00:05 DEBUG i.a.s.a.JobSubmitter(lambda$submitJob$2):135 - Job id 1679 succeeded
2022-05-06 06:00:05 DEBUG i.a.s.a.JobSubmitter(lambda$submitJob$4):166 - Job id 1679 cleared

The source emitted records, but these records were not committed to the destination:

recordsEmitted=915851,bytesEmitted=1907436593,stateMessagesEmitted=0,recordsCommitted=0

I’m under the impression the source connector read all the data on your streams (~915,000 records). Do you think that’s the case?

Maybe the error is on the destination side now. Could you please upgrade your BigQuery connector to the latest version and choose the GCS upload mode?

GCS upload mode? Then do I move the data from GCS to BQ using BQ Data Transfer? @alafanechere
Should I manually copy the files in GCS to a BQ table?

After the first run succeeded, I scheduled the sync every 2 hours, but this happened. Why does Setting docker job mdc appear so many times? @alafanechere
logs-2058.txt (29.2 KB)

GCS upload mode? Then do I move the data from GCS to BQ using BQ Data Transfer?

No, I meant using the GCS Staging loading method in the BigQuery connector. I think this is what you did, according to your latest logs.

Why does Setting docker job mdc appear so many times?

These are only info logs; it’s not an error.

I suspect that the contacts stream in incremental mode has a problem. Could you please try to load it in full refresh and check if subsequent jobs work?

I already ran it with incremental, but why does it still load the whole data from start_date?

I’m asking if you could try to load with full refresh, not incremental; it will indeed load the full data. I’m interested in knowing if the root cause of your problem comes from the incremental mode on the contacts stream. If the full refresh works as expected, it will confirm my intuition.

I already tried full refresh, but it still gets the data from the start_date and it took a lot of time.

Do you mind trying the following please:

  1. Set a recent start date so that the amount of data does not take a lot of time to replicate.
  2. Run an all-full-refresh sync: admins, contacts and tags streams.
  3. Run a sync with admins (full refresh), contacts (incremental) and tags (full refresh).

If step 3 fails, it means we have narrowed the error down to the incremental state management on the contacts stream.