Shopify to MySQL Slow

Seems very slow and get’s stuck on one table for hours.
looks like it’s processing a record per second.

Any way to speed this up?

  • Is this your first time deploying Airbyte: Yes
  • OS Version / Instance: Centos
  • Memory / Disk: 16Gb / 1Tb SSD
  • Deployment: Docker
  • Airbyte Version: 0.35.60-alpha
  • Source name: shopify
  • Destination name/version: Mysql
  • Step: On sync
  • Description: I’m tryin to replicate shopify to mysql
    logs-4.txt (115.1 KB)

Hi @Sig, after digging into the Shopify connector source code here’s my guess why this load is slow:

  • The inventory_item stream is a SubStream, which means it also queries the product stream to build inventory items.
  • For each inventory item, a call is made on the product endpoint which can introduce slowness and rate-limiting

I opened an issue here to suggest to use the CDK caching logic to improve the reading speed.

Thanks Augustin.
Over the night the Inventory did update I’m I’m seeing the completed table in mysql.

The Orders table is also slow - is this a substream too?
2 records per second (see log file)

Perhaps I’m doing something incorrectly with my setup. Do you know what I should check to speed things up?

2022-03-29 10_05_49-Airbyte _ Connections _ shopify airbyte2 → shopify

shopifylog-March-29-2022.txt (5.7 MB)

Looks like the first attempt failed. Not exactly sure why.
image

Looks like it’s repeating the same import?

I’ve attached the last part of the log file.
Perhaps you can shed some light on this situation.
Is there anything I should be doing?

Log March 29 evening 2022.txt (1.4 MB)

@Sig I think the problem is with normalisation can you just select refund transactions stream and run the sync?

Hey @Sig, unfortunately I don’t think we can find a way to speed up the load of the inventory items until Shopify improves this endpoint.
According to the main contributor reply on my issue:

  • Some caching is already in action to optimize the load of sub-streams
  • Shopify API does not offer an endpoint to retrieve a list of inventory items for a product with a single call, this is why the connector needs to make one API call per inventory items.
  • To respect Shopify rate-limiting we do not exceed 2 call per second. Retrieving an inventory items requires two call: one to the product endpoint to retrieve the product id, another one to the inventory item endpoint to get the item metadata. This why you observe that it takes 1 second to read one inventory items.

@alafanechere ok fair enough. Sounds like Shopify restrictions.

@harshith Ok doing it

This is what I did.

Running one stream “order_refunds”
Is that what you meant? I could not find a “refund transactions” stream.

Yes normalization is failing.
Any advise to make this work?

image

Attempt 1 and 2 log files

attempt 1.txt (503.7 KB)
attempt 2.txt (506.9 KB)

attempt 3 log file.
attempt 3.txt (506.9 KB)

Hey @sig,
It would have been better to open a new topic as this normalization issue is different from the original Shopify slowness topic.
Are you using MariaDB as your destination?
This is the error in normalization:

 MySQL error: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'json) as receipt

There is probably a SQL syntax difference between MySQL and MariaDB.
DBT normalization generates MySQL compatible SQL and it appears that your MariaDB server does not understand this syntax.