Replicating data from 100 schemas with a lag of 5 minutes

Summary

The user wants to replicate data from 100 schemas with a lag of 5 minutes using Airbyte for a demo. They have questions about the recommended EC2 machine size and the usage allowed by the open-source license.


Question

Hi :slightly_smiling_face:
I want to run a demo at my company that replicates data from about 100 schemas (about 3mb/s in total) with a lag of 5 minutes, so that we can potentially use Airbyte in the future.

I have some questions (please point me to another room if this is not the correct one):

  1. What size of an EC2 machine is recommended for this load?
  2. Does the open-source license allow this usage?


This topic has been created from a Slack thread to give it more visibility.

["replicate-data", "100-schemas", "lag-5-minutes", "EC2-machine", "opensource-license"]

  1. It depends on how large the tables are. Do all tables have cursors to run incremental syncs? Are you using CDC?
  2. Yes, considering the info you shared here, since you're using it to ingest data from your company.

Hi <@U01MMSDJGC9> thank you for your answer!

  1. The tables hold hundreds of millions (sometimes billions) of records. I want to update them incrementally, and I am using CDC. I am replicating from MySQL to Snowflake.
  2. Thank you. Is there a way to learn the pricing model for enterprise support before I recommend to my superiors that we contact sales?
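
For anyone sizing a similar demo, the load figures from the question can be turned into per-window and per-day volumes with a few lines of arithmetic. This is only a rough sketch: it assumes "3mb/s" means 3 megabytes per second in total, and that the load is spread evenly across the 100 schemas, neither of which the thread confirms. It does not say which EC2 instance size is needed.

```python
# Rough load arithmetic for the demo described in the question.
# Assumption: "3mb/s" is read as 3 megabytes per second across all schemas.
TOTAL_MB_PER_S = 3.0
SCHEMAS = 100
LAG_SECONDS = 5 * 60  # the 5-minute lag target


def window_volume_mb(rate_mb_per_s: float, window_s: int) -> float:
    """Data produced during one sync window, in MB."""
    return rate_mb_per_s * window_s


# Volume that accumulates during one 5-minute window.
per_window_mb = window_volume_mb(TOTAL_MB_PER_S, LAG_SECONDS)

# Assumption: load is spread evenly over the schemas (real traffic rarely is).
per_schema_window_mb = per_window_mb / SCHEMAS

# Total volume per day, in GB (1 GB = 1000 MB here).
per_day_gb = TOTAL_MB_PER_S * 86_400 / 1000

print(f"Per 5-minute window: {per_window_mb:.0f} MB")
print(f"Per schema per window: {per_schema_window_mb:.0f} MB")
print(f"Per day: {per_day_gb:.1f} GB")
```

Under these assumptions, each sync has to move roughly 900 MB in well under 5 minutes to keep up, which is a more useful number to bring to a sizing discussion than the raw rate alone.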