How does Airbyte handle loading massive volumes of data?

How does Airbyte handle extracting data under high load with massive data volumes?
Does it auto-scale up/down?
How to control it?

I read from the docs:

Airbyte allows scaling sync workloads horizontally using Kubernetes.

Is that related?

Hey, we do have a k8s solution in beta which can scale both the workers and sync pods based on the load. Is this the same one you are looking for?

Thanks @harshith ,
to understand whether that is what I’m looking for, can you please elaborate on:

  • Do those workers include the components that do the data extraction?
  • Does it auto-scale both up and down?
  • Can the auto-scaling be triggered by data volume?

@datayoshi you can go through this doc for a better understanding of the worker.

Otherwise

  1. Sync pods are created to fetch data and are deleted once the data sync is done
  2. What do you mean by data volume?

We have the basic k8s deployment module in our repo, but if you are looking to scale up/down based on some metric, you can also explore GKE custom metrics, which can be added to the k8s charts.
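One way to wire up the metrics-based scaling mentioned above is a Kubernetes HorizontalPodAutoscaler targeting the worker Deployment. A minimal sketch, assuming the deployment is named `airbyte-worker` and that a hypothetical custom metric `pending_syncs` is exposed through a custom-metrics adapter (e.g. the Stackdriver adapter on GKE) — the metric name and thresholds here are illustrative, not part of Airbyte:

```yaml
# Hypothetical HPA for the Airbyte worker deployment.
# Assumes a custom metric "pending_syncs" is exported via a
# custom-metrics adapter (e.g. GKE / Stackdriver adapter).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: airbyte-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: airbyte-worker   # assumed deployment name
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Pods
      pods:
        metric:
          name: pending_syncs        # illustrative custom metric
        target:
          type: AverageValue
          averageValue: "2"          # scale out above ~2 pending syncs per pod
```

This scales both up and down between `minReplicas` and `maxReplicas`, which is the behavior asked about earlier in the thread.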

Thanks for the reference, I’ve reviewed it.

What is the trigger that scales an Airbyte worker?

@datayoshi we don’t have it configured out of the box, but you can refer to this doc to choose a method: Scaling Airbyte | Airbyte Documentation
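Absent a preconfigured trigger, the simplest control from the linked doc is a static worker replica count plus a per-worker concurrency cap. A minimal sketch as a Helm values fragment, assuming the standard Airbyte Helm chart exposes `worker.replicaCount` and `worker.extraEnv` (key names assumed; verify against your chart version):

```yaml
# values.yaml fragment (key names assumed from the Airbyte Helm chart)
worker:
  replicaCount: 3          # run three worker pods
  extraEnv:
    - name: MAX_SYNC_WORKERS
      value: "5"           # cap concurrent sync jobs per worker
```

With this in place, total sync concurrency is bounded by replicas × `MAX_SYNC_WORKERS`, and you can layer an HPA on top if you need load-driven scaling.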