This service is the internal database used by Airbyte to manage all connectors, states, jobs and connections.
You can split the database into two:
-
Configs: store data about connectors (sources and destination) and the information about the instance (workspace id, migrations
-
Jobs: anything related to the sync jobs
In this module Airbyte uses two tools: jOOQ and Flyway (which we talked previously)
jOOQ, is a library that helps developers interact with databases in a more convenient and type-safe way using the Java programming language. It provides a way to write database queries using Java code instead of writing raw SQL statements. In simple terms, jOOQ acts as a bridge between your Java code and the database. It allows you to define your database schema in Java classes, representing tables, columns, and relationships. This helps in maintaining a clear and structured representation of the database structure in your code.
Databases and Tables
There are a lot of tables inside the airbyte-db
weβre going to talk about the most important ones. To extract this result I have a local setup and access it using Docker with following command:
docker exec -ti airbyte-db psql -U docker airbyte
\d
will output all tables
\d actor
give all the columns to the specified table
Schema | Name | Type | Database |
---|---|---|---|
public | active_declarative_manifest | table | configs |
public | actor | table | configs |
public | actor_catalog | table | configs |
public | actor_catalog_fetch_event | table | configs |
public | actor_definition | table | configs |
public | actor_definition_config_injection | table | configs |
public | actor_definition_workspace_grant | table | configs |
public | actor_oauth_parameter | table | configs |
public | airbyte_configs | table | configs |
public | airbyte_configs_migrations | table | configs |
public | airbyte_jobs_migrations | table | |
public | airbyte_metadata | table | jobs |
public | attempts | table | jobs |
public | connection | table | configs |
public | connection_operation | table | configs |
public | connector_builder_project | table | configs |
public | declarative_manifest | table | configs |
public | jobs | table | jobs |
public | normalization_summaries | table | jobs |
public | notification_configuration | table | configs |
public | operation | table | configs |
public | secrets | table | |
public | state | table | configs |
public | stream_reset | table | configs |
public | stream_stats | table | jobs |
public | sync_stats | table | jobs |
public | workspace | table | configs |
public | workspace_service_account | table | configs |
Airbyte Database Diagram
This diagram was generated using Airbyte 0.44.5 version with DBeaver tool.
actor: stores all info about connectors created during the operation
actor_catalog: store the schema for source connectors
actor_definition: store the Airbyte connector catalog and their definitions
connection: all data about the connection created
jobs: store data about all jobs, if they succeeded or failed
attempts: each job will trigger some attempts
state: stores all states row individualy, which makes easier to read
operation: Airbyte calls Normalization and other tasks as operation.
secrets: OSS basic structure to store secrets
Guides
How to import/export Airbyte to a new instance (docker to docker deploy)
How to restore a deleted connection
Additional Resources:
-
Airbyte documentation about the database
-
(some thoughts) Is there a better way to keep the documentation updated reflect the current tables and columns?