Clickhouse target connector - timeout

  • Is this your first time deploying Airbyte?: No
  • OS Version / Instance: Ubuntu
  • Memory / Disk: 4 GB / 1 TB
  • Deployment: Kubernetes
  • Airbyte Version: 0.35.63-alpha
  • Source name/version:
  • Destination name/version: Clickhouse 0.1.4
  • Step: during sync
  • Description: When loading data into Clickhouse, we get an HTTP timeout. This is common with the Clickhouse JDBC driver. To avoid it, we need to adjust the driver parameter socket_timeout. How can I adjust it?
    It seems we should be able to add more parameters in 0.1.4; it looks like that was supposed to be added in the last change.

Thanks
Roy

Can you share one sync log with the errors @roy.roz ?

logs-13.txt (356.2 KB)
Good idea (: attached

P.S.
Why do you insert into a temp table, and then copy to another?
INSERT INTO airbyte_data._airbyte_raw_Case SELECT * FROM airbyte_data._airbyte_tmp_tlt_Case

To guarantee that when the sync is successful we transfer the correct data, and that when it fails users don’t end up with corrupted data in the raw tables.

The error in the logs is a timeout on the Clickhouse side. Can you check the database configuration and see if there is any parameter to adjust?

Hi Marcos,

Got it! The timeout happens during that copy from the temp table to raw table.
The Clickhouse JDBC driver in the current version uses HTTP, and that times out after a minute by default. That’s a known issue in Clickhouse. The fix is to adjust the timeout parameter on the JDBC driver side, as I requested in this ticket.
I can see you’ve added support for additional parameters to your JDBC-type sources, but it looks like you didn’t add it to the spec, so we can’t use it yet.

Just to be clear, that’s how it should work:
jdbc:clickhouse://url:port/default?socket_timeout=300000

You can read about it in this issue if you need more references:
https://github.com/ClickHouse/clickhouse-jdbc/issues/159
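For illustration only, here is a minimal Java sketch of how extra driver parameters such as socket_timeout could be appended to a JDBC URL as a query string. The class and method names (ClickHouseJdbcUrl, withParams) are hypothetical and not part of Airbyte or the Clickhouse driver, and localhost:8123 is a placeholder host and port. It only builds the string; it does not open a connection.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: appends key=value pairs to a JDBC URL as a query string.
final class ClickHouseJdbcUrl {

    public static String withParams(String baseUrl, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl);
        // Start the query string with '?' unless the URL already has one.
        char separator = baseUrl.contains("?") ? '&' : '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8))
               .append('=')
               .append(URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        // 300000 ms = 5 minutes, the value suggested in this thread.
        params.put("socket_timeout", "300000");
        // Placeholder host and port; replace with your own server.
        System.out.println(withParams("jdbc:clickhouse://localhost:8123/default", params));
    }
}
```

With the placeholder values above, this prints jdbc:clickhouse://localhost:8123/default?socket_timeout=300000, which has the same shape as the URL shown earlier.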

Work is underway to enable custom JDBC parameters in all sources and destinations. See the pull request “Refactor to enable support for optional JDBC parameters for all JDBC destinations” by girarda (airbytehq/airbyte, Pull Request #10421). You can follow the issue “Destination Clickhouse: custom JDBC parameters” (airbytehq/airbyte, Issue #10717), which will implement this for the Clickhouse destination.