Summary
Databricks destination syncs fail due to a concurrency problem when multiple connections attempt to write data simultaneously. The issue appears to be a race condition in creating a Delta table. Looking for others who may have encountered this issue.
Question
Databricks Lakehouse v3.2.1
Hello! We’ve been experiencing a recurrent error in our Databricks destination syncs. The error is:
```
Caused by: com.databricks.sql.transaction.tahoe.DeltaAnalysisException: [DELTA_MISSING_DELTA_TABLE_COPY_INTO] Table doesn't exist. Create an empty Delta table first using CREATE TABLE `<catalog>`.`<schema>`.`airbyte_check_test_table`.
...
```
The issue appears to stem from a concurrency problem in the Databricks destination connector when multiple connections attempt to write data simultaneously. The current implementation checks for the existence of an `airbyte_check_test_table` Delta table and attempts to create it if it doesn’t exist.
However, this check-and-create sequence is not synchronized, leading to race conditions in which table creation fails because another process is creating, or has just created, the table. The issue may be connected to [this part](https://github.com/airbytehq/airbyte/blob/89e890d1f26db3a1888bb29278f36128e07a2d9e/airbyte-integrations/connectors/destination-databricks/src/main/kotlin/io/airbyte/integrations/destination/databricks/DatabricksDestination.kt#L101) of the destination source code.
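To illustrate the race described above, here is a minimal Python sketch (not the connector's actual Kotlin code; `FakeCatalog`, `TableExistsError`, and `ensure_table` are hypothetical names) showing how an unsynchronized check-then-create can be made safe by treating a concurrent "table already exists" failure as success:

```python
import threading


class TableExistsError(Exception):
    """Raised when CREATE TABLE targets a table that already exists."""


class FakeCatalog:
    """Hypothetical stand-in for a Delta catalog, used only to model the race."""

    def __init__(self):
        self._tables = set()
        self._lock = threading.Lock()

    def exists(self, name):
        return name in self._tables

    def create(self, name):
        # Like CREATE TABLE (without IF NOT EXISTS), this fails if the
        # table already exists -- the failure mode the connector hits.
        with self._lock:
            if name in self._tables:
                raise TableExistsError(name)
            self._tables.add(name)


def ensure_table(catalog, name):
    """Idempotent check-then-create.

    Another connection may create the table between our existence check
    and our CREATE TABLE call; catching the "already exists" error and
    treating it as success makes the operation safe under concurrency.
    """
    if catalog.exists(name):
        return
    try:
        catalog.create(name)
    except TableExistsError:
        # A concurrent writer won the race; the table exists either way.
        pass


catalog = FakeCatalog()
threads = [
    threading.Thread(target=ensure_table, args=(catalog, "airbyte_check_test_table"))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert catalog.exists("airbyte_check_test_table")
```

In real Databricks SQL the same effect can be had with `CREATE TABLE IF NOT EXISTS`, which makes the creation itself idempotent rather than relying on a separate existence check.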
Has anyone else encountered this issue?
---
This topic has been created from a Slack thread to give it more visibility.
It will be in read-only mode here. [Click here](https://airbytehq.slack.com/archives/C021JANJ6TY/p1725392409221289) if you want
to access the original thread.
[Join the conversation on Slack](https://slack.airbyte.com)
<sub>
["databricks-lakehouse", "destination-connector", "concurrency-problem", "delta-table", "race-conditions"]
</sub>