Issue with Incremental Sync to Databricks Destination

slack-user-airbyte · December 9, 2024, 5:02pm

Summary

User reports a TABLE_OR_VIEW_ALREADY_EXISTS error during incremental sync from Postgres to Databricks using version 3.3.0 of the Databricks destination. The issue occurs after the first successful sync, preventing further data synchronization.

Question

Hi, we've got an issue with the Databricks destination v3.3.0. We set up an incremental sync for a table from a Postgres source into Databricks with Unity Catalog. The first sync succeeds, but on subsequent (incremental) syncs we get a TABLE_OR_VIEW_ALREADY_EXISTS error. We've seen a similar thread (https://airbytehq.slack.com/archives/C07QGDKNQ9F/p1723675851807649) that was apparently fixed, so maybe it's not the exact same issue. Will post more of the error in the thread. Any help would be appreciated, as it's currently stopping us from being able to sync data! Thanks in advance.

This topic has been created from a Slack thread to give it more visibility.

Tags: databricks, incremental-sync, postgres, error, TABLE_OR_VIEW_ALREADY_EXISTS

slack-user-airbyte · December 12, 2024, 3:27pm

java.util.concurrent.CompletionException: java.sql.SQLException: [Databricks][JDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: 42P07, Query: CREATE TA***, Error message from Server: org.apache.hive.service.cli.HiveSQLException: Error running query: [TABLE_OR_VIEW_ALREADY_EXISTS] org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: [TABLE_OR_VIEW_ALREADY_EXISTS] Cannot create table or view `airbyte_dev`.`events` because it already exists. Choose a different name, drop the existing object, add the IF NOT EXISTS clause to tolerate pre-existing objects, add the OR REPLACE clause to replace the existing materialized view, or add the OR REFRESH clause to refresh the existing streaming table. SQLSTATE: 42P07
    at org.apache.spark.sql.hive.thriftserver.HiveThriftServerErrors$.runningQueryError(HiveThriftServerErrors.scala:49)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.$anonfun$execute$1(SparkExecuteStatementOperation.scala:805)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:51)
    at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:104)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:641)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$5(SparkExecuteStatementOperation.scala:486)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.sql.execution.SQLExecution$.withRootExecution(SQLExecution.scala:711)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:486)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:48)
    at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:276)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:272)
    at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:46)
    at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:43)
    at com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:27)
    at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:95)
    at com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:76)
    at com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:27)
    at com.databricks.spark.util.PublicDBLogging.withAttributionTags0(DatabricksSparkUsageLogger.scala:74)
    at com.databricks.spark.util.DatabricksSparkUsageLogger.withAttributionTags(DatabricksSparkUsageLogger.scala:175)
    at com.databricks.spark.util.UsageLogging.$anonfun$withAttributionTags$1(UsageLogger.scala:617)
    at com.databricks.spark.util.UsageLogging$.withAttributionTags(UsageLogger.scala:729)
    at com.databricks.spark.util.UsageLogging$.withAttributionTags(UsageLogger.scala:738)
    at com.databricks.spark.util.UsageLogging.withAttributionTags(UsageLogger.scala:617)
    at com.databricks.spark.util.UsageLogging.withAttributionTags$(UsageLogger.scala:615)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withAttributionTags(SparkExecuteStatementOperation.scala:71)
    at org.apache.spark.sql.hive.thriftserver.ThriftLocalProperties.$anonfun$withLocalProperties$12(ThriftLocalProperties.scala:234)
    at com.databricks.spark.util.IdentityClaim$.withClaim(IdentityClaim.scala:48)
    at org.apache.spark.sql.hive.thriftserver.ThriftLocalProperties.withLocalProperties(ThriftLocalProperties.scala:229)
    at org.apache.spark.sql.hive.thriftserver.ThriftLocalProperties.withLocalProperties$(ThriftLocalProperties.scala:89)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:71)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:463)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:449)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
    at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:499)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: [TABLE_OR_VIEW_ALREADY_EXISTS] Cannot create table or view `airbyte_dev`.`events` because it already exists. Choose a different name, drop the existing object, add the IF NOT EXISTS clause to tolerate pre-existing objects, add the OR REPLACE clause to replace the existing materialized view, or add the OR REFRESH clause to refresh the existing streaming table.
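For anyone trying to picture the failure mode: the trace shows a CREATE statement (truncated in the log as CREATE TA***) hitting a table that the first sync already created. The sketch below only illustrates that general pattern and the IF NOT EXISTS remedy the error message mentions. It is not the connector's code: Python's built-in sqlite3 stands in for Databricks, and the `events` columns and both function names are made up for the example.

```python
import sqlite3


def create_events_table(conn, tolerate_existing):
    """Issue a CREATE TABLE for a hypothetical `events` table.

    With tolerate_existing=True the statement uses IF NOT EXISTS,
    so it is safe to run when the table is already there.
    """
    clause = "IF NOT EXISTS " if tolerate_existing else ""
    conn.execute(
        f"CREATE TABLE {clause}events (id INTEGER PRIMARY KEY, payload TEXT)"
    )


def run_two_syncs(tolerate_existing):
    """Simulate two sync runs that each try to create the same table."""
    conn = sqlite3.connect(":memory:")  # stand-in for the real destination
    outcomes = []
    for _ in range(2):
        try:
            create_events_table(conn, tolerate_existing)
            outcomes.append("ok")
        except sqlite3.OperationalError as exc:
            # SQLite's counterpart of TABLE_OR_VIEW_ALREADY_EXISTS
            outcomes.append(f"error: {exc}")
    conn.close()
    return outcomes


if __name__ == "__main__":
    print("plain CREATE TABLE:   ", run_two_syncs(tolerate_existing=False))
    print("CREATE ... IF NOT EXISTS:", run_two_syncs(tolerate_existing=True))
```

With a plain CREATE TABLE the first run succeeds and the second fails, which matches the behaviour described in the question. With IF NOT EXISTS both runs pass. Whether the connector should use IF NOT EXISTS, OR REPLACE, or skip the CREATE entirely on incremental runs is something only the connector source can answer.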

