Summary
User reports a TABLE_OR_VIEW_ALREADY_EXISTS error during incremental sync from Postgres to Databricks using version 3.3.0 of the Databricks destination. The issue occurs after the first successful sync, preventing further data synchronization.
Question
Hi, we’ve got an issue with the databricks destination v3.3.0. We set up an incremental sync for a table from a postgres source into databricks with unity catalog. It succeeds for the first sync, but on subsequent (incremental) syncs, we get a TABLE_OR_VIEW_ALREADY_EXISTS error. We’ve seen a similar thread https://airbytehq.slack.com/archives/C07QGDKNQ9F/p1723675851807649|here that was apparently fixed, so maybe it’s not the exact same issue. Will post more of the error in the thread. Any help would be appreciated, as it’s currently stopping us from being able to sync data! Thanks in advance
This topic has been created from a Slack thread to give it more visibility.
It will be on Read-Only mode here. Click here if you want
to access the original thread.
Join the conversation on Slack
['databricks', 'incremental-sync', 'postgres', 'error', 'TABLE_OR_VIEW_ALREADY_EXISTS']
java.util.concurrent.CompletionException: java.sql.SQLException: [Databricks][JDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: 42P07, Query: CREATE TA***, Error message from Server: org.apache.hive.service.cli.HiveSQLException: Error running query: [TABLE_OR_VIEW_ALREADY_EXISTS] org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: [TABLE_OR_VIEW_ALREADY_EXISTS] Cannot create table or view `airbyte_dev`.`events` because it already exists. Choose a different name, drop the existing object, add the IF NOT EXISTS clause to tolerate pre-existing objects, add the OR REPLACE clause to replace the existing materialized view, or add the OR REFRESH clause to refresh the existing streaming table. SQLSTATE: 42P07 at org.apache.spark.sql.hive.thriftserver.HiveThriftServerErrors$.runningQueryError(HiveThriftServerErrors.scala:49) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.$anonfun$execute$1(SparkExecuteStatementOperation.scala:805) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:51) at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:104) at <http://org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org|org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org>$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:641) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$5(SparkExecuteStatementOperation.scala:486) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.execution.SQLExecution$.withRootExecution(SQLExecution.scala:711) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:486) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:48) at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:276) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:272) at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:46) at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:43) at com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:27) at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:95) at com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:76) at com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:27) at com.databricks.spark.util.PublicDBLogging.withAttributionTags0(DatabricksSparkUsageLogger.scala:74) at com.databricks.spark.util.DatabricksSparkUsageLogger.withAttributionTags(DatabricksSparkUsageLogger.scala:175) at com.databricks.spark.util.UsageLogging.$anonfun$withAttributionTags$1(UsageLogger.scala:617) at com.databricks.spark.util.UsageLogging$.withAttributionTags(UsageLogger.scala:729) at com.databricks.spark.util.UsageLogging$.withAttributionTags(UsageLogger.scala:738) at com.databricks.spark.util.UsageLogging.withAttributionTags(UsageLogger.scala:617) at com.databricks.spark.util.UsageLogging.withAttributionTags$(UsageLogger.scala:615) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withAttributionTags(SparkExecuteStatementOperation.scala:71) at org.apache.spark.sql.hive.thriftserver.ThriftLocalProperties.$anonfun$withLocalProperties$12(ThriftLocalProperties.scala:234) at com.databricks.spark.util.IdentityClaim$.withClaim(IdentityClaim.scala:48) at org.apache.spark.sql.hive.thriftserver.ThriftLocalProperties.withLocalProperties(ThriftLocalProperties.scala:229) at org.apache.spark.sql.hive.thriftserver.ThriftLocalProperties.withLocalProperties$(ThriftLocalProperties.scala:89) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:71) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:463) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:449) at java.base/java.security.AccessController.doPrivileged(AccessController.java:712) at java.base/javax.security.auth.Subject.doAs(Subject.java:439) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:499) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at java.base/java.lang.Thread.run(Thread.java:840) Caused by: org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: [TABLE_OR_VIEW_ALREADY_EXISTS] Cannot create table or view `airbyte_dev`.`events` because it already exists. Choose a different name, drop the existing object, add the IF NOT EXISTS clause to tolerate pre-existing objects, add the OR REPLACE clause to replace the existing materialized view, or add the OR REFRESH clause to refresh the existing streaming table.