Deleting data from Hive managed table (Partitioned and Bucketed)


We have a Hive managed table that is both partitioned and bucketed, with 'transactional' = 'true'. We are using Spark (version 2.4) to interact with this Hive table.

We are able to ingest data into this table successfully using the following:

sparkSession.sql("insert into table values('')")

But we are not able to delete a row from this table. We are attempting the delete with the command below:

sparkSession.sql("delete from table where col1 = '' and col2 = ''")

We are getting an operationNotAccepted exception.

Do we need to do anything specific to be able to perform this action?

Thanks

Anuj


There is 1 best solution below

thebluephantom

Unless it is a Delta table, this is not possible from Spark.
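For illustration, here is a minimal PySpark sketch of what a delete could look like if the data were stored as a Delta table instead. It assumes the Delta Lake package (delta-core) is on the classpath and uses its DeltaTable.forPath and delete calls; the function name delete_from_delta, the loader parameter, and the path in the usage note are hypothetical and not part of the original question.

```python
def delete_from_delta(spark, path, condition, loader=None):
    """Delete rows matching `condition` from the Delta table stored at `path`.

    `loader` is a hypothetical hook (used here for testing); by default the
    table handle comes from Delta Lake's DeltaTable.forPath.
    """
    if loader is None:
        # Imported lazily so this module loads even without delta-core installed.
        from delta.tables import DeltaTable
        loader = DeltaTable.forPath

    table = loader(spark, path)
    # DeltaTable.delete accepts a SQL predicate string (or a Column).
    table.delete(condition)
    return table
```

A call might look like delete_from_delta(sparkSession, "/data/delta/my_table", "col1 = 'a' and col2 = 'b'"), where the path and the values are placeholders. This works on a Delta table only, not on the existing Hive transactional table, so the data would have to be rewritten in Delta format first.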

Spark 2.4 does not support delete on Hive bucketed ORC (transactional) tables natively. See https://github.com/qubole/spark-acid for a connector that adds Hive ACID support to Spark.

Apache Hudi on AWS could also be an option.
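As a rough sketch of the Hudi route, a delete is issued by writing the rows to be removed back to the table with the write operation set to "delete". The option keys below are Hudi datasource write options; the helper names hudi_delete_options and delete_rows_hudi, and all argument values, are hypothetical. It assumes the table was created as a Hudi table and that the Hudi Spark bundle is available.

```python
def hudi_delete_options(table_name, record_key, partition_field, precombine_field):
    """Build the write options for a Hudi delete (helper name is illustrative)."""
    return {
        "hoodie.table.name": table_name,
        # "delete" removes the records whose keys appear in the written DataFrame.
        "hoodie.datasource.write.operation": "delete",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
    }


def delete_rows_hudi(rows_to_delete, base_path, options):
    """Write `rows_to_delete` (a Spark DataFrame) to the Hudi table as deletes."""
    # Older Hudi releases use the format name "org.apache.hudi" instead of "hudi".
    (rows_to_delete.write
        .format("hudi")
        .options(**options)
        .mode("append")  # deletes are applied on top of the existing table
        .save(base_path))
```

Here rows_to_delete would typically come from reading the Hudi table and filtering on col1 and col2, so that it contains exactly the records to remove.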