Hive schema deletion is very slow. Can Hive execute DROP TABLE statements in parallel on a database?


I have a Hive database with more than 2000 tables in it. When I try to drop the entire database using "DROP DATABASE IF EXISTS MyDb CASCADE", it takes more than 6.5 hours, depending on data size, since the partition metadata grows with it.

I can't connect directly to the Hive metastore (PostgreSQL in our case) to drop the database, as we are restricted from doing that.

So I started looking at the option of dropping the tables in parallel using threads. I can see my threads spawning, but Hive is deleting the tables one by one, and it takes exactly the same amount of time as before.
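A minimal sketch of the threaded approach described above, assuming Python. The function name `drop_tables_in_parallel` and the `execute` callable are hypothetical: `execute` stands in for whatever runs a single HiveQL statement in your environment (ideally over its own connection per call). The sketch only issues the statements concurrently from the client side; it does not by itself make Hive or the metastore process them in parallel.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def drop_tables_in_parallel(execute, database, tables, max_workers=8):
    """Submit one DROP TABLE statement per table from a thread pool.

    execute  -- hypothetical callable that runs one HiveQL statement
                (an assumption; replace with your own client call)
    database -- database name, e.g. "MyDb"
    tables   -- iterable of table names in that database
    Returns (dropped, failed): a list of table names whose statement
    completed, and a dict mapping table name to the raised exception.
    """
    def drop(table):
        # Build and run a single DROP TABLE statement for this table.
        execute("DROP TABLE IF EXISTS {}.{}".format(database, table))
        return table

    dropped, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(drop, t): t for t in tables}
        for future in as_completed(futures):
            table = futures[future]
            try:
                future.result()
                dropped.append(table)
            except Exception as exc:  # keep going if one drop fails
                failed[table] = exc
    return dropped, failed
```

Called as `drop_tables_in_parallel(run_statement, "MyDb", table_names)`, where `run_statement` is your own function, it returns which drops completed and which raised, so a failed table does not stop the rest.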

But when I use multithreaded code to create tables, it completes in much less time than the non-threaded code.

I have set hive.support.concurrency to true.

Is it by design that Hive processes DROP TABLE statements on a database one by one?

It is a Hortonworks cluster. Hive version: Apache Hive 1.2.1. Spark version: 2.3.2.

It is equivalent to the Hortonworks Sandbox 2.6.5 environment.
