How does bucketing work for Hive ACID tables?

91 Views Asked by Vinit89 At 20 February 2023 at 12:44

In Hive, I understand how bucketing works for external tables and non-ACID managed tables. Based on the column specified in the CLUSTERED BY clause of the DDL statement, a bucket is identified for each row, and the row is written to the corresponding bucket file in the table's directory on HDFS.
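As a rough illustration of that classic bucketing rule, here is a minimal Python sketch. It assumes the bucket is the column's hash with the sign bit masked off, taken modulo the number of buckets, and that an integer column hashes to its own value. Newer Hive bucketing versions use a different (Murmur3-based) hash, so real bucket numbers may differ. The column name `emp_id` and the function `bucket_for` are hypothetical.

```python
def bucket_for(hash_code: int, num_buckets: int) -> int:
    """Map a 32-bit hash code to a bucket number.

    Mirrors the Java idiom (hash & Integer.MAX_VALUE) % numBuckets,
    which keeps the result non-negative even for negative hashes.
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    return (hash_code & 0x7FFFFFFF) % num_buckets


# Hypothetical emp_id values, with the assumption that an integer
# column's hash is simply the integer itself.
emp_ids = [101, 102, 103]
assignment = {emp_id: bucket_for(emp_id, 3) for emp_id in emp_ids}
# 101 -> bucket 2, 102 -> bucket 0, 103 -> bucket 1
```

With `CLUSTERED BY (emp_id) INTO 3 BUCKETS`, each row would land in the file for its computed bucket under this simplified model.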

For Hive ACID tables, I checked the table's directory structure and noticed that data is written to specific bucket files inside the delta directory, even though no bucketing was configured in the DDL statement that created the table. Here is an example:

hdfs dfs -ls /warehouse/tablespace/managed/hive/part.db/employee/delta_0000001_0000001_0000
Found 3 items
/warehouse/tablespace/managed/hive/part.db/employee/delta_0000001_0000001_0000/bucket_00000_0
/warehouse/tablespace/managed/hive/part.db/employee/delta_0000001_0000001_0000/bucket_00001_0
/warehouse/tablespace/managed/hive/part.db/employee/delta_0000001_0000001_0000/bucket_00002_0

Can someone please help me understand the above directory structure of Hive ACID tables, given that there are 3 bucket files inside the delta directory for the employee table?


There is 1 best solution below

Raid On 24 February 2023 at 03:16

If you are curious about the code, follow the link below.

https://github.com/apache/hive/blob/36f5d91acb0fac00a5d46049bd45b744fe9aaab6/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java#L490

Basically, this is done to support the delete operation in Hive.
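To read the paths from the question, it may help to split the names into their parts. The sketch below assumes the usual ACID naming pattern, where a delta directory is named `delta_<minWriteId>_<maxWriteId>_<statementId>` and each file inside it is named `bucket_<bucketId>` with an optional numeric suffix. The helper `parse_acid_path` is hypothetical and not part of Hive, and the meaning of the trailing `_0` on the file name is not asserted here.

```python
import re

# Assumed naming patterns for ACID delta directories and bucket files.
DELTA_RE = re.compile(r"^(delta|delete_delta)_(\d+)_(\d+)(?:_(\d+))?$")
BUCKET_RE = re.compile(r"^bucket_(\d+)(?:_(\d+))?$")


def parse_acid_path(path: str) -> dict:
    """Split '<...>/<delta dir>/<bucket file>' into its numeric parts."""
    parts = path.rstrip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"not a delta/bucket path: {path}")
    delta = DELTA_RE.match(parts[-2])
    bucket = BUCKET_RE.match(parts[-1])
    if delta is None or bucket is None:
        raise ValueError(f"not a delta/bucket path: {path}")
    return {
        "kind": delta.group(1),
        "min_write_id": int(delta.group(2)),
        "max_write_id": int(delta.group(3)),
        "statement_id": int(delta.group(4)) if delta.group(4) else None,
        "bucket_id": int(bucket.group(1)),
        # Suffix is captured as-is; its semantics are left open.
        "file_suffix": bucket.group(2),
    }


info = parse_acid_path(
    "/warehouse/tablespace/managed/hive/part.db/employee/"
    "delta_0000001_0000001_0000/bucket_00001_0"
)
```

For the path above, `info` reports write ID 1 for both ends of the range, statement ID 0, and bucket ID 1, so the three files in the listing are buckets 0, 1 and 2 of a single write.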
