I want to delete all Kudu RANGE partitions from the kudu table which has partition value less than a given date string. I am using following query but it's not working. Can someone please suggest what is the workaround.
alter table test_table drop if exists range partition values < '2010-01-31';
My Impala version is 2.6x and it appears to not work with '<' comparison. I can't use '=' because as it's going to be done dynamically and i need one single query to wipe off all empty kudu partitions before the passed date string.
Edit 1: Work-around I implemented. So, I had to write the script to fetch unique partition values and dynamically create the SQL which would give the same output after execution.
drop_partition_query=''
read -d '' drop_query <<EOF
select distinct partition_date from test_table where partition_date < '2021-01-01' group by partition_date
EOF
partitions_to_drop=$(${IMPALA_SHELL} -B -q "${drop_query}" --quiet "--output_delimiter=\n" 2>/dev/null)
partitions_array=( $partitions_to_drop )
for (( p=0; p< ${#partitions_array[@]}; p++ ))
do
drop_partition_query+="ALTER TABLE test_table DROP RANGE PARTITION VALUE='${partitions_array[$p]}';"
done
# drop_partition_query is the sql that should be executed to drop the partitions.
I think that on your version you cannot use such syntax, looks like this feature was added in Impala 2.8
Docu
Here is ticket in which it was added if you want to take a look: Jira issue
Not sure how to handle it, maybe you can write some script/spark shell code which will list all partitions and choose only does one which you want and concat them into one query which your Impala can handle