How does Phoenix table search work with a composite row key if I search with only one key?

I have an Apache Phoenix table with a composite row key (key1, key2), where key1 is a sequence number (unique) and key2 is a date with timestamp (not unique).

When I search with key1 alone, results come back very quickly, even with 10 million records.

But when I search using only key2, the query slows down.

My question is: how does a composite row key work in Phoenix? And what is the correct way to scan or filter on the individual keys that make up the composite row key?

I don't know key1 because it is a sequence, so if I have to filter using only key2, which is a timestamp, what is the best way to do it?

There is 1 solution below

Just in case other people come across this:

Phoenix scans HBase, where row keys are sorted in lexicographic order, as in this simplified example:

100_2021:04:01
200_2021:03:01
300_2021:02:01

Here each key starts with a 3-digit sequence number (100, 200, 300) followed by a simplified date.

As you can see, the leading portion (the sequence number) is ascending, even though the dates happen to be descending in this example. The order of the key parts matters. If you want to find all entries from '2021:02:01', Phoenix still has to scan the entire table, because the sequence number in front of the date could be anything. So you don't want a query that is basically a "*_date" lookup. Instead, always supply the leading part of the key and, if anything, leave the trailing part open.
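To make the difference concrete, here is a toy Python model of it. This is not Phoenix or HBase code, just a sorted list standing in for the table, and the function names `prefix_scan` and `suffix_scan` are made up for this sketch. It counts how many keys each kind of lookup has to touch.

```python
from bisect import bisect_left

# Toy stand-in for an HBase table: row keys kept in lexicographic order.
keys = sorted([
    "100_2021:04:01",
    "200_2021:03:01",
    "300_2021:02:01",
])

def prefix_scan(sorted_keys, prefix):
    """Leading part known: seek to the first candidate, stop at the first mismatch."""
    start = bisect_left(sorted_keys, prefix)
    hits, visited = [], 0
    for key in sorted_keys[start:]:
        visited += 1
        if not key.startswith(prefix):
            break  # sorted order guarantees no later key can match
        hits.append(key)
    return hits, visited

def suffix_scan(sorted_keys, suffix):
    """Only the trailing part known: sort order does not help, so check every key."""
    hits = [key for key in sorted_keys if key.endswith(suffix)]
    return hits, len(sorted_keys)

print(prefix_scan(keys, "200_"))        # lookup by sequence number
print(suffix_scan(keys, "2021:02:01"))  # lookup by date only
```

With three rows the gap is tiny, but the prefix lookup only touches the matching keys plus one, while the date-only lookup always touches every key. At 10 million rows that is roughly the difference the question describes.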

Depending on your case, you probably want to put the date first and the sequence number at the end. Then you can look up all items for a specific date. To avoid hot partitions, though, you might want a salt or something similar at the start of your key.
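A rough sketch of that key layout, again as a toy Python model rather than real Phoenix code. The bucket count, the CRC-based salt, and the names `make_key` and `scan_date` are all assumptions picked for illustration. Phoenix has its own salting via the `SALT_BUCKETS` table option, which does not necessarily work like this.

```python
from bisect import bisect_left
import zlib

NUM_BUCKETS = 4  # hypothetical bucket count, chosen only for this sketch

def make_key(date, seq):
    """Build a salt_date_sequence key; the salt spreads writes across buckets."""
    salt = zlib.crc32(str(seq).encode()) % NUM_BUCKETS
    return f"{salt}_{date}_{seq:03d}"

rows = [
    ("2021:04:01", 100),
    ("2021:03:01", 200),
    ("2021:02:01", 300),
    ("2021:02:01", 400),
]
keys = sorted(make_key(date, seq) for date, seq in rows)

def scan_date(sorted_keys, date):
    """One short prefix scan per salt bucket instead of one scan over everything."""
    hits = []
    for bucket in range(NUM_BUCKETS):
        prefix = f"{bucket}_{date}_"
        i = bisect_left(sorted_keys, prefix)
        while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
            hits.append(sorted_keys[i])
            i += 1
    return hits

print(scan_date(keys, "2021:02:01"))
```

The trade-off is that a date lookup now costs one small range scan per bucket, which is still far cheaper than walking the whole table, and writes for the same date no longer all land in one place.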