dense_rank() and max, which one is better to find Nth highest salary (return null if no result)

2.4k Views Asked by At

e.g N is 2, I can either use densr_rank() or max to find second highest salary from Employee table.

select max(salary) as SecondHighestSalary from employee 
where salary < (select max(salary) from employee) 

above query works perfect, given the condition that if there is no second highest salary in the table, it returns NULL.

However I can achieve the similar results using dense_rank() function such as:

select e.salary as SecondHighestSalary 
from (
      select salary, dense_rank() OVER (order by salary desc) dr from employee) e
where dr = 2  

above query works fine given there is ONLY one second highest salary in the table, if the table has salaries [300, null, null] its returning [null, null], I need only one answer in the result set. how can I achieve that ?

follow up to the question: in this case, which is better (in terms of memory/processing time) max or dense_rank ()?

2

There are 2 best solutions below

0
On BEST ANSWER

This is a really interesting question. You haven't specified the database. But if I assume that you have an index on salary that can be used by the query (so possibly salary desc depending on the database). Then your first query might have really good performance:

  • Fetch the match salary using the index.
  • Start scanning the index for values less than the max.
  • Short circuit the scan because you get the max right away.

I'm not promising that all databases would generate this plan, but two index lookups would typically be faster than dense_rank() in this case.

Of course, with any question like this, you should test on your data and your database. That is really the correct answer.

0
On

Your query with max would only work with N = 2, so it is bad solution, but it should be better in terms of performance than second query with dense_rank.

I would suggest another approach: fetch only N first rows ordering by salary descending. Then, from this result get the lowest salary.

Fetching N rows should be lightning fast if you have index on salary column. Then having "only" N rows, it should be also fast to fetch the last record.

So query should look like (for SQL Server):

SELECT TOP 1 * FROM (
    SELECT TOP N * FROM SalaryTable ORDER BY salary DESC
) ORDER BY salary;

For those RDBMS which have LIMIT N instead of SELECT TOP N:

SELECT * FROM (
    SELECT * FROM SalaryTable ORDER BY salary DESC LIMIT N
) ORDER BY salary LIMIT 1;