How to view only a portion of the table when the LIMIT value is derived from a subquery in Mysql

89 Views Asked by At

For table 'Issue'

-Issue

Call_ref is the unique key identifying every call uniquely. Each call is made by a user with users identified through a caller_id. Each user can make multiple calls but every call will have only one caller_id. I want to display calls made by top 20% of active users. I tried this query-

SELECT Caller_id, COUNT(Call_ref) FROM Issue
GROUP BY Caller_id 
ORDER BY COUNT(Call_ref) 
LIMIT round(COUNT(distinct Caller_id)/5)

However, it seems like LIMIT doesn't accept anything but a number. Is there a way I can restrict this view to only top 20% of all records in the resulting table?

5

There are 5 best solutions below

0
On BEST ANSWER

With ROW_NUMBER() AND COUNT(*) window functions:

SELECT Caller_id, Counter
FROM (
  SELECT Caller_id, COUNT(*) Counter,
         ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) rn,
         COUNT(*) OVER () total
  FROM Issue
  GROUP BY Caller_id 
) t
WHERE rn / total <= 0.2
ORDER BY rn
0
On

You could do this with window functions, if you are running MySLQ 8.0. Based on your current query, that would be:

select *
from (
    select caller_id, count(*) , percent_rank() over(order by count(*) desc) prn
    from issue
    group by caller_id 
) t
where prn < 0.2
0
On

You can use dynamic sql

SELECT CONCAT('SELECT Caller_id, COUNT(Call_ref) FROM Issue
GROUP BY Caller_id 
ORDER BY COUNT(Call_ref) 
LIMIT ',round(COUNT(distinct Caller_id)/5)) into @sql FROM Caller_id;
PREPARE stmt1 FROM @sql;
EXECUTE stmt1;
DEALLOCATE PREPARE stmt1;
0
On

Trye this code

SET @row_number = 0;
SET @rows = (select count(distinct Caller_id) from Issue)/5;
select Caller_id, _counter  from ( SELECT (@row_number:=@row_number + 1) AS num ,  Caller_id, COUNT(_counter) cnt FROM Issue
GROUP BY Caller_id  FROM  Issue ) x where num < @rows
order by _counter
0
On

You can do this with window functions, but the solution of GMB is wrong (Proof: db<>fiddle link). Unfortunately I am not allowed to comment.

Here is my solution:

CREATE TABLE ranked (
SELECT
  a.caller_id
  ,a.count_calls
  ,RANK() OVER w AS 'ranked'
  ,ROW_NUMBER() OVER w AS 'row_num'
FROM
  (
  SELECT
    caller_id
    ,COUNT(*) count_calls
  FROM
        issue
  GROUP BY 1
    ORDER BY 2 DESC
    ) a
WINDOW w AS (ORDER BY count_calls DESC)
);

CREATE TABLE max_row_num (
SELECT MAX(row_num) AS max_row_num FROM ranked
);

CREATE TABLE result (
SELECT
  c.*
FROM
  issue c
  INNER JOIN
  (
    SELECT
    a.Caller_Id
    ,a.count_calls
    FROM
    ranked a
    INNER JOIN (SELECT (max_row_num * 20 /100)per FROM max_row_num) b
    WHERE
    a.ranked <= b.per        
    ) d ON (c.Caller_id = d.Caller_id)  
)

SELECT * FROM result;

Solution: db<>fiddle link