Sum of alias columns in PostgreSQL

618 Views Asked by At

I have a query aggregating three result columns:

pending_with_documents
pending_without_documents
inprocessing

SELECT COUNT(DISTINCT (CASE
                           WHEN status_code IN ('ACK', 'INT') THEN
                               CASE
                                   WHEN ack_no IN
                                        (SELECT ack_no FROM bescom_appl_upload_doc) THEN ack_no END
    END))        AS pending_with_documents,
       COUNT(DISTINCT (CASE
                           WHEN status_code IN ('ACK', 'INT') THEN
                               CASE
                                   WHEN ack_no NOT IN
                                        (SELECT ack_no FROM bescom_appl_upload_doc) THEN ack_no END
           END)) AS pending_without_documents,
       COUNT(CASE
                 WHEN status_code NOT IN ('ACK', 'INT', 'APR', 'REJ', 'SBK', 'OBJ') THEN
                     ack_no
           END)  AS Inprocessing
FROM application_ht_install
WHERE service_code IN (36);

Now I need a fourth column total summing up these three. So I tried:

SELECT COUNT(DISTINCT (CASE
                           WHEN status_code IN ('ACK', 'INT') THEN
                               CASE
                                   WHEN ack_no IN
                                        (SELECT ack_no FROM bescom_appl_upload_doc) THEN ack_no END
    END))                             AS pending_with_documents,
       COUNT(DISTINCT (CASE
                           WHEN status_code IN ('ACK', 'INT') THEN
                               CASE
                                   WHEN ack_no NOT IN
                                        (SELECT ack_no FROM bescom_appl_upload_doc) THEN ack_no END
           END))                      AS pending_without_documents,
       COUNT(CASE
                 WHEN status_code NOT IN ('ACK', 'INT', 'APR', 'REJ', 'SBK', 'OBJ')
                     THEN ack_no END) AS Inprocessing
    SUM((COUNT(DISTINCT(CASE WHEN status_code IN ('ACK','INT') THEN
    CASE WHEN ack_no IN
    (SELECT ack_no FROM bescom_appl_upload_doc) THEN ack_no END
    END)) AS pending_with_documents )+ ( COUNT(DISTINCT(CASE WHEN status_code  IN ('ACK','INT') THEN
    CASE WHEN ack_no NOT IN
    (SELECT ack_no FROM bescom_appl_upload_doc) THEN ack_no END
    END)) AS pending_without_documents)+ ( COUNT(CASE WHEN status_code NOT IN('ACK','INT','APR','REJ','SBK','OBJ') THEN
    ack_no
    END) AS Inprocessing))
FROM application_ht_install
WHERE service_code IN (36);

But I'm getting an error:

ERROR:  syntax error at or near "sum"
LINE 13: sum((count(distinct(case when status_code in ('ACK','INT') t...
              ^ SQL state: 42601 Character: 473

How to add up these three derived columns?

2

There are 2 best solutions below

3
On

You have several syntax errors in your statement. You were aliasing things that didn't need aliasing. Here's the code with the syntax errors corrected. Note: One column had no alias so I called it "do_not_know_name" because I cannot tell what you meant.

select 
count(distinct(case when status_code in ('ACK','INT') then 
case when ack_no in
(select ack_no from bescom_appl_upload_doc) then ack_no end 
end)) as pending_with_documents ,
count(distinct(case when status_code  in ('ACK','INT') then 
case when ack_no not in
(select ack_no from bescom_appl_upload_doc) then ack_no end 
end)) as pending_without_documents ,
count(case when status_code not in('ACK','INT','APR','REJ','SBK','OBJ') 
then ack_no end) as Inprocessing,
sum((count(distinct(case when status_code in ('ACK','INT') then 
case when ack_no in
(select ack_no from bescom_appl_upload_doc) then ack_no end 
end))  )+ ( count(distinct(case when status_code  in ('ACK','INT') then 
case when ack_no not in
(select ack_no from bescom_appl_upload_doc) then ack_no end 
end)) )+ ( count(case when status_code not in('ACK','INT','APR','REJ','SBK','OBJ') then 
ack_no
end) )) as do_not_know_name
from application_ht_install where service_code in (36) 
3
On

Basically, you can either repeat your lengthy expressions, or put everything in a subquery and sum in an outer SELECT. I suggest the later:

SELECT pending_with_documents
     , pending - pending_with_documents AS pending_without_documents
     , inprocessing
     , pending + inprocessing AS total
FROM  (
   SELECT count(DISTINCT a.ack_no) FILTER (WHERE status_code IN ('ACK', 'INT'))              AS pending
        , count(DISTINCT b.ack_no) FILTER (WHERE status_code IN ('ACK', 'INT'))              AS pending_with_documents
        , count(ack_no) FILTER (WHERE status_code <> ALL ('{ACK, INT, APR, REJ, SBK, OBJ}')) AS inprocessing
   FROM   application_ht_install a
   LEFT   JOIN (SELECT DISTINCT ack_no FROM bescom_appl_upload_doc) b USING (ack_no)
   WHERE  service_code = 36
   ) sub;

This might do what you are after. Depends on actual table definitions. If ack_no in either table or status_code can be NULL, you probably need to do more. See:

Also, the LEFT JOIN to a subquery may or may not be more efficient, depending on undisclosed schema details and cardinalities.

<> ALL (<array>) is equivalent to NOT IN (<list>), but shorter for longer lists and typically performs better. Both fail when NULL is involved ...

About the more efficient FILTER clause replacing the nested CASE expressions:

Note the use of a.ack_no, b.ack_no, and ack_no in the inner SELECT list. a.ack_no is really equivalent to just ack_no after left-joining with a USING clause. I only spelled it out for symmetry.

count(DISTINCT b.ack_no) works, because count() does not count NULL values - which you get for b.ack_no if not present in bescom_appl_upload_doc.