Problems with multiple joins and a sum

525 Views Asked by At

If have the following three PostgreSQL tables:

Post table:

postid | title | author | created

Vote table:

postid | username | vote

where vote is equal to 1 if the user voted the post up, 0 if the user did not vote and -1 if the user voted the post down.

Comment table:

commentID | parentID | postID | content | author | created

where the parentID is null if the comment is not a reply.

I want to receive now for every post its title, author, created date, sum of all votes and the vote of the current logged in user and the number of comments. I already had problems with the vote of the user and asked here and someone helped me to get the following query:

SELECT post.postID as postID, post.title as title, post.author as author,
       post.created as created,
       COALESCE(sum(votes.vote), 0) as voteCount,
       COALESCE(sum(votes.vote) FILTER (WHERE votes.username = :username), 0) as userVote
       FROM post 
       LEFT JOIN votes ON post.postID = votes.postID 
       GROUP BY post.postID 
       ORDER BY voteCount DESC

Now I tried another LEFT JOIN to fetch the number of comments like this:

COUNT(DISTINCT comments) FILTER (WHERE comments.parentID IS NULL) as numComments
LEFT JOIN comments on post.postID = comments.postID

However, while the number of comments work, the number of votes on each post is wrong since due to the other join the rows seem to appear multiple times yielding a wrong sum and I have some trouble figuring out a way to solve this.

I already tried to fetch the number of comments as a subquery so that it is independent from the number of votes without success. Any further help would be very appreciated! :-)

2

There are 2 best solutions below

1
On BEST ANSWER

You would typically pre-aggregate in subqueries before joining, like so:

SELECT p.*
    COALESCE(v.voteCount, 0) as voteCount,
    COALESCE(v.userVote, 0) as userVote,
    COALESCE(c.numComments, 0) as numComments
FROM post p
LEFT JOIN (
    SELECT postID, 
        SUM(vote) as voteCount, 
        SUM(vote) FILTER (WHERE username = :username) userVote
    FROM votes 
    GROUP BY postID
) v ON v.postID = p.postID 
LEFT JOIN (
    SELECT postID, count(*) numComments
    FROM comments
    WHERE parentID IS NULL
    GROUP BY postID
) c ON c.postID = p.postID
ORDER BY voteCount DESC
0
On

Count the values separately. The joins are causing a Cartesian product. This is a place where correlated subqueries or lateral joins help:

SELECT p.*, v.*, c.*
FROM post p CROSS JOIN LATERAL
     (SELECT SUM(v.vote) as voteCount,
             SUM(v.vote) FILTER (WHERE v.username = :username), 0) as userVote
      FROM votes v
      WHERE p.postID = v.postID 
     ) v CROSS JOIN LATERAL
     (SELECT SUM(c.vote) as commentCount,
             SUM(c.vote) FILTER (WHERE c.username = :username), 0) as userVote
      FROM comments c
      WHERE p.postID = c.postID 
     ) c
ORDER BY voteCount DESC;