How can I make my SQL query for StackExchange shorter?

113 Views Asked by At

I was trying to track my performance here in SO using StackExchange - Query Stack Overflow and came up with this query:

At first I have this one to get the list of my answered questions:

Select Distinct a.Id, a.CreationDate, u.DisplayName, p.Title, p.Tags, 
   CONCAT('http://stackoverflow.com/questions/', p.Id, '#answer-', a.Id) as Link
From Posts a 
   Inner Join Posts p On a.ParentId = p.Id 
   Inner Join Users u On a.OwnerUserId = u.Id
   Inner Join PostTags pt on p.Id = Pt.PostId
   Inner Join Tags t on pt.TagId = t.Id
Where a.OwnerUserId in (
  9461114   --Me
  )
  And a.PostTypeId = 2  -- Answer
  And p.PostTypeId = 1  -- Question
  And t.TagName in ('jquery')

Then I did another query to get all the answers raw score (Upvote, Downvote, Accepted)

SELECT *
FROM (
  SELECT v.PostID as Id, vt.Name, COUNT(*) AS CNT
     FROM 
     Posts p
       INNER JOIN Votes v
         ON v.PostId = p.Id
       INNER JOIN VoteTypes vt
         ON v.VoteTypeId = vt.Id
     WHERE
     p.OwnerUserId in
     (
      9461114   --Me
      ) 
  GROUP BY 
  vt.Name, v.PostID) s_tab pivot (min(CNT)for [Name] in ([UpMod], [DownMod],[AcceptedByOriginator]))AS PVT

Then to make things easier, I used LEFT JOIN to merge the two queries that I have, and here's what I came up with:

Select a.Id, a.CreationDate, a.DisplayName, a.Title, a.Tags, a.Link, b.UpMod as Upvote, b.DownMod as DownVote, b.AcceptedByOriginator as Accepted
From 
(Select Distinct a.Id, a.CreationDate, u.DisplayName, p.Title, p.Tags, 
   CONCAT('http://stackoverflow.com/questions/', p.Id, '#answer-', a.Id) as Link
From Posts a 
   Inner Join Posts p On a.ParentId = p.Id 
   Inner Join Users u On a.OwnerUserId = u.Id
   Inner Join PostTags pt on p.Id = Pt.PostId
   Inner Join Tags t on pt.TagId = t.Id
Where a.OwnerUserId in (
  9461114   --Me
  )
  And a.PostTypeId = 2  -- Answer
  And p.PostTypeId = 1  -- Question
  And t.TagName in ('jquery')
  ) a 
Left Join 
  (SELECT *
FROM (
  SELECT v.PostID as Id, vt.Name, COUNT(*) AS CNT
     FROM 
     Posts p
       INNER JOIN Votes v
         ON v.PostId = p.Id
       INNER JOIN VoteTypes vt
         ON v.VoteTypeId = vt.Id
     WHERE
     p.OwnerUserId in
     (
      9461114   --Me
      ) 
  GROUP BY 
  vt.Name, v.PostID) s_tab pivot (min(CNT)for [Name] in ([UpMod], [DownMod],[AcceptedByOriginator]))AS PVT
  ) b on a.Id = b.Id

This is too long and I feel like there are some other ways to do this without making the query look like this.

I have tried and tested my query and am quite sure that it works just how I wanted it to. I just want it to be shorter. Thanks in advance!

2

There are 2 best solutions below

0
On

You want to see your answers on jquery questions.

In your query you are joining tags, thus having to account for multiple rows per answer, although the posts have a redundant tags string. You are doing this in order to limit your results to jquery questions, but by joining and possibly multiplying rows, you need DISTINCT later to get rid of what you produced. This is not considered good style. What you want is to look up tags. We look up things in the WHERE clause with EXISTS or IN.

As to getting and counting votes the most elegant way should be an OUTER APPLY - or even a CROSS APPLY as we are talking about aggregates - on your answers.

select
  a.id, a.creationdate,
  u.displayname,
  q.title, q.tags,
  v.up, v.down, v.accepted,
  concat('http://stackoverflow.com/questions/', q.id, '#answer-', a.id) as link
from posts a
join posts q on a.parentid = q.id
join users u on a.owneruserid = u.id
cross apply
(
  select
    sum(case when vt.name = 'UpMod' then 1 else 0 end) as up,
    sum(case when vt.name = 'DownMod' then 1 else 0 end) as down,
    sum(case when vt.name = 'AcceptedByOriginator' then 1 else 0 end) as accepted
  from votes v
  inner join votetypes vt on v.votetypeid = vt.id
  where v.postid = a.id
) v
where a.owneruserid = 9461114
and a.posttypeid = 2  -- answer
and q.posttypeid = 1  -- question
and exists
(
  select *
  from posttags pt
  join tags t on t.id = pt.tagid
  where pt.postid = q.id
  and t.tagname in ('jquery')
);

This is not the shortest query we can write for the task, but it's probably the most performant way to approach it. I also consider this query very readable and thus maintainable, because you can easily see for instance why we check PostTags and Tags.

0
On

The following query is the combination of your two inner queries (without pivoting).

    SELECT
      a.Id,
      a.CreationDate,
      u.DisplayName,
      p.Title,
      p.Tags,
      CONCAT('http://stackoverflow.com/questions/', p.Id, '#answer-', a.Id) AS Link,
      vt.Name AS VoteType,
      COUNT(v.Id) AS VoteCount
    FROM 
      Users u 
      INNER JOIN Posts a ON u.Id = a.OwnerUserId AND a.PostTypeId = 2  -- Answer
      INNER JOIN Posts p ON a.ParentId = p.Id  AND p.PostTypeId = 1  -- Question
      INNER JOIN PostTags pt ON p.Id = pt.PostId
      INNER JOIN Tags t ON pt.TagId = t.Id AND t.TagName in ('jquery')
      LEFT JOIN Votes v ON a.Id = v.PostId 
      LEFT JOIN VoteTypes vt ON v.VoteTypeId = vt.Id
    WHERE u.Id = 9461114   --Me
    GROUP BY
      a.Id,
      a.CreationDate,
      u.DisplayName,
      p.Title,
      p.Tags,
      p.Id,
      vt.Name

Everything is built around the user table and the appropriate filters were moved from the WHERE section directly to the JOIN. The votes were attached directly to the answers with a LEFT JOIN. The final, pivoted query looks like this:

SELECT *
FROM (
  SELECT
    a.Id,
    a.CreationDate,
    u.DisplayName,
    p.Title,
    p.Tags,
    CONCAT('http://stackoverflow.com/questions/', p.Id, '#answer-', a.Id) AS Link,
    vt.Name AS VoteType,
    COUNT(v.Id) AS VoteCount
  FROM 
    Users u 
    INNER JOIN Posts a ON u.Id = a.OwnerUserId AND a.PostTypeId = 2  -- Answer
    INNER JOIN Posts p ON a.ParentId = p.Id  AND p.PostTypeId = 1  -- Question
    INNER JOIN PostTags pt ON p.Id = pt.PostId
    INNER JOIN Tags t ON pt.TagId = t.Id AND t.TagName in ('jquery')
    LEFT JOIN Votes v ON a.Id = v.PostId 
    LEFT JOIN VoteTypes vt ON v.VoteTypeId = vt.Id
  WHERE u.Id = 9461114   --Me
  GROUP BY
    a.Id,
    a.CreationDate,
    u.DisplayName,
    p.Title,
    p.Tags,
    p.Id,
    vt.Name
) s_tab pivot (min(VoteCount) for [VoteType] in ([UpMod], [DownMod], [AcceptedByOriginator])) AS PVT