SQL - Optimize the execution time

76 Views Asked by At

I am trying to fetch data from stackexchange database. My query is:

select distinct top 50 U.Id, U.DisplayName, U.Reputation,
       Tags = stuff( (SELECT ','+p2.Tags 
                       FROM posts p2 join votes V on p2.id = V.PostId
                       where V.VoteTypeId=5 and V.UserId = U.id
                       order by p2.CreationDate DESC
                       FOR XML PATH, TYPE).value('.[1]','nvarchar(max)')
                    ,1,1,'')
from Users U 
order by U.Reputation DESC;

However, when I run the query on data.stackexchange.com, it shows an error saying:Execution Timeout Expired. Is there any way that I can modify the query to optimize the execution time so that I can run this query successfully?

1

There are 1 best solutions below

2
On BEST ANSWER

You need exclude subqueries with individual XML. This will require building common XML with structure <user ...><tag1><tag2>...</user><user>... (using EXPLICIT MODE). And parse it to rows for individual user's.

with U as(
    select top 50 id, Reputation, DisplayName
      from users order by Reputation DESC
),
V as(
 select U.*,p2.Tags, p2.CreationDate
  from U, posts p2, votes V
 where p2.id = V.PostId and V.VoteTypeId=5 and V.UserId = U.id
),
Q(XML) as(
    select tag,parent,id [user!1!id],Reputation [user!1!Reputation],Name [user!1!Name], T [T!2!!element]
      from (
        select 1 tag, NULL parent, U.id, U.Reputation, U.DisplayName Name, NULL t, NULL dt
          from U
         union all
        select 2, 1, V.id, NULL, NULL, ','+Tags, CreationDate
          from V
     ) X
     order by id, tag, dt desc
       for xml explicit, type
)
select node.value('(@id)[1]','int'),
       node.value('(@Reputation)[1]','int'),
       node.value('(@Name)[1]','nvarchar(max)'),
       stuff(node.value('.[1]','nvarchar(max)'),1,1,'')
  from Q
 cross apply XML.nodes('/user') as Y(node)
 order by 2 desc