How to paginate Stack Exchange Data Explorer (SEDE) results?

135 Views Asked by At

Using data explorer to create queries:

SELECT P.id, creationdate,tags,owneruserid,answercount
--SELECT DISTINCT TAGNAME ,TAGID
FROM TAGS  AS T
JOIN POSTTAGS AS PT
ON T.ID = PT.TAGID
JOIN POSTS AS P
ON PT.POSTID = P.ID
--WHERE CAST(P.TAGS AS VARCHAR) IN('JAVA')
WHERE PT.TAGID = 3143

How is it possible to add pagination in the query in order to take not only the first 50,000 results, but then run the query again to take the next remaining results?

1

There are 1 best solutions below

1
On BEST ANSWER

There are a few ways to "page" through TSQL results; see:

Here I will use the CTE method as:

  • It uses convenient row numbers to page through results, rather than trying to track less predictable factors such as creationdate.
  • It reportedly performs faster than the OFFSET method.

So, that question's query becomes this SEDE query:

-- StartRow: Starting row for paging
-- EndRow: Ending row for paging (Max 50K rows at a time)
WITH allData AS (
    SELECT
                ROW_NUMBER() OVER (ORDER BY P.creationdate) AS row
                , P.id
                , P.creationdate
                , P.tags
                , P.owneruserid
                , P.answercount
    FROM        Posttags    AS PT
    JOIN        Posts       AS P    ON PT.postid = P.id
    WHERE       PT.tagid    = 3143  -- tag [scala]
)
SELECT      *
FROM        allData
WHERE       row    >= ##StartRow:INT?1##
AND         row    <= ##EndRow:INT?50000##
ORDER BY    row