Using the Data Explorer (SEDE), I would like to find which users have more than 200000 reputation on Stack Overflow, and then find details for any accounts they have on other Stack Exchange sites.
Here is the query which provides the list with this threshold:
Select id, reputation, accountid
From users
Where reputation > 200000
AccountId
is the key for all Stack Exchange sites.
I have found this query for aggregating across SEDE databases, but how is it possible to do that based on the dynamic results of the previous/baseline query?
Here is the kind of output I'm aiming for:
id_so, reputation_so, accounted, other_stackexchange_site_name, reputation_othersite, number_of_answers_other_site, number_of_questions_other_site
1, 250000, 23, serverfault, 500, 5, 1
1, 250000, 23, superuser, 120, 1, 0
2, 300000, 21, serverfault, 300, 3, 2
2, 300000, 21, webmasters, 230, 1, 1
3, 350000, 20, NA, NA, NA, NA
#the case with id 3 has an SO profile with reputation but it has no other profile in other Stack Exchange site
To run non-trivial queries across databases, based on an initial query:
AccountId
(which is a user's Stack-Exchange-wide Id).Create your initial query to feed that key into a temp table. In this case:
Determine the query, to run on each site, that gets the info you want. EG:
Use a system query to get the appropriate databases. For the Data Explorer (SEDE), a query of this type:
Create a cursor on the above query and use it to step through the databases.
For each database:
sp_executesql
.When the cursor is done, perform the final query on the temp table from step 3.
Refer to this other answer, for a working template for querying all of the Stack Exchange sites.
Putting it all together, results in the following query, which you can run live on SEDE:
It gives results like:
-- where the blue values are hyperlinked.
Note that a user must have 200 rep on a site for it to be "significant". That's also the rep needed for the site to be included in the Stack Exchange flair.