Say I have the following table named t_student_details
:
Name Age Marks Sport City ........ (multiple columns)
====== ===== ======= ======= ======
Jason 11 45 tennis New York
Mark 12 42 football New York
Jessica 11 43 tennis Orlando
Brad 13 46 tennis Orlando
.
.
.
(multiple rows)
I want to get certain information about the students in a single query. This is what I would do in Postgres:
WITH sports_filter AS(
SELECT * FROM t_student_details WHERE sport='tennis'
)
SELECT JSON_BUILD_OBJECT('max_age', (SELECT MAX(age) FROM sports_filter),
'min_age', (SELECT MIN(age) FROM sports_filter),
'city_wise_marks_mean', (SELECT JSON_AGG(mean_items)
FROM (SELECT city, AVG(marks) FROM sports_filter
GROUP BY city) AS mean_items)
The result of the above SQL query in Postgres would be
{"max_age": 46,
"min_age": 43,
"city_wise_marks_mean": [{"New York": 45, "Orlando": 44.5}]}
As it is evident, I got multiple aggregations/information about students belonging to the sport 'tennis' in a single query. This way of querying also fetches only the necessary data and not everything.
How do I achieve this using Druid? I don't necessarily need the output response to be in the exact same format, but how do I fetch all these stats in the same query without having to fetch all the details of the students? Is it possible to get all this in a single query using Apache Druid?
As I understand your Postgres query it should be as simple as using GROUPING SETS
The
GROUP BY CUBE
is short-hand forGROUP BY GROUPING SETS ( (City), () )
This should result in one row where City is NULL which is your overall min/max/avg and one row for each city.