Count(Distinct x) and Group By y

3k Views Asked by At

I have entries with date and time. I want the results to be grouped by the hour (00, 01, 02) and that works, but when i want to get the distinct counts of users, there is an error.

Select Substr(time, 0, 2) as Hour,
 Count(date) as Hits,
 Count(Distinct ip) as Users,
 Count(Distinct X-Forwarded-For) as ForwardedUsers
From table 
Group By Hour

EDIT: I am using the LogParser from Microsoft and i am able to use Group By Hour as it is and X-Forwarded-For is also no problem. The question is how i can use Count(Distinct ip) within the group by

2

There are 2 best solutions below

0
On BEST ANSWER

Unfortunately LogParser does not support DISTINCT aggregate functions together with GROUP BY. This should have been clear from the error message you get with the query above:

Error: Semantic Error: aggregate functions with DISTINCT arguments are not supported with GROUP BY clauses

One trick you could do is to remove the GROUP BY clause altogether and calculate instead COUNT DISTINCT(hourlyIp) where hourlyIp is a string built concatenating the hour with the IP address. You'll have then to breakup the hourlyIp field back into its components when you process the results.

1
On

Most database engines to not allow you to group by alias names. Change this:

Group by Hour

to this:

Substr(time, 0, 2)