I am trying to query all the IDs of the records from /ExampleRegion. I want to retrieve records if the count of the ID is only 1, so there is only 1 record in the region with that ID.
SELECT COUNT(*), id from /ExampleRegion group by id --> Only if the count for that id is 1.
How can I use COUNT as a Condition in the WHERE Condition?
I have tried the following but it doesn't work:
SELECT * from /ExampleRegion a where (SELECT count(*) as c, b.id from /ExampleRegion b where b.id = a.id and c = 1)
SELECT * from /ExampleRegion a where (SELECT count(*) as c from /ExampleRegion b where b.id = a.id ) = 1
I would think GROUP BY would work, although I still can't seem to find the correct OQL.
Much appreciated.
In short...
Regarding...
You cannot use an aggregate OQL query function like
count
in a predicate of theWHERE
clause (as I suspect you already found out), such as:This results in the following Exception:
Additionally, and unfortunately, the following OQL query:
Returns no results!
The opposite OQL query used to find duplicates also returns no results:
Although, I am not entirely certain why, I suspect it is due to the same limitation as the first OQL query above where the
count
aggregate function was used in an OQL query predicate inside theWHERE
clause, except this later form is less informative (e.g. like I suspect it might be eating an Exception somewhere since, according to GemFire, the OQL query is syntactically correct).However, if you only care about the IDs then you can simply run a similar OQL query:
Of course, this OQL query is returning a projection (or GemFire
Struct
(Javadoc)) that returns a count of all User IDs (duplicate and unique). Clearly, if the count for a User ID is 1, then it is unique, and if it is greater than 1, a duplicate (i.e. not unique).In detail...
Typically though, users want to get access to the actual object (e.g.
User
) when theUser
instance either has a unique ID (in your case) or a duplicate ID. Users do this to perform some operation on the Region entry value (e.g.User
) returned by the OQL query, which is particularly common inside Functions used to process a PARTITION Regions in a parallel and distributed fashion.But, I have to admit, I am bit dumbfounded by not being able to (completely) solve this problem.
I honestly thought this problem should have been solvable with the following GemFire OQL query:
Essentially, this OQL query selects all
Users
where their ID is unique because they are 1 of a kind.Strangely, this results in a GemFire
QueryInvalidException
:There is nothing more blatantly irritating in software to me than NPEs! They are a clear and present programmer error; not a user error!
Seemingly, GemFire is not happy with the nested OQL query declared in the
FROM
clause, which would in essence create a queryable collection, or intermediate result set used in the outer query (much like a RDBMS temporary table):And perhaps, GemFire/Geode is specifically not happy about the "projection" of this nested (temporary) collection, hence the NPE here:
When I look at the affected GemFire/Geode code, the exact condition really makes no sense to me since I tested with a
ClientCache
using aLOCAL
(only) Region. #sighNevertheless, I even tried to test with a peer
Cache
instance using aPARTITION
Region (with PDX enabled (required for PRs actually)) and that led to the same result! #sighGiven the GemFire query engine is seemingly having trouble in the projection of the nested OQL query (containing the
count
andGROUP BY
clause) I decided to try to provide more information to the query engine in hopes of better inform the query engine about the projected values. So, I created theUserIdCount
projection class type and used it in my OQL query like so:Of course, and unfortunately, this also did not have the intended effect and only led to the following Exception:
It seems I am stuck with a GemFire
Struct
, which you'd think GemFire would know how to process in a nested query when accessing the projection values in the outer query. But, whatever!I really feel like the NPE is an unintended consequence from GemFire and that GemFire really ought to be able to (and, possibly can) handle this type of OQL query.
So, what are you left with.
Well, as I stated above, if all you care about is the IDs, then you can return all IDs with their counts and iterate the List of Struts to find the IDs with a count of 1.
Of course, if you are ultimately interested in the objects with unique (or perhaps, duplicate) IDs to perform additional processing, then you will need to break this into 2 individual and separate OQL queries, first to get the IDs of interest, and then use those IDs to get the objects/values (e.g.
Users
) in another query.I have demonstrated this 2-phase query approach for your use case (i.e. unique IDs) in this test case.
Anyway, I hope this gives you a few options or things to think about.
Cheers!