Neo4j Cypher Query - return counts of relationships in separate columns

430 Views Asked by At

I have experience with Neo4j and Cypher, but still struggle with aggregate functions. I'm trying to pull a CSV out of Neo4j that should look like this:

Location Number of Node X at Location Number of Node Y at Location
ABC 5 20
DEF 39 4
Etc. # #

My current query looks like this:

MATCH (loc:Location)--(x:Node_X)
RETURN loc.key AS Location, count(x) AS `Number of Node X at Location`, 0 AS `Number of Node Y at Location`
UNION
MATCH (loc:Location)--(y:Node_Y)
RETURN loc.key AS Location, 0 AS `Number of Node X at Location`, count(y) AS `Number of Node Y at Location`

Which yields a table like:

Location Number of Node X at Location Number of Node Y at Location
ABC 5 0
DEF 39 0
Etc. # #
ABC 0 20
DEF 0 4
Etc. # #

I think I'm close, but I have double the number of Location rows as I need, and am not sure how to make the results more succinct. Suggestions on this and generally tips for aggregate functions are appreciated!

4

There are 4 best solutions below

1
On

You can aggregate with distinct here.

MATCH (loc:loc1)--(x:Node_X), (loc)--(y:Node_Y) 
RETURN loc.key , 
       count(distinct(x)) as NODES_OF_TYPE_X, 
       count(distinct(y)) as NODES_OF_TYPE_Y

The problem in accessing x and y in above query is it changes the cardinality of the solution. For each solution of x, it will have all the solutions of y. If you had n1 x nodes and n2 y nodes and you don't use distinct, then you would get n1*n2 nodes for each x and y.

3
On

I think You can solve it like this with even when counts are 0

MATCH (loc:loc1)
RETURN loc.type , 
       size((loc)--(:Node_X)) AS xCount,
       size((loc)--(:Node_Y)) AS yCount

You can also do

MATCH (loc:loc1)
RETURN loc.type , 
       size([(loc)—-(x:Node_X) | x]) AS xCount
0
On

@PrashantUpadhyay got the answer started, but I think this is the final answer I was looking for. It accounts for cases where counts may return zero, but still includes all Location rows.

MATCH (loc:Location)
OPTIONAL MATCH (loc)--(x:Node_X)
OPTIONAL MATCH (loc)--(y:Node_Y)
RETURN loc.key AS Location, 
       coalesce(count(distinct(x)), 0) as Node_X, 
       coalesce(count(distinct(y)), 0) as Node_Y
ORDER BY Location
0
On

I'd try this:

MATCH (loc:Location)
with distinct loc
OPTIONAL MATCH (loc)--(x:Node_X)
WITH distinct loc, count(x) AS xnum
OPTIONAL MATCH  (loc)--(y:Node_Y)
WITH DISTINCT loc, count(y) AS ynum, xnum

RETURN 
DISTINCT loc.key as Location, 
xnum as `Number of Node X at Location`, 
ynum as `Number of Node Y at Location`