I'm running Neo4j Desktop v1.4.1 the db is 4.2.1 enterprise.
I have a simple graph of placements, campaigns and a placement to campaign "contains" relationship. This is a fresh dataset, every node is unique. Some placements "contain" thousands of campaigns, so I want to filter the returned campaigns by an inclusion list of campaign ids.
When I return all the matched nodes it works:
neo4j@neo4j> MATCH (:Placement {id: 5})-[:CONTAINS]->(c:Campaign)
WHERE c.id IN [400,263,150470,25810,37578]
RETURN *;
+--------------------------+
| c |
+--------------------------+
| (:Campaign {id: 37578}) |
| (:Campaign {id: 263}) |
| (:Campaign {id: 25810}) |
| (:Campaign {id: 150470}) |
+--------------------------+
When I request just the campaign:id, I get duplicates:
neo4j@neo4j> MATCH (:Placement {id: 5})-[:CONTAINS]->(c:Campaign)
WHERE c.id IN [400,263,150470,25810,37578]
RETURN c.id;
+--------+
| c.id |
+--------+
| 150470 |
| 150470 |
| 150470 |
| 150470 |
+--------+
There is only one CONTAINS relationship between placement 5 and campaign 15070:
neo4j@neo4j> MATCH (:Placement {id: 5})-[rel:CONTAINS]->(:Campaign {id:150470})
RETURN count(rel);
+------------+
| count(rel) |
+------------+
| 1 |
+------------+
EXPLAIN returns the following query plan, the cache[c.id]
seems like it might be the culprit?
+---------------------------+------------------------------------------------------------------------------------------------------+----------------+---------------------+
| Operator | Details | Estimated Rows | Other |
+---------------------------+------------------------------------------------------------------------------------------------------+----------------+---------------------+
| +ProduceResults@neo4j | `c.id` | 4 | Fused in Pipeline 1 |
| | +------------------------------------------------------------------------------------------------------+----------------+---------------------+
| +Projection@neo4j | cache[c.id] AS `c.id` | 4 | Fused in Pipeline 1 |
| | +------------------------------------------------------------------------------------------------------+----------------+---------------------+
| +Expand(Into)@neo4j | (anon_7)-[anon_27:CONTAINS]->(c) | 4 | Fused in Pipeline 1 |
| | +------------------------------------------------------------------------------------------------------+----------------+---------------------+
| +MultiNodeIndexSeek@neo4j | UNIQUE anon_7:Placement(id) WHERE id = $autoint_0, cache[c.id], UNIQUE c:Campaign(id) WHERE id IN $a | 25 | In Pipeline 0 |
| | utolist_1, cache[c.id] | | |
+---------------------------+------------------------------------------------------------------------------------------------------+----------------+---------------------+
Edit: if I prepend the query with CYPHER runtime=SLOTTED
I get the expected output:
+--------+
| c.id |
+--------+
| 37578 |
| 263 |
| 25810 |
| 150470 |
+--------+
If I omit the WHERE clause I get unique campaign ids (but too many). I feel like I'm missing something obvious, but I've read the neo4j docs and I'm not getting it. Thanks!