GDS Virtual Graph different than component queries


I am using Neo4j Enterprise v4.3 and the GDS plugin v1.7.0. I create a virtual graph as follows:

 CALL gds.graph.create.cypher(
  "match_seg",
  "MATCH (d:DNA_Match) where d.ancestor_rn=33454
   RETURN id(d) AS id",
   "MATCH (m:DNA_Match{ancestor_rn:33454})-[r:match_segment]->(s:Segment{chr:'01'}) where r.cm>=7 and r.snp_ct>=500
   RETURN
     id(m) AS source,
     id(s) AS target,
     r.cm AS weight",
  {
    readConcurrency: 4,
    validateRelationships:FALSE
  }
)

It returns a node count of 29 and relationship count of zero.

Yet, when I run the individual queries in Neo4j I get a different result.

MATCH (d:DNA_Match) where d.ancestor_rn=33454
   RETURN id(d) AS id

returns 29 nodes

But here's the apparent anomaly:

MATCH (m:DNA_Match{ancestor_rn:33454})-[r:match_segment]->(s:Segment{chr:'01'}) where r.cm>=7 and r.snp_ct>=500
   RETURN
     id(m) AS source,
     id(s) AS target,
     r.cm AS weight

This returns 8726 rows.

Why am I not getting the relationships in my virtual graph?

Best answer

Your problem is that the node query only projects the DNA_Match nodes, so none of the Segment target nodes are part of the projected graph:

MATCH (d:DNA_Match) where d.ancestor_rn=33454
RETURN id(d) AS id

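The GDS library drops every relationship whose source or target node is not part of the node projection. With validateRelationships set to false this happens silently instead of raising an error, which is why the relationship count is zero. The node query also needs to return the Segment nodes, so you will want to do something like: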

 CALL gds.graph.create.cypher(
  "match_seg",
  "MATCH (d:DNA_Match) where d.ancestor_rn=33454
   RETURN id(d) AS id
   UNION
   MATCH (m:DNA_Match{ancestor_rn:33454})-[r:match_segment]->(s:Segment{chr:'01'}) where r.cm>=7 and r.snp_ct>=500
   RETURN id(s) AS id",
   "MATCH (m:DNA_Match{ancestor_rn:33454})-[r:match_segment]->(s:Segment{chr:'01'}) where r.cm>=7 and r.snp_ct>=500
   RETURN
     id(m) AS source,
     id(s) AS target,
     r.cm AS weight",
  {
    readConcurrency: 4,
    validateRelationships:FALSE
  }
)
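
To check that missing target nodes really are the cause, a plain Cypher query can count how many relationship targets the original node query would have loaded. This is only a sketch: it reuses the labels, properties, and filters from the question, and the returned column names are made up for illustration.

```cypher
// Sketch: compare the relationship query against the ORIGINAL node query.
// Assumes the same labels, properties, and filters as in the question.
MATCH (m:DNA_Match {ancestor_rn: 33454})-[r:match_segment]->(s:Segment {chr: '01'})
WHERE r.cm >= 7 AND r.snp_ct >= 500
RETURN
  count(r) AS relationships,
  count(DISTINCT s) AS distinct_targets,
  // A target is covered only if it would also be returned by the original node query.
  // CASE without ELSE yields null, and count() skips nulls.
  count(CASE WHEN s:DNA_Match AND s.ancestor_rn = 33454 THEN 1 END) AS targets_covered_by_node_query
```

If targets_covered_by_node_query is much smaller than relationships, most relationships point at nodes that were never projected, which would match the symptom in the question. After re-creating the graph with the UNION node query, the counts reported by gds.graph.create.cypher should reflect the Segment nodes as well.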