Neo4j lowest common ancestor node not found

256 Views Asked by David A Stumpf At 28 June 2025 at 03:24

I have loaded a hierarchical tree (DAG) of DNA SNPs. I want to identify lowest common ancestors.

This query works, yielding the single correct node:

Match (n:SNPNode{SNP:'R-Z11'}), (m:SNPNode{SNP:'R-BY13828'})
match path=(n)-[:SNPParent*..99]->(MRCA)<-[:SNPParent*..99]-(m) 
return MRCA.SNP

However, this one yields no result:

Match (n:SNPNode{SNP:'R-Z11'}), (m:SNPNode{SNP:'R-S25289'})
match path=(n)-[:SNPParent*..99]->(MRCA)<-[:SNPParent*..99]-(m) 
return MRCA.SNP

even though separate queries for the ancestors of each node return lists that share many nodes:

MATCH p=(n:SNPNode{SNP:'R-Z11'})-[r:SNPParent*..66]->(m) RETURN m.SNP

m.SNP
R-Z338
R-Z8
R-Z7
R-Z2
R-Z345
R-Z27
R-Z30
R-Z9
R-L48
R-Z301
R-Z381
R-U106
R-L151
R-L51
R-L23
R-M269
R-P297
R-L389
R-L754
R-M343

and

MATCH p=(n:SNPNode{SNP:'R-Z25289'})-[r:SNPParent*..66]->(m) RETURN m.SNP

m.SNP
R-S16701
R-S1774
R-Z341
**R-Z11**
R-Z338
R-Z8
R-Z7
R-Z2
R-Z345
R-Z27
R-Z30
R-Z9
R-L48
R-Z301
R-Z381
R-U106
R-L151
R-L51
R-L23
R-M269
R-P297
R-L389
R-L754
R-M343

It seems the problem is that R-Z11 appears in the ancestor path of the second node and is itself the common ancestor. In other words, sometimes the LCA is one of the two endpoints of the path rather than a node between them. Is there a way to address this so that R-Z11 is returned as the result whether or not it is an endpoint of the path?


There are 2 best solutions below

David A Stumpf On 05 December 2017 at 16:53

Here is the query that works:

match p=(n:SNPNode{SNP:'R-Z11'})<-[:SNPChild*0..99]-(MRCA:SNPNode)-[:SNPChild*0..99]->(m:SNPNode{SNP:'R-BY13828'}) 
return MRCA.SNP

Or, to list every node on the path with a boolean flag marking the lowest common ancestor (MRCA):

match p=(n:SNPNode{SNP:'R-Z11'})<-[:SNPChild*0..99]-(MRCA:SNPNode)-[:SNPChild*0..99]->(m:SNPNode{SNP:'R-BY13828'})
unwind nodes(p) as pn
return case when pn.SNP=MRCA.SNP then True else False end as MRCA, pn.SNP

with this output:

MRCA   SNP
FALSE  R-Z11
FALSE  R-Z338
TRUE   R-Z8
FALSE  R-BY13828
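
If the data is a true DAG in which a node can have more than one parent, the pattern above may match more than one common ancestor. One possible way to keep only the nearest one is to order the matches by total path length and take the first. This is a sketch, not something tested against this dataset; the `ORDER BY` and `LIMIT` clauses are my addition, and the rest is the query above.

```cypher
// Same pattern as above; the shortest combined path is treated as the "lowest" ancestor.
MATCH p = (n:SNPNode {SNP: 'R-Z11'})<-[:SNPChild*0..99]-(MRCA:SNPNode)-[:SNPChild*0..99]->(m:SNPNode {SNP: 'R-BY13828'})
RETURN MRCA.SNP
ORDER BY length(p)   // length(p) counts relationships on both sides of MRCA
LIMIT 1
```

In a strict tree, where every node has a single parent, the pattern should already return one row, so the extra clauses would only matter when branches rejoin.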

InverseFalcon On 28 September 2017 at 19:41

I think you'll want to ensure your variable-length paths have a lower bound of 0 (when you omit the lower bound, as in your current queries, it defaults to 1). This will make it possible for the start and end nodes to be considered as possible matches to MRCA.

Match (n:SNPNode{SNP:'R-Z11'}), (m:SNPNode{SNP:'R-S25289'})
match path=(n)-[:SNPParent*0..99]->(MRCA)<-[:SNPParent*0..99]-(m) 
return MRCA.SNP
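
To see why the lower bound matters, here is a minimal sketch on a hypothetical three-node chain. The SNP values 'A', 'B' and 'C' are made up for illustration; the label and relationship type follow the question. With the default lower bound of 1, the only ancestor of 'B' is 'A', and reaching 'A' from 'C' would mean traversing the B-to-A relationship a second time. Cypher does not reuse a relationship within a single MATCH pattern, so no row comes back at all. With a lower bound of 0, 'B' itself can bind to MRCA.

```cypher
// Hypothetical toy data: C -> B -> A, each child pointing to its parent.
CREATE (a:SNPNode {SNP: 'A'}),
       (b:SNPNode {SNP: 'B'}),
       (c:SNPNode {SNP: 'C'}),
       (b)-[:SNPParent]->(a),
       (c)-[:SNPParent]->(b);

// Lower bound omitted (defaults to 1): B cannot be its own "ancestor",
// and A is only reachable by using the B->A relationship twice.
// Returns no rows.
MATCH (n:SNPNode {SNP: 'B'}), (m:SNPNode {SNP: 'C'})
MATCH path = (n)-[:SNPParent*..99]->(MRCA)<-[:SNPParent*..99]-(m)
RETURN MRCA.SNP;

// Lower bound 0: the zero-length hop lets n itself match MRCA.
// Returns one row: 'B'.
MATCH (n:SNPNode {SNP: 'B'}), (m:SNPNode {SNP: 'C'})
MATCH path = (n)-[:SNPParent*0..99]->(MRCA)<-[:SNPParent*0..99]-(m)
RETURN MRCA.SNP;
```

The same relationship-uniqueness rule is why the first query in the question returns only the lowest common ancestor and not every shared ancestor above it.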
