Finding longest branch of a tree in Neo4j

579 Views Asked by At

I currently have an append-only tree data structure in Java. The main purpose of this tree is to maintain a pointer to the longest branch. I implemented this by having a reference to the last nodes(s) in the longest branch(es) which is updated upon inserting new nodes in the tree.

For performance and persistence reasons I want to move this implementation to Neo4j using the Neo4j Java API. While working through the docs I couldn't find a convenient solution for querying a Neo4j database for the longest branch. In my implementation I can assure that the graph is an n-ary tree.

What is the preferred solution in Neo4j to find the longest branch in such a tree?

  • keeping a pointer to the last nodes as I do in my Java implementation?
  • molding an algorithm to find the longest path and implement this with the traversal API or via a cypher query?
  • some built-in functionality in Neo4j that I have not found yet?
1

There are 1 best solutions below

1
On

There is no built-in functionnality for searching a longest path.

One way to find it, is to find all the path between the root node and leaf nodes, then order them by length desc. and take the first one.

In cypher it should be done with something similar to :

MATCH (root:Root)-[:HAS_CHILD*]->(n:Node)
WHERE size((n)-[:HAS_CHILD]->())=0
RETURN n
ORDER BY size(RELS(p)) DESC
LIMIT 1

A better approach is to compute all those path in one action with a traversal. You can take a look at APOC with the apoc.path.expand procedure. Performances will be better.

Last solution, is to create your own algo with a custom traversal, that prune a branch if you have already found a longest one. (It can be done with a custom Evaluator). This algo will have better average performance than the previous solution, but in the worst scenario, the complexity is the same.

Like you said, you can keep a pointer to this node. It will be really fast for searching it, but you have to maintain this pointer when a write is done on your tree (with an impact on write performances).

Choice is yours, and depends of the performances you want to reach for reads and write.

Cheers