Neo4j - On-disk Representation of Edges

I noticed a performance difference when querying via incoming and outgoing relationships for a given node. In this case, outgoing was much faster.

The input file that generates the graph is sorted by the start node for each edge.

Does the order of the input file matter? Is there a difference in how the outgoing relationships are treated?

I read a bit of background on the internals, but it didn't seem to answer my question about the difference in performance.

There is 1 answer below

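There should be no difference: as far as the record store is concerned, each relationship is linked into the relationship chains of both its start node and its end node, so neither direction gets special treatment. There's another diagram of how things are stored in Neo4j on page 12 of my MSc Thesis.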
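To illustrate why direction should not matter, here is a toy model of relationship storage. It is a simplified sketch, not Neo4j's actual on-disk layout: the names `RelRecord`, `ToyStore`, `start_next` and `end_next` are made up for this example, and the assumption is only that each relationship record is chained from both of its endpoints.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List

NO_REL = -1  # sentinel meaning "end of chain" in this toy model


@dataclass
class RelRecord:
    start: int                # id of the start node
    end: int                  # id of the end node
    start_next: int = NO_REL  # next relationship in the start node's chain
    end_next: int = NO_REL    # next relationship in the end node's chain


class ToyStore:
    """Hypothetical, in-memory stand-in for a relationship store."""

    def __init__(self) -> None:
        self.first_rel: Dict[int, int] = {}  # node id -> first relationship id
        self.rels: List[RelRecord] = []

    def add(self, start: int, end: int) -> int:
        rel_id = len(self.rels)
        rec = RelRecord(start, end)
        # Prepend the record to the start node's chain.
        rec.start_next = self.first_rel.get(start, NO_REL)
        self.first_rel[start] = rel_id
        # Prepend it to the end node's chain too (skip for self-loops,
        # which already sit in that node's single chain).
        if end != start:
            rec.end_next = self.first_rel.get(end, NO_REL)
            self.first_rel[end] = rel_id
        self.rels.append(rec)
        return rel_id

    def relationships(self, node: int, direction: str) -> Iterator[RelRecord]:
        """Walk the node's single chain and filter by direction ("out" or "in")."""
        rel_id = self.first_rel.get(node, NO_REL)
        while rel_id != NO_REL:
            rec = self.rels[rel_id]
            if direction == "out" and rec.start == node:
                yield rec
            elif direction == "in" and rec.end == node:
                yield rec
            # Follow whichever pointer belongs to this node's chain.
            rel_id = rec.start_next if rec.start == node else rec.end_next
```

In this model, incoming and outgoing lookups walk the same chain and do the same amount of work, and the order in which relationships were inserted does not favor one direction.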

What might be causing the difference is that you're running one test (the first one) with cold caches and the other with warm caches. If you flip your experiment and run outgoing first, then incoming, you may find that incoming is suddenly faster! That's because the data has to be read from disk the first time around, but is already in memory the second time.
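One way to check for this cache effect is to warm up before measuring and to alternate the order of the two queries. The harness below is a minimal sketch under those assumptions. The function names `time_call` and `compare` are hypothetical, and each callable stands in for whatever code runs your incoming or outgoing query and fully consumes its result.

```python
import time
from typing import Callable, Dict, List


def time_call(fn: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by one call to fn."""
    t0 = time.perf_counter()
    fn()
    return time.perf_counter() - t0


def compare(
    queries: Dict[str, Callable[[], object]],
    warmup: int = 1,
    repeats: int = 5,
) -> Dict[str, List[float]]:
    """Time each query `repeats` times after `warmup` untimed rounds."""
    # Untimed rounds so that every query has touched its data at least once.
    for _ in range(warmup):
        for fn in queries.values():
            fn()

    timings: Dict[str, List[float]] = {name: [] for name in queries}
    names = list(queries)
    for i in range(repeats):
        # Reverse the order on every other round so that neither query
        # always benefits from running second.
        order = names if i % 2 == 0 else names[::-1]
        for name in order:
            timings[name].append(time_call(queries[name]))
    return timings
```

If the gap between incoming and outgoing disappears once both are measured after a warm-up, the original difference was most likely a cold-cache effect rather than anything about how the relationships are stored.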