Bulk Insertion in Py2neo


I'm writing a custom doc manager for mongo-connector to replicate MongoDB documents to Neo4j, and I would like to create relationships in bulk. I'm using py2neo 2020.0.

Previous versions seem to have had options for this, but this version does not. Is there any way to create nodes and relationships in bulk in py2neo?


Best answer

I am currently working on bulk load functionality. There will be some new functions available in the next release. Until then, Cypher UNWIND...CREATE queries are your best bet for performance.
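As a rough illustration of the UNWIND...CREATE approach, here is a minimal sketch that sends rows in batches, one query per batch. The helper names (`chunked`, `bulk_create_nodes`), the label `my_label`, and the batch size are assumptions made for this example, not part of py2neo. The `run` argument is any callable taking a Cypher string and a parameter dict; py2neo's `Graph.run` should fit that shape.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

# Labels cannot be passed as query parameters, so the label is written
# into the query text. "my_label" is a placeholder.
CREATE_NODES = """
UNWIND $rows AS row
CREATE (n:my_label)
SET n = row
"""


def chunked(rows: Iterable[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of at most `size` rows."""
    batch: List[Dict[str, Any]] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def bulk_create_nodes(
    run: Callable[[str, Dict[str, Any]], Any],
    rows: Iterable[Dict[str, Any]],
    batch_size: int = 1000,
) -> int:
    """Create one node per row, one query per batch; return the row count."""
    total = 0
    for batch in chunked(rows, batch_size):
        run(CREATE_NODES, {"rows": batch})
        total += len(batch)
    return total


# Hypothetical usage with py2neo (connection details are placeholders):
#   from py2neo import Graph
#   graph = Graph("bolt://localhost:7687", auth=("neo4j", "password"))
#   bulk_create_nodes(graph.run, [{"id": 1, "my_field2": "a"}])
```

Each row must be a flat dict of property values, since `SET n = row` copies the map onto the node.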

Answer

I would strongly recommend switching to the neo4j Python driver, as it's supported by Neo4j directly.

In any case, you can also do the bulk insert directly in Cypher, and/or call that Cypher from Python using the neo4j driver.
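Calling Cypher from the neo4j driver could look roughly like the sketch below. The helper `run_batched`, the query, the label, the URI, and the credentials are all placeholders chosen for illustration; only `GraphDatabase.driver`, `driver.session`, `session.run`, and `Result.consume` come from the driver itself.

```python
from typing import Any, Dict, List

# Placeholder label and query; adapt to your own schema.
CREATE_NODES = "UNWIND $rows AS row CREATE (n:my_label) SET n = row"


def run_batched(session: Any, query: str, rows: List[Dict[str, Any]],
                batch_size: int = 1000) -> int:
    """Run `query` once per slice of `rows`, passing the slice as $rows.

    Returns the number of batches sent.
    """
    batches = 0
    for start in range(0, len(rows), batch_size):
        result = session.run(query, rows=rows[start:start + batch_size])
        result.consume()  # wait until the server has finished this batch
        batches += 1
    return batches


def main() -> None:
    # Imported here so the helper above can be used without the driver
    # installed. URI and credentials are placeholders.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver("bolt://localhost:7687",
                                  auth=("neo4j", "password"))
    try:
        with driver.session() as session:
            run_batched(session, CREATE_NODES, [{"id": 1}, {"id": 2}])
    finally:
        driver.close()
```

Sending a list of rows per query keeps the number of round trips low, which is usually where most of the time goes in row-by-row inserts.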

I recommend importing the nodes first, and then the relationships. It helps if you have a guaranteed unique identifier for the nodes, because then you can set up an index on that property before loading. Then you can load nodes from a CSV (or better yet a TSV) file like so:

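// Create constraint on the unique ID - greatly improves performance.
// (This syntax was removed in Neo4j 5; there, use
// CREATE CONSTRAINT FOR (a:my_label) REQUIRE a.id IS UNIQUE.)
CREATE CONSTRAINT ON (a:my_label) ASSERT a.id IS UNIQUE
;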

// Load the nodes, along with any properties you might want, from
// a file in the Neo4j import folder.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///my_nodes.tsv" AS tsvLine FIELDTERMINATOR '\t'
CREATE (:my_label{id: toInteger(tsvLine.id), my_field2: tsvLine.my_field2})
;

// Load relationships. Cypher requires a relationship type on CREATE;
// HAS_CHILD is a placeholder, so substitute your own type name.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///my_relationships.tsv" AS tsvLine FIELDTERMINATOR '\t'
    MATCH (parent_node:my_label)
        WHERE parent_node.id = toInteger(tsvLine.parent)
    MATCH (child_node:my_label)
        WHERE child_node.id = toInteger(tsvLine.child)
    CREATE (parent_node)-[:HAS_CHILD]->(child_node)
;
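If the data starts out in Python (for example, documents already read from MongoDB), the TSV files for the statements above can be produced with the standard `csv` module. This is only a sketch: the function names `rows_to_tsv` and `save_tsv` are made up for this example, and the column names simply mirror the ones used in the Cypher above.

```python
import csv
import io
from typing import Any, Dict, Iterable, List


def rows_to_tsv(fieldnames: List[str], rows: Iterable[Dict[str, Any]]) -> str:
    """Render rows as tab-separated text with a header line.

    Keys not listed in `fieldnames` are ignored; missing keys become
    empty fields.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        delimiter="\t",
        lineterminator="\n",
        extrasaction="ignore",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def save_tsv(path: str, fieldnames: List[str], rows: Iterable[Dict[str, Any]]) -> None:
    """Write the TSV text to `path`, e.g. a file in the Neo4j import folder."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(rows_to_tsv(fieldnames, rows))


# Hypothetical usage, matching the file names in the Cypher above:
#   save_tsv("my_nodes.tsv", ["id", "my_field2"], node_rows)
#   save_tsv("my_relationships.tsv", ["parent", "child"], relationship_rows)
```

Writing nodes and relationships to separate files keeps the two load steps independent, so the relationship load can assume every referenced node already exists.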