Node Hash in ArangoDB?


I'm using ArangoDB for graph versioning and am looking for a faster way to check whether a node is the same in two different collections. Apart from hashing each node myself before I write it, does ArangoDB have any mechanism that lets me read a hash of the node? I usually access the database with Python-Arango.

If hashing it myself is the only viable option, what would be a reasonable hash function for these kinds of documents in a graph DB? _id should not be included, since it differs for the same node in two different collections. _rev does not really matter, and I am not sure whether _key is required, as the node is identified by it anyway.


There is 1 best solution below


You need to implement your own hashing algorithm to do this.

The issue is that the choice of which document values go into the hash is user-specific, so you need to compute that hash value externally and save it with every document.
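As a minimal sketch of computing such a hash externally, the following serializes a document to canonical JSON (sorted keys, fixed separators) and hashes it with SHA-256. The field name `content_hash`, the helper names, and the choice of SHA-256 are assumptions for illustration, not anything ArangoDB prescribes. `_id` and `_rev` are excluded by default, as suggested in the question; whether to also drop `_key` is left to the caller through the `exclude` argument.

```python
import hashlib
import json

# Hypothetical attribute name under which the hash is stored on each document.
HASH_FIELD = "content_hash"

# Attributes left out of the hash by default: _id differs per collection,
# _rev changes on every write, and the stored hash must not hash itself.
DEFAULT_EXCLUDE = frozenset({"_id", "_rev", HASH_FIELD})


def content_hash(doc, exclude=DEFAULT_EXCLUDE):
    """Return a hex SHA-256 digest over the document's remaining attributes."""
    payload = {k: v for k, v in doc.items() if k not in exclude}
    # sort_keys and fixed separators make the JSON text independent of
    # attribute order and whitespace, so equal content hashes equally.
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def with_hash(doc, exclude=DEFAULT_EXCLUDE):
    """Return a copy of the document with its hash stored under HASH_FIELD."""
    return {**doc, HASH_FIELD: content_hash(doc, exclude)}
```

With the hash stored on every document, comparing a node across two collections reduces to comparing the two stored `content_hash` values instead of the full documents. To ignore `_key` as well, pass `exclude=DEFAULT_EXCLUDE | {"_key"}`.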

To confirm uniqueness, you can run the check in a Foxx microservice or in your AQL query, throwing an error if multiple nodes are ever found with the same hash.
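The AQL variant of that check could look roughly like the sketch below, run from Python. It assumes the hash is stored in an attribute named `content_hash` (a hypothetical name) and that `db` is a python-arango database handle, whose `db.aql.execute(query, bind_vars=...)` call is used to run the query. The function name `assert_unique_hashes` is likewise made up for this example.

```python
# Groups documents by their stored hash and returns every hash that occurs
# more than once. `content_hash` is an assumed attribute name.
DUPLICATE_HASH_QUERY = """
FOR doc IN @@collection
    FILTER doc.content_hash != null
    COLLECT h = doc.content_hash WITH COUNT INTO n
    FILTER n > 1
    RETURN { hash: h, count: n }
"""


def assert_unique_hashes(db, collection):
    """Raise ValueError if any hash value is shared by several documents."""
    cursor = db.aql.execute(
        DUPLICATE_HASH_QUERY,
        # A collection bind parameter is written @@collection in the query
        # and passed under the key "@collection".
        bind_vars={"@collection": collection},
    )
    duplicates = list(cursor)
    if duplicates:
        raise ValueError(f"duplicate hashes in {collection}: {duplicates}")
```

A persistent index on the hash attribute would likely help this query on large collections, though that is a tuning choice the answer does not cover.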

If you want to enforce uniqueness on inserts, then you'll need to build that logic externally.

You then have the option of either trusting your uniqueness logic or setting up a Foxx microservice that scours the collections in scope to ensure no other document has the same hash value.

Querying many other collections would perform poorly, so an alternative is to set up a Foxx Queue that accepts document updates, with a Foxx service performing the INSERT/UPDATE commands from the queue. That way you don't slow down your client application, and the data is eventually updated in Arango as fast as possible.
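The queue pattern itself can be illustrated with a small in-process stand-in. This is not Foxx code (Foxx services are written in JavaScript and run inside ArangoDB); it is only a Python analogy using the standard `queue` and `threading` modules, with a hypothetical `write` callback taking the place of the actual INSERT/UPDATE.

```python
import queue
import threading


def run_writer(updates, write):
    """Drain `updates`, calling `write` per item, until a None sentinel."""
    while True:
        item = updates.get()
        try:
            if item is None:
                return
            write(item)
        finally:
            # Mark every item (including the sentinel) as processed.
            updates.task_done()


updates = queue.Queue()
written = []  # stands in for the database; a real writer would upsert here

worker = threading.Thread(
    target=run_writer, args=(updates, written.append), daemon=True
)
worker.start()

# The client only enqueues and moves on; the worker applies the writes.
for doc in ({"_key": "n1", "name": "x"}, {"_key": "n2", "name": "y"}):
    updates.put(doc)

updates.put(None)  # tell the worker to stop
updates.join()     # wait until everything queued has been processed
```

The trade-off is the same as in the Foxx version: the client is never blocked by the uniqueness check, but reads may briefly see data that has not been written yet.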