Disclaimer: I'm coming from more of a relational DB world, so I may have some misconceptions about the best practices for storing and working with data in graph databases.
Anyway, let's say I have data with the following hierarchy:
- Food / Fruit / Orange
- Food / Vegetable / Lettuce
- Food / Vegetable / Onion
- Dishes / Thai / Phad Thai
- Dishes / Thai / Larb Gai
- Dishes / Desert / Orange Cake
- Dishes / Dish / Ceasar Salad
In my graph, I have a vertex for every last-level item in the hierarchy, and each of them has 2 properties that record its full hierarchy. For example: Tomato has the properties level1: 'Food', level2: 'Fruit'.
In addition, I have used_in edges for when some ingredient is used in a dish.
All edges are between these vertices (last-level items in the hierarchy).
Now, I would like to be able to look at a higher-level graph, based on level2.
For example I would like to be able to see:
Fruit -> used_in -> Desert
Vegetable -> used_in -> Thai
And I want to query the graph such that I get the following result:
So is there some way to group vertices by a combination of fields (in this case, the key is the combination of the level1 and level2 fields) such that the edges between those groups remain? Or is there some other way I should model my data? For example, adding labels based on all the items in the hierarchy?
To create the graph:
g.addV('Orange').property(id, 'Orange').property('level3', 'Orange').property('level2', 'Fruit').property('level1', 'Food')
.addV('Lettuce').property(id, 'Lettuce').property('level3', 'Lettuce').property('level2', 'Vegetable').property('level1', 'Food')
.addV('Onion').property(id, 'Onion').property('level3', 'Onion').property('level2', 'Vegetable').property('level1', 'Food')
.addV('Phad Thai').property(id, 'Phad Thai').property('level3', 'Spoon').property('level2', 'Thai').property('level1', 'Dishes')
.addV('Larb Gai').property(id, 'Larb Gai').property('level3', 'Fork').property('level2', 'Thai').property('level1', 'Dishes')
.addV('Orange Cake').property(id, 'Orange Cake').property('level3', 'Orange Crepe').property('level2', 'Desert').property('level1', 'Dishes')
.addV('Ceasars Salad').property(id, 'Ceasars Salad').property('level3', 'Ceasars Salad').property('level2', 'Salads').property('level1', 'Dishes')
.addE('used_in').from(g.V().has(id, 'Orange')).to(g.V().has(id, 'Orange Cake'))
.addE('used_in').from(g.V().has(id, 'Lettuce')).to(g.V().has(id, 'Ceasars Salad'))
.addE('used_in').from(g.V().has(id, 'Onion')).to(g.V().has(id, 'Phad Thai'))
.addE('used_in').from(g.V().has(id, 'Onion')).to(g.V().has(id, 'Larb Gai'))
.addE('used_in').from(g.V().has(id, 'Lettuce')).to(g.V().has(id, 'Larb Gai'))
.iterate()
Thanks in advance! :)
I re-formatted the graph creation steps, removing the g.V() and replacing it with just V() in all the mid-traversal steps. The g.V() form will no longer work at TinkerPop 3.5.x and higher versions, as it was deprecated. It has bad side effects that most users do not realize.

I think that changing the data model might be a good idea. Looking at the data, you are really using properties in a way that simulates what edges are good at. For example, why not have edges with labels like level1 and use those edges to connect the appropriate vertices? Anyway, here is the reformatted graph creation.

As a start (and if you need more than this, I can edit the question), we can use a path step to generate the relationships shown in your diagrams. I would consider changing the data model though.
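As a rough sketch of the two steps described above (not necessarily the exact queries), the creation steps with mid-traversal V() and a path-based query might look like the following in the Gremlin Console. This assumes that g is a traversal source over a graph that accepts user-supplied string ids (TinkerGraph, for example), and that id and label are the console's static imports of T.id and T.label.

```groovy
// Graph creation from the question, with each mid-traversal g.V() replaced
// by the anonymous V() (assumption: the graph accepts custom string ids).
g.addV('Orange').property(id, 'Orange').property('level3', 'Orange').property('level2', 'Fruit').property('level1', 'Food').
  addV('Lettuce').property(id, 'Lettuce').property('level3', 'Lettuce').property('level2', 'Vegetable').property('level1', 'Food').
  addV('Onion').property(id, 'Onion').property('level3', 'Onion').property('level2', 'Vegetable').property('level1', 'Food').
  addV('Phad Thai').property(id, 'Phad Thai').property('level3', 'Spoon').property('level2', 'Thai').property('level1', 'Dishes').
  addV('Larb Gai').property(id, 'Larb Gai').property('level3', 'Fork').property('level2', 'Thai').property('level1', 'Dishes').
  addV('Orange Cake').property(id, 'Orange Cake').property('level3', 'Orange Crepe').property('level2', 'Desert').property('level1', 'Dishes').
  addV('Ceasars Salad').property(id, 'Ceasars Salad').property('level3', 'Ceasars Salad').property('level2', 'Salads').property('level1', 'Dishes').
  addE('used_in').from(V().has(id, 'Orange')).to(V().has(id, 'Orange Cake')).
  addE('used_in').from(V().has(id, 'Lettuce')).to(V().has(id, 'Ceasars Salad')).
  addE('used_in').from(V().has(id, 'Onion')).to(V().has(id, 'Phad Thai')).
  addE('used_in').from(V().has(id, 'Onion')).to(V().has(id, 'Larb Gai')).
  addE('used_in').from(V().has(id, 'Lettuce')).to(V().has(id, 'Larb Gai')).
  iterate()

// Walk every used_in edge and record the path vertex -> edge -> vertex.
// The by() modulators are applied round-robin: vertices are shown by their
// level2 value and the edge by its label. dedup() collapses repeated paths.
g.V().outE('used_in').inV().
  path().by('level2').by(label).
  dedup()
```

Each resulting path is a level2 value, the edge label, and another level2 value, which is the shape of the Fruit -> used_in -> Desert relationships in the question. To key on the combination of level1 and level2 instead, each by() modulator would need to project both properties.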