Graph DB equivalent of bipartite network projection?


Suppose a network with two kinds of nodes, say users and places, connected by the relationship "has been in". This is a bipartite network, and a module such as networkx can usually produce its "projection" in either of the two directions: the network of places (with each link weighted by the number of common users) or the network of users (with each link weighted by the number of common places).
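
To make the target concrete, here is a minimal plain-Python sketch of the weighted projection described above, with no graph database involved. The `visits` data and the `project` helper are hypothetical names made up for illustration; they are not part of networkx or of any graph DB API.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical "has been in" edges: (user, place).
visits = [
    ("ann", "cafe"), ("bob", "cafe"), ("cat", "cafe"),
    ("ann", "park"), ("bob", "park"),
    ("cat", "museum"),
]

def project(edges, keep=0):
    """Project a bipartite edge list onto one side.

    keep=0 keeps the first element of each edge (users here),
    keep=1 keeps the second (places). The weight of a link is the
    number of nodes on the other side shared by its two endpoints.
    """
    drop = 1 - keep
    # Group the surviving nodes by the node that gets projected out.
    members = defaultdict(set)
    for edge in edges:
        members[edge[drop]].add(edge[keep])
    # Every pair inside a group shares that projected-out node once.
    weights = defaultdict(int)
    for group in members.values():
        for a, b in combinations(sorted(group), 2):
            weights[(a, b)] += 1
    return dict(weights)

user_net = project(visits, keep=0)   # users linked by common places
place_net = project(visits, keep=1)  # places linked by common users
```

With this toy data, `ann` and `bob` end up linked with weight 2 (they share the cafe and the park), which is the kind of result the graph DB traversal should reproduce.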

How am I supposed to produce such networks in a graph database? Could you provide examples for the most common open-source ones, say OrientDB, Neo4j, FlockDB...?

More specifically, how can it be done with Gremlin?

For that case (Gremlin) I have posted an answer myself, but it starts from the surviving nodes. It would be more efficient to start from the nodes that are going to be projected out, since the first step is usually some interval subselection:

g.V.filter{it.date=='3/3/2003'}.filter{it.type=='place'}....

so that we are only interested in the network of users who are related by having been in the same place on a given day, or within some other interval.


There are 2 best solutions below


I suggest Marko Rodriguez's blog, which contains many examples of this kind of use case. Marko is also the author of Gremlin, and both OrientDB and Neo4j support it.


Starting from the nodes that survive the projection, I have found a way in Gremlin (it also works through the OrientDB REST interface):

g.V.filter{it.type=='user'}.as('a').out('checkedIn_at').in('checkedIn_at')
.as('b').simplePath.select(['a','b']).groupCount(){it.name}.cap()

Since this works as an answer, I am answering my own question :-D

In some situations you have a huge database, of which the graph is only a subselection. In that case I would prefer an answer starting from something like

g.V.filter{it.date=='3/3/2003'}.filter{it.type=='place'}....

My guess here is:

...sideEffect{x=it}.in.as('a').transform{x}.in.as('b').select(['a','b'])
.groupCount(){it.name}.cap()
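
The intent of starting from the filtered places can be sketched in plain Python as follows. This is only an illustration of the logic, not a translation of the Gremlin steps: `checkins` and `users_linked_on` are hypothetical names, and the date is assumed to be stored on each check-in record, whereas the traversal above filters on a vertex property. It also counts each unordered pair once, while the `as('a')`/`as('b')` pattern would emit ordered pairs.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical check-ins: (user, place, date).
checkins = [
    ("ann", "cafe", "3/3/2003"),
    ("bob", "cafe", "3/3/2003"),
    ("cat", "cafe", "4/3/2003"),
    ("ann", "park", "3/3/2003"),
    ("bob", "park", "3/3/2003"),
    ("cat", "park", "3/3/2003"),
]

def users_linked_on(records, date):
    """Count, for each pair of users, the places both visited on `date`."""
    # Step 1: subselect by date and group users by the place to project out.
    visitors = defaultdict(set)
    for user, place, when in records:
        if when == date:
            visitors[place].add(user)
    # Step 2: fan out from each selected place to every pair of its visitors.
    pair_counts = defaultdict(int)
    for users in visitors.values():
        for a, b in combinations(sorted(users), 2):
            pair_counts[(a, b)] += 1
    return dict(pair_counts)

same_day = users_linked_on(checkins, "3/3/2003")
```

Only the places that pass the date filter are ever expanded, which is the efficiency argument made above for starting from the projected-out side.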

The remaining problem is that these patterns do not allow for arbitrary projection functions. A solution, I think, could be to list the common vertices for each pair of users, exploiting the versatility of groupBy:

g.V.filter{it.type=='place'}.sideEffect{x=it}.in.as('a').transform{x}.in
.as('b').select(['a','b']).groupBy{[it[0],it[1]]}{x}.cap

groupBy, with a third parameter for post-processing, allows for a lot of MapReduce patterns.
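
As a rough illustration of that key / value / post-processing idea, here is a plain-Python sketch. It is not Gremlin's `groupBy` itself: `project_with` and the `visits` data are hypothetical, and the `reduce_fn` argument only plays the role that the third closure would play.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical "has been in" edges: (user, place).
visits = [
    ("ann", "cafe"), ("bob", "cafe"), ("cat", "cafe"),
    ("ann", "park"), ("bob", "park"),
    ("cat", "museum"),
]

def project_with(edges, reduce_fn):
    """Group shared places by user pair, then post-process each group."""
    members = defaultdict(set)
    for user, place in edges:
        members[place].add(user)
    # Key: the pair of users. Value: each place they have in common.
    shared = defaultdict(list)
    for place in sorted(members):
        for pair in combinations(sorted(members[place]), 2):
            shared[pair].append(place)
    # Post-processing step: any function of the list of common places.
    return {pair: reduce_fn(places) for pair, places in shared.items()}

common_places = project_with(visits, list)  # keep the full list
weights = project_with(visits, len)         # classic count-based weight
```

Passing `len` recovers the usual weighted projection, while any other function of the list of common places gives a custom projection.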