I am doing a complex calculation in networkx that involves computing some quantity repeatedly for each node in the network. As an example, suppose we want to calculate the average degree of the neighbors of each node and save that value as a node attribute. The following snippet works for me:
import networkx as nx

G = nx.erdos_renyi_graph(10, 0.5)

def ave_nbr_deg(node):
    value = 0.
    for nbr in G.neighbors(node):
        value += G.degree(nbr)
    G.node[node]['ave_nbr_deg'] = value/len(G.neighbors(node))

for node in G.nodes():
    ave_nbr_deg(node)

print G.nodes(data = True)
This gives me:
[(0, {'ave_nbr_deg': 5.4}), (1, {'ave_nbr_deg': 5.0}), (2, {'ave_nbr_deg': 5.333333333333333}), (3, {'ave_nbr_deg': 5.2}), (4, {'ave_nbr_deg': 5.6}), (5, {'ave_nbr_deg': 5.6}), (6, {'ave_nbr_deg': 5.2}), (7, {'ave_nbr_deg': 5.25}), (8, {'ave_nbr_deg': 5.5}), (9, {'ave_nbr_deg': 5.5})]
I already have a small question here: the object G is created outside the function ave_nbr_deg, and I do not understand how the function can access it even though I have not declared it global.
Now I want to use the Parallel Python (pp) module to use all the cores on my system for this calculation. After making some changes to the code above, I have the following:
import networkx as nx
import pp

G = nx.erdos_renyi_graph(10, 0.5)

def ave_nbr_deg(node):
    value = 0.
    for nbr in G.neighbors(node):
        value += G.degree(nbr)
    G.node[node]['ave_nbr_deg'] = value/len(G.neighbors(node))

job_server = pp.Server(ppservers = ())
print "Starting pp with", job_server.get_ncpus(), "workers"

for node in G.nodes():
    job_server.submit(ave_nbr_deg, args = (node,))()

print G.nodes(data = True)
But it raises the following error:
An error has occured during the function execution
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/ppworker.py", line 90, in run
    __result = __f(*__args)
  File "<string>", line 5, in ave_nbr_deg
NameError: global name 'G' is not defined
I tried all sorts of things, including the module name nx in submit, but I cannot work out what exactly the problem is. Unfortunately, the smp documentation is too brief to solve this problem. I would be extremely grateful if anybody here could help me with this.
Thanks in advance
To answer your first question: Python first looks for a name in the function's local namespace. If it does not find it there, it moves outward through any enclosing scopes to the module's global namespace (and finally the built-ins). That is how it finds G, even though G is not declared inside the function. A global declaration is only needed when the function assigns a new object to the name. Here's more information on variable scopes: Python scoping
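As a small illustration of this lookup rule, the snippet below shows a function reading a module-level object, and even modifying it in place, without any global declaration. That is what ave_nbr_deg does with G. The names settings, scaled and set_scale are made up for this example.

```python
settings = {"scale": 2}  # module-level object, like G in the question

def scaled(x):
    # `settings` is not a local name, so the lookup falls through
    # to the module's global namespace.
    return x * settings["scale"]

def set_scale(value):
    # This mutates the existing object; it does not rebind the name
    # `settings`, so no `global` declaration is needed.
    settings["scale"] = value

print(scaled(3))   # 6
set_scale(5)
print(scaled(3))   # 15
```

Only a function that rebinds the name itself (for example `settings = {}` inside the function) would need `global settings`.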
There are probably two solutions:
1 - pass G as an argument to the function ave_nbr_deg, e.g.
2 - declare G as a global variable in the submit args (from the Parallel Python documentation)
In this instance, calling the following should work:
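For the first option, one possible shape is sketched below. It rests on two assumptions. First, the traceback shows the function being executed from "<string>" in ppworker.py, which suggests the worker does not share the main script's namespace, so everything the function needs has to arrive through its arguments. Second, anything a worker changes on its own copy of the graph would not be visible in the main process, so the function returns the value instead of storing it. To keep the sketch runnable without pp or networkx, toy_graph is a hypothetical stand-in: a plain dict of neighbor sets. The function only uses graph[node], which a networkx graph also supports (G[node] gives the neighbors), and len(graph[nbr]) matches the degree as long as there are no self-loops.

```python
def ave_nbr_deg(graph, node):
    """Return the average degree of node's neighbors.

    The graph is passed in explicitly instead of being read from a global.
    """
    nbrs = list(graph[node])
    if not nbrs:
        return 0.0  # isolated node: nothing to average over
    total = sum(len(graph[nbr]) for nbr in nbrs)
    return float(total) / len(nbrs)

# Hypothetical stand-in for G: edges 0-1, 1-2 and 1-3.
toy_graph = {0: {1}, 1: {0, 2, 3}, 2: {1}, 3: {1}}

# The caller, not the worker function, stores the results.
results = {node: ave_nbr_deg(toy_graph, node) for node in toy_graph}
print(results)
```

With pp, the same function would presumably be submitted with args = (G, node), and the value returned by each job written back to G.node[node]['ave_nbr_deg'] in the main process. Calling the object returned by submit waits for that job, so submitting all jobs first and collecting the results afterwards is what would let them run side by side.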