I am applying the same function to multiple independent objects and I'd like to do it in parallel. The problem is the function modifies one of its arguments. This is fine for map but not pmap. Here is a minimal reproducible example:
@everywhere function testmod!(a,μ)
    for i=1:length(a)
        a[i]=i*μ
    end
    b=copy(a)
    return b
end
myarrays=[zeros(Float64,10) for i=1:10]
pmap((a1,a2)->testmod!(a1,a2),myarrays,[i for i=1:10])
This toy function modifies the elements of the input array a. I'll compare the results of map and pmap:
map
julia> myarrays=[zeros(Float64,10) for i=1:10]
10-element Array{Array{Float64,1},1}:
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
julia> map((a1,a2)->testmod!(a1,a2),myarrays,[i for i=1:10])
10-element Array{Array{Float64,1},1}:
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]
[2.0,4.0,6.0,8.0,10.0,12.0,14.0,16.0,18.0,20.0]
[3.0,6.0,9.0,12.0,15.0,18.0,21.0,24.0,27.0,30.0]
[4.0,8.0,12.0,16.0,20.0,24.0,28.0,32.0,36.0,40.0]
[5.0,10.0,15.0,20.0,25.0,30.0,35.0,40.0,45.0,50.0]
[6.0,12.0,18.0,24.0,30.0,36.0,42.0,48.0,54.0,60.0]
[7.0,14.0,21.0,28.0,35.0,42.0,49.0,56.0,63.0,70.0]
[8.0,16.0,24.0,32.0,40.0,48.0,56.0,64.0,72.0,80.0]
[9.0,18.0,27.0,36.0,45.0,54.0,63.0,72.0,81.0,90.0]
[10.0,20.0,30.0,40.0,50.0,60.0,70.0,80.0,90.0,100.0]
julia> myarrays
10-element Array{Array{Float64,1},1}:
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]
[2.0,4.0,6.0,8.0,10.0,12.0,14.0,16.0,18.0,20.0]
[3.0,6.0,9.0,12.0,15.0,18.0,21.0,24.0,27.0,30.0]
[4.0,8.0,12.0,16.0,20.0,24.0,28.0,32.0,36.0,40.0]
[5.0,10.0,15.0,20.0,25.0,30.0,35.0,40.0,45.0,50.0]
[6.0,12.0,18.0,24.0,30.0,36.0,42.0,48.0,54.0,60.0]
[7.0,14.0,21.0,28.0,35.0,42.0,49.0,56.0,63.0,70.0]
[8.0,16.0,24.0,32.0,40.0,48.0,56.0,64.0,72.0,80.0]
[9.0,18.0,27.0,36.0,45.0,54.0,63.0,72.0,81.0,90.0]
[10.0,20.0,30.0,40.0,50.0,60.0,70.0,80.0,90.0,100.0]
This works as desired. In contrast, with pmap:
pmap
julia> myarrays=[zeros(Float64,10) for i=1:10]
10-element Array{Array{Float64,1},1}:
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
julia> pmap((a1,a2)->testmod!(a1,a2),myarrays,[i for i=1:10])
10-element Array{Any,1}:
[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0]
[2.0,4.0,6.0,8.0,10.0,12.0,14.0,16.0,18.0,20.0]
[3.0,6.0,9.0,12.0,15.0,18.0,21.0,24.0,27.0,30.0]
[4.0,8.0,12.0,16.0,20.0,24.0,28.0,32.0,36.0,40.0]
[5.0,10.0,15.0,20.0,25.0,30.0,35.0,40.0,45.0,50.0]
[6.0,12.0,18.0,24.0,30.0,36.0,42.0,48.0,54.0,60.0]
[7.0,14.0,21.0,28.0,35.0,42.0,49.0,56.0,63.0,70.0]
[8.0,16.0,24.0,32.0,40.0,48.0,56.0,64.0,72.0,80.0]
[9.0,18.0,27.0,36.0,45.0,54.0,63.0,72.0,81.0,90.0]
[10.0,20.0,30.0,40.0,50.0,60.0,70.0,80.0,90.0,100.0]
julia> myarrays
10-element Array{Array{Float64,1},1}:
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
[0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
Clearly myarrays has not been modified. Is there any way to achieve this with pmap, or can you only return things with pmap?
First, it is unnecessary to write (a1,a2)->testmod!(a1,a2); you can just pass testmod! directly, i.e.
pmap(testmod!,myarrays,[i for i=1:10])
Second, the return value is not used in either case. In the first (map) example you are just modifying the array in place. In the pmap example, the array has to be copied to each worker process, so it is only being modified on the worker process itself.

What does matter for map or pmap (and the intended way of using them) is the value that is returned by each function call (which is the same in both cases).

You may notice that myarrays is changed when using pmap if you don't have any worker processes. This is because the array was not copied anywhere and all modifications were done locally.

Perhaps you could make use of a SharedArray to achieve what you want, or restructure your example so your ultimate goal is more clear.

tl;dr -- pmap can modify its arguments, but the result probably isn't what you expect.
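To illustrate relying on the returned values instead of on in-place mutation, here is a minimal sketch. It assumes Julia 1.x, where pmap lives in the Distributed standard library (the question's output looks like it comes from an older release), and the name results is only illustrative. The idea is to let each worker return its modified array and then copy that back into the local one:

```julia
using Distributed  # assumption: Julia 1.x, where pmap is in the Distributed stdlib

# Same toy function as in the question, defined on every process.
# It returns the array it modified, so the result travels back to the caller.
@everywhere function testmod!(a, μ)
    for i in 1:length(a)
        a[i] = i * μ
    end
    return a
end

myarrays = [zeros(Float64, 10) for i in 1:10]

# Each worker mutates its own copy; only the return values come back.
results = pmap(testmod!, myarrays, 1:10)

# Write the returned values into the local arrays explicitly.
for (dst, src) in zip(myarrays, results)
    copyto!(dst, src)
end
```

This should behave the same with or without worker processes, since it never depends on the worker's mutation being visible locally. The cost is that every array is sent to a worker and sent back, so for large arrays a SharedArray may still be the better fit.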