I need this function to be optimized as I am trying to make my OpenGL simulation run faster. I want to use Parakeet, but I can't quite understand in what way I would need to modify the code below in order to do so. Can you see what I should do?
def distanceMatrix(self,x,y,z):
" ""Computes distances between all particles and places the result in a matrix such that the ij th matrix entry corresponds to the distance between particle i and j"" "
xtemp = tile(x,(self.N,1))
dx = xtemp - xtemp.T
ytemp = tile(y,(self.N,1))
dy = ytemp - ytemp.T
ztemp = tile(z,(self.N,1))
dz = ztemp - ztemp.T
# Particles 'feel' each other across the periodic boundaries
if self.periodicX:
dx[dx>self.L/2]=dx[dx > self.L/2]-self.L
dx[dx<-self.L/2]=dx[dx < -self.L/2]+self.L
if self.periodicY:
dy[dy>self.L/2]=dy[dy>self.L/2]-self.L
dy[dy<-self.L/2]=dy[dy<-self.L/2]+self.L
if self.periodicZ:
dz[dz>self.L/2]=dz[dz>self.L/2]-self.L
dz[dz<-self.L/2]=dz[dz<-self.L/2]+self.L
# Total Distances
d = sqrt(dx**2+dy**2+dz**2)
# Mark zero entries with negative 1 to avoid divergences
d[d==0] = -1
return d, dx, dy, dz
From what I can tell, Parakeet should be able to use the above function without modifications - it only uses Numpy and math. But, I always get the following error when calling the function from the Parakeet jit wrapper:
AssertionError: Unsupported function: <bound method Particles.distanceMatrix of <particles.Particles instance at 0x04CD8E90>>
Parakeet is still young, its NumPy support is incomplete, and your code touches on several features that don't yet work.
1) You're wrapping a method, while Parakeet so far only knows how to deal with functions. The common workaround is to make a @jit wrapped helper function and have your method call into that with all of the required member data. The reason that methods don't work is that it's non-trivial to assign a meaningful type to 'self'. It's not impossible, but tricky enough that methods won't make their way into Parakeet until lower hanging fruit are plucked. Speaking of low-hanging fruit...
2) Boolean indexing. Not yet implemented but will be in the next release.
3) np.tile: Also doesn't work, will also probably be in the next release. If you want to see which builtins and NumPy library functions will work, take a look at Parakeet's mappings module.
I rewrote your code to be a little friendlier to Parakeet:
On my machine this runs only ~3x faster than NumPy for N = 2000 (0.39s for NumPy vs. 0.14s for Parakeet). If I rewrite the array traversals to use loops more explicitly then the performance goes up to ~6x faster than NumPy (Parakeet runs in ~0.06s):
With a little creative rewriting you can also get the above code running in Numba, but it only goes ~1.5x faster than NumPy (0.25 seconds). The compile times were Parakeet w/ comprehensions: 1 second, Parakeet w/ loops: .5 seconds, Numba w/ loops: 0.9 seconds.
Hopefully the next few releases will enable more idiomatic use of NumPy library functions, but for now comprehensions or loops are often the way to go.