I tried the solution presented here on Stack Overflow by user henry-gomersall to reproduce the speed-up of FFT-based convolution, but obtained a different result.

import numpy as np
import pyfftw
import scipy.signal
import timeit

class CustomFFTConvolution(object):

    def __init__(self, A, B, threads=1):

        shape = (np.array(A.shape) + np.array(B.shape))-1

        if np.iscomplexobj(A) and np.iscomplexobj(B):
            self.fft_A_obj = pyfftw.builders.fftn(
                    A, s=shape, threads=threads)
            self.fft_B_obj = pyfftw.builders.fftn(
                    B, s=shape, threads=threads)
            self.ifft_obj = pyfftw.builders.ifftn(
                    self.fft_A_obj.get_output_array(), s=shape,
                    threads=threads)

        else:
            self.fft_A_obj = pyfftw.builders.rfftn(
                    A, s=shape, threads=threads)
            self.fft_B_obj = pyfftw.builders.rfftn(
                    B, s=shape, threads=threads)
            self.ifft_obj = pyfftw.builders.irfftn(
                    self.fft_A_obj.get_output_array(), s=shape,
                    threads=threads)

    def __call__(self, A, B):

        fft_padded_A = self.fft_A_obj(A)
        fft_padded_B = self.fft_B_obj(B)

        return self.ifft_obj(fft_padded_A * fft_padded_B)

N = 200

A = np.random.rand(N, N, N)
B = np.random.rand(N, N, N)

start_time = timeit.default_timer()

C = scipy.signal.fftconvolve(A,B,"same")
print timeit.default_timer() - start_time

custom_fft_conv_nthreads = CustomFFTConvolution(A, B, threads=1)
C = custom_fft_conv_nthreads(A, B)
print timeit.default_timer() - start_time

PyFFTW is approx. 7x slower than SciPy's FFT convolution, which differs from other users' experiences. What is wrong with this code? Python 2.7.9, PyFFTW 0.9.2.
You're not doing what you think you're doing, and what you think you're doing you shouldn't be doing either.
You're not doing what you think you're doing because your code above only defines start_time once (so your test for pyfftw includes not only the time-consuming creation of the CustomFFTConvolution object, but also the scipy convolution!).

You shouldn't be doing what you think you're doing because you should use timeit to test this sort of thing.

So, with some file foo.py:

In ipython, you can get the following:
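
As a rough illustration of the timeit approach, here is a minimal sketch. It is not the foo.py from this answer: the function fft_convolve is a hypothetical stand-in built on numpy.fft so that it runs without pyfftw, and the array size and repeat counts are arbitrary assumptions. The point it shows is that all setup happens outside the timed region, so only the convolution call itself is measured.

```python
import timeit

import numpy as np


def fft_convolve(a, b):
    """Full linear convolution via zero-padded real FFTs (numpy stand-in)."""
    # Output size along each axis is len(a) + len(b) - 1.
    shape = [m + n - 1 for m, n in zip(a.shape, b.shape)]
    axes = tuple(range(a.ndim))
    fa = np.fft.rfftn(a, s=shape, axes=axes)
    fb = np.fft.rfftn(b, s=shape, axes=axes)
    return np.fft.irfftn(fa * fb, s=shape, axes=axes)


# Setup: done once, outside the timed region.
N = 16  # kept small so the sketch runs quickly
A = np.random.rand(N, N, N)
B = np.random.rand(N, N, N)

# Time only the call: 3 repeats of 5 calls each, keep the best repeat.
number = 5
timings = timeit.repeat(lambda: fft_convolve(A, B), repeat=3, number=number)
best_per_call = min(timings) / number
print("best of 3: %.6f s per call" % best_per_call)
```

With the class from the question, the same pattern would mean constructing the CustomFFTConvolution object during setup and passing only the call on it to timeit.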
and with multiple threads:
If you correct your code to do what you think it's doing by inserting start_time = timeit.default_timer() before C = custom_fft_conv_nthreads(A, B), you get something closer to what is expected:
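
A minimal sketch of that correction, using only the standard library: the helper timed and the two workloads are hypothetical stand-ins for the scipy and pyfftw convolutions, assumed here so the example runs anywhere. The timer is restarted for every measured call, so the second measurement no longer includes the first one.

```python
import timeit


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds) for that call only."""
    start = timeit.default_timer()  # timer restarted for each measurement
    result = func(*args)
    elapsed = timeit.default_timer() - start
    return result, elapsed


def first_workload(n):
    # Hypothetical stand-in for the scipy convolution.
    return sum(i * i for i in range(n))


def second_workload(n):
    # Hypothetical stand-in for the pyfftw convolution call.
    return sum(i for i in range(n))


_, t_first = timed(first_workload, 200000)
_, t_second = timed(second_workload, 1000)

print("first:  %.6f s" % t_first)
print("second: %.6f s" % t_second)
```

Note that any one-off setup (such as building the FFT objects) should be done before calling timed, otherwise it is counted again.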