Does anyone know how to speed up this simple code running in parallel (with parfor)?

I do time-consuming simulations involving the following (simplified) code:

K = 10^5; % large number
L = 1000; % smaller number

a = rand(K,L);
b = rand(K,L);
c = rand(L,L);
d = zeros(K,L,L);

parfor m = 1:L-1
    e = zeros(K,L);
    for n = m:L-1
        e(:,n+1) = e(:,n) + (n==m) + e(:,n).*a(:,n).*b(:,n) + a(:,1:n)*c(n,1:n)';
    end
    d(:,:,m) = e;
end

Does anyone know how to speed up this simple code running in parallel (with parfor)?

Each worker needs the matrices a, b, and c, so there is a large parallel overhead.

The overhead is smaller if I send each worker only the part of b it needs (the inner loop starts at m), but I don't think that makes the code much faster.

Because of this overhead, parfor is slower than the serial for-loop. As the number of parfor iterations grows (larger L), a, b, and c grow as well, and so does the overhead. I therefore do not expect parfor to become faster even for large L. Or does anyone see it differently?
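One way to check how much data the loop actually moves is to count the bytes exchanged with the workers. The following is only a minimal sketch, assuming the Parallel Computing Toolbox functions ticBytes and tocBytes are available (R2016b or later). The sizes K and L are deliberately reduced test values: at the question's values, d alone would need K*L*L*8 bytes, about 800 GB.

```matlab
% Reduced, hypothetical sizes so the sketch runs quickly
% (the question uses K = 10^5 and L = 1000).
K = 2000;
L = 50;

a = rand(K, L);
b = rand(K, L);
c = rand(L, L);
d = zeros(K, L, L);

pool = gcp;        % start a parallel pool, or reuse the current one
ticBytes(pool);    % start counting bytes exchanged with the workers
tic
parfor m = 1:L-1
    e = zeros(K, L);
    for n = m:L-1
        e(:, n+1) = e(:, n) + (n == m) + e(:, n) .* a(:, n) .* b(:, n) ...
                    + a(:, 1:n) * c(n, 1:n)';
    end
    d(:, :, m) = e;
end
tParfor = toc;             % wall-clock time of the parfor loop in seconds
bytes = tocBytes(pool);    % one row per worker: [sent to worker, received from worker]
```

Because tocBytes reports both directions, the result shows whether the broadcast of a, b, and c or the return of the K-by-L slices of d accounts for most of the traffic.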

Best answer

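You may gain performance by pre-computing the terms that do not depend on m. The product a.*b can be formed once, and a(:,1:n)*c(n,1:n)' is column n of a multiplied by the transposed lower-triangular part of c: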

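tc = tril(c);      % row n keeps c(n,1:n), the rest is zeroed
ac = a * tc.';     % ac(:,n) equals a(:,1:n)*c(n,1:n).'
ab = a .* b;       % elementwise product, independent of m
for m = 1:L-1
    e = zeros(K,L);
    for n = m:L-1
        e(:,n+1) = e(:,n) + (n==m) + e(:,n) .* ab(:,n) + ac(:,n);
    end
    d(:,:,m) = e;
end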
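As a quick sanity check, the sketch below compares the pre-computed version against the original loop from the question. It is not a benchmark. The sizes K and L are small, hypothetical test values, and the names dRef, dNew, and maxRelErr are introduced here for illustration only.

```matlab
% Reduced, hypothetical test sizes (the question uses K = 10^5, L = 1000).
K = 200;
L = 20;
rng(0);                      % fixed seed so the check is repeatable
a = rand(K, L);
b = rand(K, L);
c = rand(L, L);

% Original formulation from the question, run serially.
dRef = zeros(K, L, L);
for m = 1:L-1
    e = zeros(K, L);
    for n = m:L-1
        e(:, n+1) = e(:, n) + (n == m) + e(:, n) .* a(:, n) .* b(:, n) ...
                    + a(:, 1:n) * c(n, 1:n)';
    end
    dRef(:, :, m) = e;
end

% Pre-computed formulation: ab and ac do not depend on m.
tc = tril(c);
ac = a * tc.';
ab = a .* b;
dNew = zeros(K, L, L);
for m = 1:L-1
    e = zeros(K, L);
    for n = m:L-1
        e(:, n+1) = e(:, n) + (n == m) + e(:, n) .* ab(:, n) + ac(:, n);
    end
    dNew(:, :, m) = e;
end

% Largest deviation relative to the largest reference value.
maxRelErr = max(abs(dNew(:) - dRef(:))) / max(abs(dRef(:)));
```

The two results should agree up to floating-point rounding, since the matrix product sums the same terms in a possibly different order.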