My Julia program does not achieve the performance I expect. It first computes the Cholesky decomposition of a large matrix A using cholfact!, so that A = LL' (up to the pivoting permutation). It then solves Lx = b for many different b using the backslash operator.
Both steps result in direct calls to LAPACK: cholfact! is implemented by pstrf!, and the backslash operator uses trtrs!. These are the correct LAPACK functions to use. However, while pstrf! runs in parallel, trtrs! does not, and the profiler tells me that most of the runtime is spent in trtrs!. The relevant lines of my program are
F = cholfact!(A, :L, pivot = true) # precomputation, executed once
and
x = F[:L]\b[F.piv] # inside a loop, b is computed from x every step
Why do the two LAPACK functions differ in this respect? How can I get parallel execution of trtrs!?