For shallow neural nets, the Levenberg-Marquardt (LM) algorithm does amazingly well.
However, only MATLAB and pyrenn (a Python package) seem to have robust implementations of it, and a problem with both is that they have no GPU support. I also tried neupy (another Python package), but it is not robust and fails when you train for more epochs or on a large dataset. Do you guys know of a good LM Python package for neural nets that can be trained on the GPU?
I'm not really sure that such an implementation would be useful, since for anything except the most trivial networks computing the Hessian is very impractical. However, I think that implementing Levenberg-Marquardt in PyTorch for shallow networks isn't really that hard. At the very least, even if your implementation is not optimal, you'll still get some speedup compared to the CPU version. Here is a very quick and thus very imperfect/suboptimal example that runs on the GPU. The speedup is about 3x for inputs of size `3x20x20` on a 1080 Ti (I don't have free GPUs at the moment to test larger inputs). However, if you have a prior on the loss function (e.g. if it is least squares and you know that the Hessian can be approximated as `2 * Jacobian^T * Jacobian`), then some variant of the code below might be useful. Note that I took the code for `eval_hessian` from here.
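Roughly, something like the following minimal sketch (assuming a scalar-output least-squares problem, and using the Gauss-Newton term `Jᵀ·J` built row by row instead of the full `eval_hessian`; the tiny network, data, and hyperparameters are only placeholders, not the original snippet):

```python
# Minimal LM sketch (not the original snippet): least-squares loss, so the
# Hessian is approximated by the Gauss-Newton term J^T J instead of being
# computed exactly with eval_hessian.
import torch

torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder shallow net and data; the row-by-row Jacobian below only
# makes sense for small models like this.
model = torch.nn.Sequential(
    torch.nn.Linear(3, 20), torch.nn.Tanh(), torch.nn.Linear(20, 1)
).to(device)
x = torch.randn(64, 3, device=device)
y = torch.randn(64, 1, device=device)

params = list(model.parameters())
n_params = sum(p.numel() for p in params)

def residuals():
    return (model(x) - y).view(-1)           # one residual per sample

def jacobian(r):
    # One backward pass per residual: fine for shallow nets, impractical otherwise.
    J = torch.zeros(r.numel(), n_params, device=device)
    for i in range(r.numel()):
        grads = torch.autograd.grad(r[i], params, retain_graph=True)
        J[i] = torch.cat([g.reshape(-1) for g in grads])
    return J

def apply_update(delta, sign):
    # Add/subtract the flat step vector to/from the model parameters.
    offset = 0
    with torch.no_grad():
        for p in params:
            n = p.numel()
            p += sign * delta[offset:offset + n].view_as(p)
            offset += n

lam = 1e-2                                   # LM damping factor
for step in range(50):
    r = residuals()
    loss = (r ** 2).sum()
    J = jacobian(r)
    # Damped normal equations: (J^T J + lam * I) delta = J^T r.
    A = J.T @ J + lam * torch.eye(n_params, device=device)
    delta = torch.linalg.solve(A, J.T @ r)
    apply_update(delta, sign=-1.0)
    with torch.no_grad():
        new_loss = ((model(x) - y) ** 2).sum()
    if new_loss < loss:                      # good step: reduce damping
        lam = max(lam / 10.0, 1e-9)
    else:                                    # bad step: undo it, raise damping
        apply_update(delta, sign=+1.0)
        lam = min(lam * 10.0, 1e9)
    print(step, loss.item(), lam)
```

Solving `(JᵀJ + λI)δ = Jᵀr` absorbs the factor of 2 from the loss into λ, so it doesn't change the behavior. The per-residual loop is the obvious bottleneck; on recent PyTorch versions `torch.func.jacrev` can batch it.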