C++ Math: Fitting a complicated function is not working


I've been trying to get this fitting operation to work for more than 10 days. I wrote a C++ class for fitting using the Levenberg-Marquardt algorithm and tested it with simple polynomials, where the fit succeeds; but with the real experimental function I need for my experiment, it doesn't work.

I fitted the same function with Mathematica, and it worked there (but slow, at 12 seconds per fit, which is why I'm switching to C++). My program acts crazy and returns wrong results when I do the fit. Could you please check the functions I use and tell me whether anything is wrong?

The following is the fitting procedure I use in my class. Again, I'd like to mention that it worked with polynomials, so my assumption is that I've made some deep mistake here that conflicts with only certain special functions.


The initial values I use are very close to the correct ones.

If you need any additional information, please let me know. Any help is very highly appreciated.

2 Answers

Answer (score 12):

Well, a quick look reveals that you should have

    parameters -= SamArray<T>::MatrixProduct(SamArray<T>::Inverse(JTJ + combinationMatrix), J); // apply the step to the parameters and test them

instead of

    parameters += SamArray<T>::MatrixProduct(SamArray<T>::Inverse(JTJ + combinationMatrix), J); // add the new deltas to the parameters and test them

That is, of course, assuming that J is the gradient and not the negative gradient.
This is the case with any Newton or quasi-Newton method (like Levenberg-Marquardt).
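As a minimal sketch of the corrected update step, reusing the question's SamArray calls (computeChiSquare and oldChiSquare are hypothetical names, since the rest of the class is not shown):

    // One Levenberg-Marquardt step: the solved step points along the
    // gradient, so it must be subtracted to move downhill.
    SamArray<T> delta = SamArray<T>::MatrixProduct(
        SamArray<T>::Inverse(JTJ + combinationMatrix), J);
    parameters -= delta;
    T newChiSquare = computeChiSquare(parameters); // hypothetical helper
    if (newChiSquare > oldChiSquare)
        parameters += delta; // reject the step and increase the damping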

Answer (score 9):

Rather than discussing why the code isn't giving you the result you expect, I'm going to address the speed issues. In your code you have

    parameters += SamArray<T>::MatrixProduct(SamArray<T>::Inverse(JTJ + combinationMatrix), J);

which first calculates a matrix inverse and then multiplies it by J, which I presume is another matrix. This is inefficient, as it is fundamentally the same as solving A X = I first and then calculating A^-1 B, where A and X are matrices and I is the identity matrix. Instead, it is better to solve A X = B directly. There are several decomposition schemes that enable you to do this, such as QR and LU. In LU decomposition, the matrix A is factored into an upper triangular part, U, and a lower triangular part, L, which are then efficiently inverted separately, as follows:

    A X = L U X = B
        U X = L^-1 B
          X = U^-1 L^-1 B

and this is performed as part of the algorithm itself. If you want the inverse directly, then you would replace B with I, but as you can see, there is no need. Looking at the Levenberg-Marquardt algorithm, it may be the case that Transpose[J].J + λ DiagonalMatrix[Diagonal[Transpose[J].J]], using Mathematica notation, is positive definite, in which case Cholesky decomposition is available, and it is significantly faster. But I have not looked closely at the algorithm, so Cholesky decomposition may not be usable in this case.
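For a concrete sketch of the difference, here is what the three options look like using Eigen (a library chosen here for illustration; the question uses its own SamArray class), where A stands for JTJ + combinationMatrix and b for the right-hand-side vector:

    #include <Eigen/Dense>

    // A is the square system matrix (JTJ + combinationMatrix in the question),
    // b the right-hand side; both names are assumptions.
    Eigen::VectorXd invertThenMultiply(const Eigen::MatrixXd& A,
                                       const Eigen::VectorXd& b) {
        return A.inverse() * b;           // explicit inverse: wasteful
    }

    Eigen::VectorXd solveWithLU(const Eigen::MatrixXd& A,
                                const Eigen::VectorXd& b) {
        return A.partialPivLu().solve(b); // LU with partial pivoting
    }

    Eigen::VectorXd solveWithCholesky(const Eigen::MatrixXd& A,
                                      const Eigen::VectorXd& b) {
        return A.llt().solve(b);          // valid only if A is positive definite
    }

All three return the same step, but the solve-based versions avoid forming A^-1 explicitly.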

Edit: In sec. 1.5 of ch. 3, Stewart discusses the relative speeds of the two approaches, invert-then-multiply vs. LU decomposition. Both have the same asymptotic complexity, O(n^3), but they differ in the leading coefficient. Specifically, when solving the equation A X == B, invert-then-multiply has an operation count of 5 n^3 / 6 + l n^2 while LU has n^3 / 3 + l n^2, where n is the number of rows in B and l is the number of columns. The ratio of the two is always greater than one, and even when l == n, LU decomposition is roughly 30% faster than invert-then-multiply.
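As a quick arithmetic check of that figure: with l == n, invert-then-multiply costs (5/6 + 1) n^3 = 11 n^3 / 6 operations while LU costs (1/3 + 1) n^3 = 4 n^3 / 3, a ratio of 11/8 ≈ 1.375, i.e., LU uses about 27% fewer operations, consistent with the quoted figure.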

Secondly, you are calculating chi-square twice for each set of parameters: once when they are determined initially, and again when you compare them against the next iteration's. This value should be cached alongside the previously accepted parameters.
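A minimal sketch of that caching, with hypothetical names (computeChiSquare, computeStep) since the class code is not shown:

    // Keep chi-square paired with the accepted parameters so each
    // candidate parameter set is evaluated exactly once.
    T currentChiSquare = computeChiSquare(parameters); // evaluated once
    for (int iter = 0; iter < maxIterations; ++iter) {
        SamArray<T> candidate = parameters;
        candidate -= computeStep(candidate);           // hypothetical step helper
        T candidateChiSquare = computeChiSquare(candidate);
        if (candidateChiSquare < currentChiSquare) {
            parameters = candidate;
            currentChiSquare = candidateChiSquare;     // reuse the cached value
        }
    }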

Lastly, I would not discount Mathematica's ability to handle fitting at a reasonable speed.