There are some answered questions on fitting Bezier curves to data, but I'm looking for an algorithm rather than a ready-made solution: something I could implement and adapt to my other requirements.
Input: a small set of 2D data points, a dozen or so; the pre-defined Bezier endpoints.
Output: a planar cubic Bezier curve with the specified endpoints that minimizes mean square distance to the data points.
What I thought of: compute the squared distance from a data point to the parametrized Bezier; take its derivative with respect to the parameter t; solve for the t where the derivative is 0 to get the closest point on the curve; sum the resulting squared distances over all the data points; take the derivative of that sum with respect to the free control points; optimize that.
Where the problem lies: when you take the derivative of the squared distance from a point to a cubic Bezier curve, you get a 5th-degree polynomial equation. There is no analytic solution for that, so you have to iterate with Newton-Raphson to get a numerical one. Because of that, when you sum the distances to all the data points and optimize the free Bezier control points, you have to optimize numerically with a gradient-free algorithm, and that tends to be slow. This is doable, but takes far more compute time than should be necessary. What makes the runtime issue even worse is that the problem needs to be solved for millions of Bezier curves, each fitting a dozen or so of its own data points. If each Bezier requires a numerical optimization wrapped around another numerical optimization, the solution becomes unacceptably slow.
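To illustrate the inner step described above (not a solution to the runtime problem), here is a minimal Newton-Raphson projection of a point onto a cubic Bezier. The function name, the starting guess t0, and the clamping of t to [0, 1] are my own illustrative choices:

```python
import numpy as np

def closest_t(p0, c1, c2, p3, point, t0=0.5, iters=20):
    """Newton-Raphson on f(t) = d/dt |B(t) - point|^2, the 5th-degree equation."""
    t = t0
    for _ in range(iters):
        u = 1.0 - t
        b   = u**3 * p0 + 3*u**2*t * c1 + 3*u*t**2 * c2 + t**3 * p3  # B(t)
        db  = 3*u**2 * (c1 - p0) + 6*u*t * (c2 - c1) + 3*t**2 * (p3 - c2)  # B'(t)
        ddb = 6*u * (c2 - 2*c1 + p0) + 6*t * (p3 - 2*c2 + c1)              # B''(t)
        r = b - point
        f  = 2.0 * (r @ db)               # derivative of the squared distance
        df = 2.0 * (db @ db + r @ ddb)    # its derivative, for the Newton step
        if abs(df) < 1e-12:
            break
        t = min(1.0, max(0.0, t - f / df))
    return t
```

Note that this finds only the stationary point nearest the starting guess; for curves with loops or tight bends, several starting values of t0 may be needed to find the global closest point.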
Therefore I'm looking for a reasonably fast, preferably analytic solution for Bezier curve fitting. If an exact solution is impossible then a reasonably fast numerical one would do.
Simple approach: use gradient descent to solve for the control points at the same time as you solve for where the closest points are.
You are given start_point, end_point, and data_points. Come up with some semi-reasonable initial guess for c1, c2, and t, where c1 and c2 are the control points and t[i] is a parameter for a point on the Bezier curve that is fairly close to data_points[i]. Now use gradient descent to minimize the sum of the squares of the errors.
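A minimal numpy sketch of this joint descent, assuming the analytic gradients of the summed squared error with respect to c1, c2, and each t[i]; the initial guesses (control points at thirds of the chord, t from chord projection), learning rates, and iteration count are illustrative choices, not part of the answer:

```python
import numpy as np

def bezier(p0, c1, c2, p3, t):
    """Evaluate a cubic Bezier at an array of parameter values t."""
    t = t[:, None]
    return ((1 - t)**3 * p0 + 3*(1 - t)**2 * t * c1
            + 3*(1 - t) * t**2 * c2 + t**3 * p3)

def bezier_deriv(p0, c1, c2, p3, t):
    """Derivative of the cubic Bezier with respect to t."""
    t = t[:, None]
    return (3*(1 - t)**2 * (c1 - p0) + 6*(1 - t)*t * (c2 - c1)
            + 3*t**2 * (p3 - c2))

def fit_bezier(p0, p3, data, iters=5000, lr_c=0.05, lr_t=0.005):
    """Jointly descend on c1, c2 and the per-point parameters t."""
    chord = p3 - p0
    c1 = p0 + chord / 3
    c2 = p0 + 2 * chord / 3
    t = np.clip((data - p0) @ chord / (chord @ chord), 0.0, 1.0)
    for _ in range(iters):
        r = bezier(p0, c1, c2, p3, t) - data   # residuals, shape (n, 2)
        tc = t[:, None]
        # Gradient of sum |r|^2 w.r.t. the free control points.
        g1 = 2 * np.sum(3*(1 - tc)**2 * tc * r, axis=0)
        g2 = 2 * np.sum(3*(1 - tc) * tc**2 * r, axis=0)
        # Gradient w.r.t. each t[i] is 2 * r_i . B'(t_i).
        gt = 2 * np.sum(r * bezier_deriv(p0, c1, c2, p3, t), axis=1)
        c1 -= lr_c * g1
        c2 -= lr_c * g2
        t = np.clip(t - lr_t * gt, 0.0, 1.0)
    return c1, c2, t
```

Note that for fixed t the problem is linear least squares in c1 and c2, so plain gradient descent could be swapped for alternating an exact linear solve with the t updates if the per-curve cost still matters.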