Algorithm to find desired direction with minimum amount of iterations

There are three components to this problem:

  1. A three dimensional vector A.
  2. A "smooth" function F.
  3. A desired vector B (also three dimensional).

We want to find a vector A that, when put through F, will produce the vector B.

F(A) = B

F can be anything that transforms or distorts A in some manner. The point is that we want to iteratively evaluate F on candidate vectors A until one produces B.

The question is:

How can we do this with the fewest possible calls to F before finding a vector that equals B (within a reasonable threshold)?

There are 3 answers below.

Answer 1:

I am assuming that what you call "smooth" is tantamount to being differentiable. Since the concept of smoothness only makes sense over the real numbers, I will also assume that you are solving a floating-point problem.

In this case, I would formulate the problem as a nonlinear programming problem, i.e. minimizing the squared norm of the difference between F(A) and B, given by

(F(A)_1 - B_1)² + (F(A)_2 - B_2)² + (F(A)_3 - B_3)²

It should be clear that this expression is zero if and only if F(A) = B and positive otherwise. Therefore you want to minimize it.

As an example, you could use the solvers built into the SciPy optimization suite (available for Python):

from scipy.optimize import minimize

# Example function F
f = lambda x : [x[0] + 1, x[2], 2*x[1]]

# Target vector B (chosen to match the solution reported below)
B = [0, 0, 5]

# Optimization objective: squared norm of f(x) - B
fsq = lambda x : sum((v - b)**2 for v, b in zip(f(x), B))

# Initial guess
x0 = [0, 0, 0]

res = minimize(fsq, x0, tol=1e-6)

# res.x is the solution, in this case
# array([-1.00000000e+00,  2.49999999e+00, -5.84117172e-09])

A binary search (as pointed out above) only works if the function is one-dimensional, which is not the case here. You can try out different optimization methods by adding method="name" to the call to minimize; see the API documentation. It is not always clear which method works best for your problem without knowing more about the nature of your function. As a rule of thumb, the more information you give to the solver, the better. If you can compute the derivative of F explicitly, passing it to the solver will help reduce the number of required evaluations. If F has a Hessian (i.e., if it is twice differentiable), providing the Hessian will help as well.
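
As an illustration, here is a minimal sketch of passing an analytic gradient to minimize, reusing the example f, B and x0 from above. The Jacobian of that toy f is written out by hand, so this applies only to the example:

import numpy as np

# Gradient of the squared objective: d/dx ||f(x) - B||^2 = 2 * J_f(x)^T (f(x) - B)
def fsq_grad(x):
    r = np.array(f(x)) - np.array(B)   # residual f(x) - B
    J = np.array([[1.0, 0.0, 0.0],     # Jacobian of the example f above
                  [0.0, 0.0, 1.0],
                  [0.0, 2.0, 0.0]])
    return 2 * J.T @ r

res = minimize(fsq, x0, jac=fsq_grad, tol=1e-6)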

As an alternative, you can use the least_squares function on the residual F(A) - B directly, e.g. res = least_squares(residual, x0). This could be faster, since the solver can take advantage of the fact that you are solving a least-squares problem rather than a generic optimization problem.
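
A minimal sketch, again assuming the example f, B and x0 defined above:

from scipy.optimize import least_squares

# least_squares minimizes the squared norm of this residual internally
residual = lambda x: [v - b for v, b in zip(f(x), B)]
res = least_squares(residual, x0)

# res.x again approaches [-1, 2.5, 0]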

From a more general standpoint, the problem of recovering the function arguments that produce a given value is called an Inverse Problem. These problems have been studied extensively.

Answer 2:

It seems like you could try a metaheuristic approach for this. A genetic algorithm (GA) might be a good fit: you can initialize a number of A vectors to form a population and use the GA to evolve that population, so that each generation contains better vectors, i.e. vectors Ai for which F(Ai) is closer to B. Your fitness function can be a simple function that compares F(Ai) to B, and you can choose how to mutate your population in each generation. A minimal sketch follows the linked example below.

A simple example of a GA can be found here: link
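
As a rough illustration (not the linked example), here is a minimal mutation-only GA sketch in Python. The function F, the target B, the population size, the mutation scale and the generation count are all illustrative placeholders:

import random

# Illustrative placeholders for F and the target B
F = lambda a: [a[0] + 1, a[2], 2*a[1]]
B = [0, 0, 5]

# Fitness: negative squared distance between F(A) and B (higher is better)
def fitness(a):
    return -sum((v - b)**2 for v, b in zip(F(a), B))

# Mutation: add small Gaussian noise to each component
def mutate(a, sigma=0.1):
    return [x + random.gauss(0, sigma) for x in a]

# Initial population of random candidate vectors A
population = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(50)]

for generation in range(200):
    # Elitism: keep the fittest individuals as parents
    population.sort(key=fitness, reverse=True)
    parents = population[:10]
    # Refill the population with mutated copies of random parents
    population = parents + [mutate(random.choice(parents)) for _ in range(40)]

best = max(population, key=fitness)
# In practice, stop once -fitness(best) drops below a chosen threshold.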

Answer 3:

Provided that F(A) = B, where F and B are known and A remains unknown, you can start with a simple gradient search:

F(A) ~= F(C) + F'(C)*(A - C) ~= B

where F'(C) is the gradient of F evaluated at the point C (a 3×3 Jacobian matrix, since F is vector-valued). I'm assuming for now that you can calculate this gradient analytically.

So, you can choose a point C that you estimate is not very far from the solution and iterate:

import numpy as np

# F, Fprime (the Jacobian of F), B, C0 and someThreshold are assumed to be defined
C = C0
while True:
    # One step: solve F'(C) * (Ai - C) = B - F(C) for Ai
    Ai = np.linalg.solve(Fprime(C), np.asarray(B) - np.asarray(F(C))) + C
    convergence = np.linalg.norm(Ai - C)    # step size used as the convergence measure
    C = Ai
    if convergence < someThreshold:
        break

If the gradient of F() cannot be calculated analytically, you can estimate it. Let Ei, i = 1:3, be the orthonormal basis vectors; then

   Fi'(C) = (F(C+Ei*d) - F(C-Ei*d))/(2*d);
   F'(C) = [F1'(C) | F2'(C) | F3'(C)];

and d can be chosen as fixed or as a function of the convergence value.
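
A minimal Python sketch of this central-difference estimate, matching the loop above (the name Fprime and the use of NumPy are my choices; F is assumed to map a length-3 vector to a length-3 vector):

import numpy as np

def Fprime(C, d=1e-6):
    # Central-difference Jacobian estimate of F at C;
    # column i is Fi'(C) = (F(C + Ei*d) - F(C - Ei*d)) / (2*d)
    C = np.asarray(C, dtype=float)
    columns = []
    for i in range(3):
        Ei = np.zeros(3)
        Ei[i] = 1.0
        columns.append((np.asarray(F(C + Ei*d)) - np.asarray(F(C - Ei*d))) / (2*d))
    return np.column_stack(columns)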

These algorithms suffer from the usual problems of local extrema, regions with a null gradient, etc., so for this to work the starting point (C0) must not be very far from the solution, in a region where the function F() behaves monotonically.