Dynamic programming approximation

243 Views Asked by At

I am trying to calculate a function F(x,y) using dynamic programming. Functionally:

F(X,Y) = a1 F(X-1,Y)+ a2 F(X-2,Y) ... + ak F(X-k,Y) + b1 F(X,Y-1)+ b2 F(X,Y-2) ... + bk F(X,Y-k)

where k is a small number (k=10).

The problem is, X=1,000,000 and Y=1,000,000. So it is infeasible to calculate F(x,y) for every value between x=1..1000000 and y=1..1000000. Is there an approximate version of DP where I can avoid calculating F(x,y) for a large number of inputs and still get accurate estimate of F(X,Y).

A similar example is string matching algorithms (Levenshtein's distance) for two very long and similar strings (eg. similar DNA sequences). In such cases only the diagonal scores are important and the far-from-diagonal entries do not contribute to the final distance. How do we avoid calculating off-the-diagonal entries?

PS: Ignore the border cases (i.e. when x < k and y < k).

2

There are 2 best solutions below

0
On

Without knowing more about your specific problem, the general approach is to use a top-down dynamic programming algorithm and memoize the intermediate results. That way you will only calculate the values that will be actually used (while saving the result to avoid repeated calculations).

0
On

I'm not sure precisely how to adapt the following technique to your problem, but if you were working in just one dimension there is an O(k3 log n) algorithm for computing the nth term of the series. This is called a linear recurrence and can be solved using matrix math, of all things. The idea is to suppose that you have a recurrence defined as

  • F(1) = x_1
  • F(2) = x_2
  • ...
  • F(k) = x_k
  • F(n + k + 1) = c_1 F(n) + c_2 F(n + 1) + ... + c_k F(n + k)

For example, the Fibonacci sequence is defined as

  • F(0) = 0
  • F(1) = 1
  • F(n + 2) = 1 x F(n) + 1 x F(n + 1)

There is a way to view this computation as working on a matrix. Specifically, suppose that we have the vector x = (x_1, x_2, ..., x_k)^T. We want to find a matrix A such that

Ax = (x_2, x_3, ..., x_k, x_{k + 1})^T

That is, we begin with a vector of terms 1 ... k of the sequence, and then after multiplying by matrix A end up with a vector of terms 2 ... k + 1 of the sequence. If we then multiply that vector by A, we'd like to get

A(x_2, x_3, ..., x_k, x_{k + 1})^T = (x_3, x_4, ..., x_k, x_{k + 1}, x_{k + 2})

In short, given k consecutive terms of the series, multiplying that vector by A gives us the next term of the series.

The trick uses the fact that we can group the multiplications by A. For example, in the above case, we multiplied our original x by A to get x' (terms 2 ... k + 1), then multiplied x' by A to get x'' (terms 3 ... k + 2). However, we could have instead just multiplied x by A2 to get x'' as well, rather than doing two different matrix multiplications. More generally, if we want to get term n of the sequence, we can compute Anx, then inspect the appropriate element of the vector.

Here, we can use the fact that matrix multiplication is associative to compute An efficiently. Specifically, we can use the method of repeated squaring to compute An in a total of O(log n) matrix multiplications. If the matrix is k x k, then each multiplication takes time O(k3) for a total of O(k3 log n) work to compute the nth term.

So all that remains is actually finding this matrix A. Well, we know that we want to map from (x_1, x_2, ..., x_k) to (x_1, x_2, ..., x_k, x_{k + 1}), and we know that x_{k + 1} = c_1 x_1 + c_2 x_2 + ... + c_k x_k, so we get this matrix:

    | 0   1   0   0    ...   0 |
    | 0   0   1   0    ...   0 |
A = | 0   0   0   1    ...   0 |
    |        ...               |
    | c_1 c_2 c_3 c_4  ... c_k |

For more detail on this, see the Wikipedia entry on solving linear recurrences with linear algebra, or my own code that implements the above algorithm.

The only question now is how you adapt this to when you're working in multiple dimensions. It's certainly possible to do so by treating the computation of each row as its own linear recurrence, then to go one row at a time. More specifically, you can compute the nth term of the first k rows each in O(k3 log n) time, for a total of O(k4 log n) time to compute the first k rows. From that point forward, you can compute each successive row in terms of the previous row by reusing the old values. If there are n rows to compute, this gives an O(k4 n log n) algorithm for computing the final value that you care about. If this is small compared to the work you'd be doing before (O(n2 k2), I believe), then this may be an improvement. Since you're saying that n is on the order of one million and k is about ten, this does seem like it should be much faster than the naive approach.

That said, I wouldn't be surprised if there was a much faster way of solving this problem by not proceeding row by row and instead using a similar matrix trick in multiple dimensions.

Hope this helps!