I've used Dynamic Time Warping algorithms to produce distance matrices and find the differences between two arrays. However, is there an algorithm that can actually warp one array to be like the other (as an end result I'm looking to warp one video to be like another i.e. working with multidimensional arrays).
Context: I have many videos of people doing golf swings and I need to analyse them in terms of similarity. However, many of the swings are in slow motion whilst some are in real time. On top of this, the clips start and stop at varying times before and after the swing and each of the videos have the golfer standing in different positions in the frame itself.
However, for explanation's sake, take the following simple example:
a = [1, 2, 3]
b = [1, 1, 2, 2, 3, 3]
Becomes:
a = [1, 2, 3]
b = [1, 2, 3]
When b
is warped to be like a
.
Explanation: b
is just a time dilated version of a
e.g. a slow motion version of a video
Here's another example:
a = [1, 2, 3]
b = [3, 4, 6]
Becomes:
a = [1, 2, 3]
b = [1, 2, 4]
When b
is warped to be like a
.
Explanation: b
is just a time translated version of a
with a bit of error e.g. a video taken from a lower height and angle
My end goal would be an algorithm that could combine both the dilation and translation into one, if that's possible
If I remember correctly, dynamic time warping is a dynamic programming algorithm. As such, if you imagine the algorithm as running on a matrix, where the rows are the letters of string a and the columns are the letters of string b, the scores are computed one cell at a time row by row through the matrix.
If in addition to the score, you store at each cell a pointer going back to the cell that score is derived from, then you can use back propagation through the distance matrix to reconstruct a path that gives the best score.
Back propagation just finds a path through those pointers back from the end cell of the matrix to the start cell with depth first search.
To map one string to another, you look at the directions those pointers take, horizontal ones add to one string, vertical strings remove from that string (add to the other) (alternately, both horizontal and vertical add to one string + remove from the other), and diagonals match the letters in both strings.
This is a very standard approach for dynamic programming algorithms in general, so I bet it can be tailored to work for dynamic time warping.