How does the Ableton warp algorithm work exactly?


I'm looking for any documentation or definitive information on Ableton's warp feature. I understand that it has something to do with finding transients, aligning them with an even rhythm and shifting audio samples accordingly. I'm hoping to find ways to approximate warping with more basic audio editing tools.

I understand that this is Ableton's proprietary feature, but really any information about how it works would be helpful.

So...does anyone have any 411?


3 Answers


Here's a simple version of such an algorithm implemented in Max/MSP, open source:

http://cycling74.com/toolbox/kneppers-granular-stretcher/
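For anyone without Max/MSP, here is a rough sketch of the same granular idea in plain Python with numpy (my own illustration, not a port of that patch, and the grain and hop sizes are arbitrary): chop the input into short overlapping grains, then overlap-add them onto the output timeline with a different spacing, so the duration changes while each grain keeps its original pitch.

```python
import numpy as np

def granular_stretch(x, stretch=1.5, grain_size=2048, hop=512):
    """Very basic granular time-stretch.

    x:        mono signal as a 1-D numpy array
    stretch:  >1 makes the output longer (slower), <1 shorter (faster)
    grain_size, hop: grain length and analysis hop, in samples
    """
    window = np.hanning(grain_size)
    out_len = int(len(x) * stretch) + grain_size
    out = np.zeros(out_len)
    norm = np.zeros(out_len)

    out_hop = int(hop * stretch)              # grains are re-spaced on the output
    n_grains = (len(x) - grain_size) // hop

    for i in range(n_grains):
        grain = x[i * hop : i * hop + grain_size] * window
        pos = i * out_hop
        out[pos : pos + grain_size] += grain  # overlap-add onto the new timeline
        norm[pos : pos + grain_size] += window

    norm[norm < 1e-8] = 1.0                   # avoid division by zero at the edges
    return out / norm
```

A plain granular/overlap-add stretch like this smears transients and can sound phasey, which is exactly why the more elaborate algorithms exist.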


The auto-warp feature in Ableton Live basically consists of two processing steps: detecting beats with an automatic beat-detection algorithm, and then dynamically changing the tempo according to that beat information.

For the tempo detection, they licensed an older version of zplane aufTAKT.

Ableton Live offers several algorithms for time-stretching. Most of them work in the time domain (compare overlap-and-add (OLA) algorithms). Two of them, "Complex" and "Complex Pro", are licensed from zplane as well (compare the zplane élastique algorithms); they are not time-domain algorithms. To learn more about frequency-domain algorithms, "phase vocoder" is the best term to start googling. An excellent introduction to the theory of time stretching and pitch shifting can be found in Zölzer's DAFX book.
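If you just want to approximate that two-step pipeline with open tools, something like the sketch below gets you part of the way. To be clear about the substitutions: librosa's beat tracker is not aufTAKT, librosa.effects.time_stretch is a plain phase vocoder rather than élastique, and the function name auto_warp and the target_bpm parameter are just my own choices for the example.

```python
import numpy as np
import librosa

def auto_warp(path, target_bpm=120.0):
    """Crude 'auto-warp': detect beats, then stretch every inter-beat
    segment so the beats land exactly on a target_bpm grid."""
    y, sr = librosa.load(path, sr=None, mono=True)

    # Step 1: beat detection (Live uses a licensed zplane detector instead)
    _, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
    beat_samples = librosa.frames_to_samples(beat_frames)

    target_len = int(sr * 60.0 / target_bpm)   # samples per beat on the grid
    out = []
    for start, end in zip(beat_samples[:-1], beat_samples[1:]):
        seg = y[start:end]
        if len(seg) < 2:
            continue
        # Step 2: time-stretch the segment (phase vocoder) to the grid length
        rate = len(seg) / target_len            # >1 speeds up, <1 slows down
        out.append(librosa.effects.time_stretch(seg, rate=rate))
    return np.concatenate(out), sr
```

Each inter-beat segment is stretched independently, so every detected beat lands exactly on the target tempo grid.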


"Warping" the audio is to be able to change the speed of it without changing the pitch. Ableton Live has a handful of algorithms to do this, each optimized for different types of content. I'll explain how it works from a generic level.

Audio is usually captured and quantized as samples: the sound pressure level is measured at regular, very short intervals, and each measurement (a sample) is stored and played back at the same rapid rate (44.1 kHz for CD audio). This means the audio signal is represented in the time domain.

If we simply speed up something recorded in the time domain, we change its pitch as well, since playback speed and frequency are directly linked. What we need to do is convert the audio from the time domain into the frequency domain. That is, rather than storing the momentary pressure level at each sample, we instead capture which frequencies are present.
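A quick numpy experiment makes that coupling concrete (just an illustration; the 440 Hz tone, the 1.5x factor and the peak_hz helper are my own choices): resample a time-domain signal so it plays faster at the original sample rate, and every frequency in it rises by the same factor.

```python
import numpy as np

sr = 44100                          # samples per second (CD rate)
t = np.arange(sr) / sr              # one second of time stamps
tone = np.sin(2 * np.pi * 440 * t)  # a 440 Hz sine, sampled in the time domain

# "Speed up" naively by reading through the samples 1.5x faster
speed = 1.5
idx = np.arange(0, len(tone), speed)
fast = np.interp(idx, np.arange(len(tone)), tone)

# The result is shorter *and* higher in pitch: measure the dominant frequency.
def peak_hz(x, sr):
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    return np.fft.rfftfreq(len(x), 1 / sr)[np.argmax(spec)]

print(peak_hz(tone, sr))   # ~440 Hz
print(peak_hz(fast, sr))   # ~660 Hz: speed and pitch moved together
```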

To convert to the frequency domain, we first chop the signal into short analysis windows, typically around 10 ms or more of audio each (so we go from tens of thousands of values per second down to a far lower frame rate). Each window is long enough to run a Fourier transform (usually implemented as an FFT) on it and get fairly useful results. Lower frequencies are poorly captured (they don't fit within such a short window well), so various algorithms are used to recover them, usually by looking at neighbouring windows.
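In code, that "short windows plus FFT" step is a short-time Fourier transform (STFT). A bare-bones, analysis-only version might look like this (a sketch under simple assumptions: Hann windows, magnitudes only, and no attempt at the low-frequency fix-ups mentioned above):

```python
import numpy as np

def stft(x, win_size=1024, hop=256, sr=44100):
    """Slice x into overlapping windows and FFT each one.

    Returns (frequencies_in_hz, magnitudes), where magnitudes has shape
    (num_windows, win_size // 2 + 1): one frequency snapshot per window.
    At 44.1 kHz a 1024-sample window covers roughly 23 ms of audio."""
    window = np.hanning(win_size)
    frames = []
    for start in range(0, len(x) - win_size, hop):
        frame = x[start : start + win_size] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    freqs = np.fft.rfftfreq(win_size, 1 / sr)
    return freqs, np.array(frames)
```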

Anyway, what we end up with is a series of windows, each holding a little snapshot of the frequencies present within it. To speed up the audio, we play each window back for a shorter time; to slow it down, we play each window back for a longer time.
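Putting the pieces together, the classic way to do that play-each-window-for-a-different-time trick is a phase vocoder: analyse with one hop size, resynthesize with another, and advance each bin's phase so frequencies stay continuous across windows. Below is a stripped-down sketch of that idea (my own code, not Ableton's, and the window and hop sizes are arbitrary):

```python
import numpy as np

def phase_vocoder_stretch(x, stretch=1.5, win_size=2048, hop=512):
    """Time-stretch x by 'stretch' without changing its pitch.

    Windows are read every `hop` samples but written out every
    `hop * stretch` samples; phases are advanced so each frequency bin
    stays coherent from one window to the next."""
    window = np.hanning(win_size)
    syn_hop = int(hop * stretch)
    bin_freqs = 2 * np.pi * np.arange(win_size // 2 + 1) / win_size  # rad/sample

    n_frames = (len(x) - win_size) // hop
    out = np.zeros(n_frames * syn_hop + win_size)
    norm = np.zeros_like(out)

    prev_phase = np.zeros(win_size // 2 + 1)
    acc_phase = np.zeros(win_size // 2 + 1)

    for i in range(n_frames):
        spec = np.fft.rfft(x[i * hop : i * hop + win_size] * window)
        mag, phase = np.abs(spec), np.angle(spec)

        # Estimate each bin's true frequency from the phase change since the
        # previous analysis window, then re-advance the phase by the output hop.
        delta = phase - prev_phase - bin_freqs * hop
        delta = (delta + np.pi) % (2 * np.pi) - np.pi        # wrap to [-pi, pi)
        true_freq = bin_freqs + delta / hop
        acc_phase += true_freq * syn_hop
        prev_phase = phase

        frame = np.fft.irfft(mag * np.exp(1j * acc_phase)) * window
        pos = i * syn_hop
        out[pos : pos + win_size] += frame                   # overlap-add
        norm[pos : pos + win_size] += window ** 2

    norm[norm < 1e-8] = 1.0
    return out / norm
```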

There are also a lot of refinements to this method that make things sound better, but this is how it works in general.

Also note that MP3 encoding is built on a similar time-to-frequency transformation, although it is used for compression rather than time stretching.