requestAnimationFrame [now] vs performance.now() time discrepancy

3.9k Views Asked by At

Assumptions: rAF now time is calculated at the time the set of callbacks are all triggered. Therefore any blocking that happens before the first callback of that frame is called doesn't affect the rAF now and it's accurate--at least for that first callback.

Any performance.now() measurements made before a rAF set is triggered should be earlier than rAF now.

Test: Record before (a baseline time before anything happens). Set the next rAF. Compare rAF now and actual performance.now() to before to see how different they are.

Expected results:

var before = performance.now(), frames = ["with blocking", "with no blocking"], calls = 0;
requestAnimationFrame(function frame(rAFnow) {
  var actual = performance.now();
  console.log("frame " + (calls + 1) + " " + frames[calls] + ":");
  console.log("before frame -> rAF now: " + (rAFnow - before));
  console.log("before frame -> rAF actual: " + (actual - before));

  if (++calls < frames.length) { before = actual; requestAnimationFrame(frame); }
});

// blocking
for (var i = 0, l = 0; i < 10000000; i++) {
    l += i;
}

Observations: When there is blocking before the frame starts, the rAF now time is at times incorrect, even for that first frame. Sometimes the first frame's now is actually an earlier time than the recorded before time.

Whether there is blocking happening before the frame or not, every so often the in-frame time rAFnow will be earlier than the pre-frame time before--even when I setup the rAF after I take my first measurement. This can also happen without any blocking whatsoever, though this is rarer.

(I get a timing error on the first blocking frame most of the time. Getting an issue on the others is rarer, but does happen occasionally if you try running it a few times.)

With more extensive testing, I've found bad times with blocking prior to callback: 1% from 100 frames, no blocking: 0.21645021645021645% from ~400 frames, seemingly caused by opening a window or some other potentially CPU-intensive action by the user.

So it's fairly rare, but the problem is this shouldn't happen at all. If you want to do useful things with them, simulating time, animation, etc., then you need those times to make sense.

I've taken into account what people have said, but maybe I am still not understanding how things work. If this is all per-spec, I'd love some psuedo-code to solidify it in my mind.

And more importantly, if anyone has any suggestions for how I could get around these issues, that would be awesome. The only thing I can think of is taking my own performance.now() measurement every frame and using that--but it seems a bit wasteful, having it effectively run twice every frame, on top of any triggered events and so on.

4

There are 4 best solutions below

0
On

I encountered the same issue on chrome, where calls to performance.now () would return a higher value than the now value passed into subsequent callbacks made by window.requestAnimationFrame ()

My workaround was to set the before using the now passed to the callback in the first window.requestAnimationFrame () rather than performance.now (). It seems that using only one of the two functions to measure time guarantees progressing values.

I hope this helps anyone else suffering through this bug.

1
On

General description

I did a in-depth investigation of the issue that requestAnimationFrame can report time earlier than previously invoked performance.now(), even with high-precision timers (when COOP/COEP is enabled). For example, when your app is under heavy CPU load, the following code:

let prev = 0
function onFrame(t) {
    if (t < prev) {
        console.log("FAIL!", t - prev)
    }
    prev = performance.now() 
    window.requestAnimationFrame(onFrame)
}
window.requestAnimationFrame(onFrame)

Can report:

FAIL! -1.4840001144402777
FAIL! -0.6779997711182659
FAIL! -1.3689998474110325
FAIL! -0.5829999618545116
FAIL! -0.4240000762947602
FAIL! -1.018999790190719
FAIL! -0.056999980926775606
FAIL! -5.993999904632801
FAIL! -0.26899996185238706
FAIL! -0.9029996757490153
FAIL! -1.6019999237068987
....

It looks, like this is the CORRECT behavior, at least in Chrome. Handling of events, animation frames, and scheduling tasks is an implementation detail of every browser. We did an in-depth analysis of how Chrome handles these things. Other browsers' behavior might differ.

Events scheduling in Chrome

The following graph shows an example order of tasks per frame with the assumptions that all the frame tasks can be executed within that frame and the application runs with native FPS.

             ║    Frame 1     ║    Frame 2     ║    Frame 3     ║ ...
 Idle time   ║█            ███║██         █████║████           █║ ...
 Other tasks ║ █          █   ║  █       █     ║    █        █  ║ ...
 I/O Events  ║  ██   █████    ║      ████      ║     █   ████   ║ ...
 RAF         ║    ███         ║   ███          ║      ███       ║ ...
 GPU         ║      ████      ║     ███        ║        ██████  ║ ...

A few things should be noted here:

  • Frame start idle time. At the beginning of almost every frame, there will be an idle time of up to a few milliseconds (we've observed idle times between 0 ms and 4 ms). This time is likely caused by frames being synchronized with the screen refresh rate, while the tasks are performed by the Chrome task scheduler, which is not fired at the synchronization points. To learn more, see this link. It is not clear whether Chrome considers this a bug or if they will ever fix it.

  • Other tasks. These are Chrome-internal tasks. For example, before every animation frame, Chrome performs some preparation work, which in most cases is very short (we've observed times between 20 μs and 40 μs). At the end of the frame, Chrome performs pre-paint, layout, layerize, and commit jobs, which, in the case of a WebGL applications as ours (when no HTML elements are on the stage), are also short, ranging between 0.1 ms and 0.5 ms.

  • I/O Events. Events such as mouse down/up/move can be scheduled before or after the RAF callback. Some of these events might originate in the previous frame, and their handling might be scheduled for the next frame by Chrome.

  • RAF. The RAF (Request Animation Frame) callback, which in case of a WebGL application would trigger per-frame functions, such as animations, display object layout recomputation, WebGL buffer updates, and invoking WebGL draw calls.

  • GPU. The GPU-related work starts at the end of the RAF callback, as soon as the first WebGL calls are issued. It is performed in parallel with the rest of the RAF callback, I/O events, and other tasks performed by Chrome.

In case the total time of all the per-frame tasks is longer than the frame duration with the native refresh rate, some frames might be skipped, and a very interesting time report behavior can be observed. Let's consider the following events graph:

             ║ Real frame 1 work  ┆          Real frame 2 work         ┆        ...
             ║    Frame 1     ║   ┆Frame 2     ║ Frame 3 (skip) ║    Frame 4     ║  ...
 Idle time   ║███             ║   ┆            ║                ║      ┆         ║  ...
 Other tasks ║   █            ║  █┆            ║                ║     █┆         ║  ...
 RAF         ║    ██████████  ║   ┆  ██████████████████████████████    ┆ ██████████ ...
 I/O Events  ║              █████ ┆██          ║                ║  ██  ┆█        ║  ...
 GPU         ║              ███   ┆            ║                ║     █┆         ║  ...

A few things should be noted here:

  • Frame start idle time. As previously mentioned, we can observe an idle time at the beginning of the first frame. The Chrome task scheduler rendered the previous frame in time, so there was no pressure to start the next tasks immediately. However, there is no idle time present in the next frames as their computations took longer than the native frame duration.

  • Frame times. Even if the real computation time can exceed the time of a few frames issued with the native refresh rate, the frames are always scheduled with the native refresh rate time. Thus, some of these frames might be skipped, like frame 3 in the example above.

  • RAF start time. When the RAF callback is called, it is provided with the frame start time. However, this time might be smaller than the time reported by performance.now(). For example, in the graph above, the third call to the RAF callback will report the beginning time of frame 4, even though it was performed in the middle of frame 4. If we measured the time at the end of the second RAF callback with performance.now(), this time would be bigger than the time reported by the third RAF callback.

0
On

Assumptions: rAF now time is calculated at the time the set of callbacks are all triggered.

That's the issue. Even if it's what the specs ask for, it's not what browsers do. In systems supporting VSync, now will be the time when the monitor sent its last VSync open signal. It is not linked to the CPU nor to your JS execution.

The browser will assume that until it hasn't received a new VSync signal, it still has time to send new data to the compositor, which will actually send it to the monitor the next frame. Basically, it has a one frame buffer1.
So in the case of a long frame, the one where you do block the JS thread, it will try to produce the next frame fast enough so that both frames can be passed to the monitor in time.

For instance if we take the following diagram,

VSync open signals      :  |-----------------|-----------------|-----------------|----------
JS callbacks            :  |long-frame------------|short-frame |next-frame----   |...
Presentation to monitor :  [previous frame  ][long frame      ][short frame     ][next frame  

we can see that the short frame would still have time to be presented to the monitor in its own presentation frame, and that the next frame would even have caught up with the delay. But the now passed in short frame would indeed report a time before the end of long frame.

If short frame takes too long to be rendered in its own presentation frame, it will take the slot of next frame, and its own presentation frame would be skipped.

1. There are some APIs that are able to bypass the compositor and thus this one frame buffer, for instance canvas2D has a desynchronized option, but this is only available in some systems.

13
On

Please see the more recent answers below for some very detailed analysis of where the timer values come from.


The timestamp passed in to the `requestAnimationFrame()` callback is the time of the beginning of the animation frame. Multiple callbacks being invoked during the same frame all receive the same timestamp. Thus, it would be really weird if `performance.now()` returned a time *before* the parameter value, but not really weird for it to be *after* that.

Here's the relevant specification:

When the user agent is to run the animation frame callbacks for a Document document with a timestamp now, it must run the following steps:

  1. If the value returned by the document object’s hidden attribute is true, abort these steps. [PAGE-VISIBILITY]

  2. Let callbacks be a list of the entries in document’s list of animation frame callbacks, in the order in which they were added to the list.

  3. Set document’s list of animation frame callbacks to the empty list.

  4. For each entry in callbacks, in order: invoke the Web IDL callback function, passing now as the only argument, and if an exception is thrown, report the exception.

So you've registered a callback (let's say just one) for the next animation frame. Tick tick tick, BOOM, time for that animation frame to happen:

  1. The JavaScript runtime makes a note of the time and labels that now.
  2. The runtime makes a temporary copy of the list of registered animation frame callbacks, and clears the actual list (so that they're not accidentally invoked if things take so long that the next animation frame comes around).
  3. The list has just one thing in it: your callback. The system invokes that with now as the parameter.
  4. Your callback starts running. Maybe it's never been run before, so the JavaScript optimizer may need to do some work. Or maybe the operating system switches threads to some other system process, like starting up a disk buffer flush or handling some network traffic, or any of dozens of other things.
  5. Oh right, your callback. The browser gets the CPU again and your callback code starts running.
  6. Your code calls performance.now() and compares it to the now value passed in as a parameter.

Because a brief but non-ignorable amount of time may pass between step 1 and step 6, the return value from performance.now() may indicate that a couple of microseconds, or even more than a couple, have elapsed. That is perfectly normal behavior.