Why does CPU time usually have larger fluctuations than real time in benchmarks?


When examining the output of my benchmarks with the Google Benchmark framework, I observed that the standard deviation of the measured CPU time was in many cases significantly larger than the standard deviation of the measured real time.

Why is that? Or is this result due to measurement error? I am quite surprised, because I expected the CPU time to be the more reproducible of the two.

This is a general observation on my system. I nevertheless provide a simple example:

#include <benchmark/benchmark.h>
#include <cmath>

static void BM_SineEvaluation(benchmark::State& state)
{
    for (auto _ : state)                     // timed loop driven by the framework
    {
        double y = 1.0;
        for (size_t i = 0; i < 100; ++i)
        {
            // sin(y)^2 + cos(y)^2 == 1, so the multiplication barely changes y
            y *= std::sin(y) * std::sin(y) + std::cos(y) * std::cos(y);
            y += std::sin(std::cos(y));
        }
        benchmark::DoNotOptimize(y);         // keep y observable to the optimizer
    }
}

BENCHMARK(BM_SineEvaluation);

The example does not even contain heap allocations, and the compiler does not optimize any of the sin/cos calls away. That is all of the code. The time measurements are done entirely inside the Google Benchmark library, which is openly available on GitHub, but I have not looked into its implementation so far.
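To make explicit what I mean by the two columns: real ("wall-clock") time and CPU time come from different clock sources. The following is only a minimal sketch of one way to sample both with standard C++ facilities, not how Google Benchmark actually implements its timers (on Windows, process CPU time is typically obtained through dedicated APIs such as GetProcessTimes, and the resolutions of the two sources can differ considerably):

#include <chrono>
#include <cstdio>
#include <ctime>

// Sketch only: "real" time from a monotonic wall clock, "CPU" time from the
// C standard's process-time clock. This is NOT Google Benchmark's code.
int main()
{
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    volatile double y = 1.0;                 // some busy work to time
    for (long i = 0; i < 10000000; ++i)
        y = y * 1.0000001;

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    const double real_ms =
        std::chrono::duration<double, std::milli>(wall_end - wall_start).count();
    const double cpu_ms =
        1000.0 * static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;

    std::printf("real: %.3f ms, cpu: %.3f ms\n", real_ms, cpu_ms);
    return 0;
}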

When running the program with command-line arguments --benchmark_repetitions=50 --benchmark_report_aggregates_only=true, I get an output like this:

----------------------------------------------------------------
Benchmark                         Time           CPU Iterations
----------------------------------------------------------------
BM_SineEvaluation_mean        11268 ns      11270 ns      64000
BM_SineEvaluation_median      11265 ns      11230 ns      64000
BM_SineEvaluation_stddev         11 ns         90 ns      64000

I am using Google Benchmark v1.4.1 on a really old Intel Core i7 920 (Bloomfield) with Microsoft (R) C/C++ Optimizing Compiler Version 19.00.24218.1 for x86 (Visual Studio 2015) with /O2.


Edit: I did further measurements on Fedora 28 with an Intel i5-4300U CPU and gcc 8.1.1 (which is smart enough to call sincos with -O2) and found the opposite behavior:

----------------------------------------------------------------
Benchmark                         Time           CPU Iterations
----------------------------------------------------------------
BM_SineEvaluation_mean        54642 ns      54556 ns      12350
BM_SineEvaluation_median      54305 ns      54229 ns      12350
BM_SineEvaluation_stddev        946 ns        888 ns      12350 
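For reference, this is the kind of call pattern the sincos remark refers to; whether the two libm calls really get merged into a single sincos-style call depends on the compiler and flags, so treat it only as an illustration:

#include <cmath>

// The benchmark body evaluates sin and cos of the same argument. A compiler
// may combine the two libm calls into one sincos call (as the gcc build above
// apparently does at -O2), while other builds emit separate sin and cos calls.
double sin2_plus_cos2(double x)
{
    return std::sin(x) * std::sin(x) + std::cos(x) * std::cos(x);
}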

When omitting -O2 (which is closer to the MSVC build because it keeps separate sin/cos calls), I still get the same qualitative result: the standard deviation of the real time is also larger than the standard deviation of the CPU time.

I am not quite sure what conclusion to draw from that. Does this mean that the time measurements on Windows are just less precise?
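If it helps, here is a small probe I could run on each machine to see how coarse the two clocks actually are (again just a sketch with standard C++ facilities, not something I have measured yet): it spins until each clock reports a new value and prints the smallest increment it observed. A very coarse CPU-time clock would quantize the per-repetition CPU times and could inflate their computed standard deviation.

#include <chrono>
#include <cstdio>
#include <ctime>

// Probe the apparent tick size of std::clock() (CPU time) and of
// std::chrono::steady_clock (real time) by spinning until each one advances.
int main()
{
    std::clock_t c0 = std::clock();
    std::clock_t c1 = c0;
    while (c1 == c0)                          // busy-wait, so CPU time advances too
        c1 = std::clock();
    std::printf("std::clock tick:   %.3f ms\n",
                1000.0 * static_cast<double>(c1 - c0) / CLOCKS_PER_SEC);

    auto w0 = std::chrono::steady_clock::now();
    auto w1 = w0;
    while (w1 == w0)
        w1 = std::chrono::steady_clock::now();
    std::printf("steady_clock tick: %lld ns\n",
                static_cast<long long>(
                    std::chrono::duration_cast<std::chrono::nanoseconds>(w1 - w0).count()));
    return 0;
}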
