I'm using Google/benchmark for a project, and I just started playing around with the --benchmark_perf_counters
flag. Clearly I'm doing something wrong, since the perf counters are often negative. I'm assuming it's an issue w/ overflow, but I still don't quite understand how the counters work to begin with.
For example, how is CACHE-MISSES 0 on the first benchmark, and then -372k on the second? Neither of those values make sense to me.
(the two benchmarks have very similar parameters and runtime)
I'm running on Ubuntu 18.04 w/ an Intel(R) Xeon(R) Gold 6138 CPU. Google benchmark version is 1.6.1 and I have libpfm4-dev
installed. I'm calling my benchmark binary w/ --benchmark_perf_counters=CYCLES,INSTRUCTIONS,CACH-MISSES
-----------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
-----------------------------------------------------------------------------------------------------
bit::shift_left (small) (AA) 3.15 ns 3.15 ns 221185726 CACHE-MISSES=0 CYCLES=11.0005 INSTRUCTIONS=15
bit::shift_left (small) (UU) 2.65 ns 2.65 ns 254254663 CACHE-MISSES=-372.709k CYCLES=553.131k INSTRUCTIONS=372.709k
boost::shift_left (small) (AA) 2.71 ns 2.71 ns 258007443 CACHE-MISSES=-367.288k CYCLES=-367.288k INSTRUCTIONS=3.87586n
std::shift_left (small) 23.5 ns 23.5 ns 29812478 CACHE-MISSES=-3.17853M CYCLES=-102.703 INSTRUCTIONS=-972.747n
I had this same problem! Upgrading to the latest version of Google Benchmark (1.7.0) fixed it and now my perf counter results make sense.