Why are the relative performance results in Google Benchmark completely different from raw loops?

I used Google Benchmark in the following way:

struct MyFixture : public benchmark::Fixture {

  void SetUp(const ::benchmark::State& state) {
      // do setup
  }

  void TearDown(const ::benchmark::State& state) {
  }
};


BENCHMARK_DEFINE_F(MyFixture, Test1)(benchmark::State& st) {

    for (auto _ : st) {
        //algorithm 1;
    }
}
BENCHMARK_REGISTER_F(MyFixture, Test1)->Arg(8);
BENCHMARK_DEFINE_F(MyFixture, Test2)(benchmark::State& st) {

    for (auto _ : st) {
        //algorithm 2
    }
}
BENCHMARK_REGISTER_F(MyFixture, Test2)->Arg(8);

I then wrote a raw loop in the following way:

#include <chrono>

struct MyFixture {

  void SetUp(int n = 8) {
      // do setup
  }

  void TearDown() {
  }
};

int main() {
   double totalCount = 0;

   for (int i = 0; i < 1000000; i++) {
       MyFixture f;
       f.SetUp(8);
       
       auto start = std::chrono::high_resolution_clock::now();
       //algorithm 1
       auto end = std::chrono::high_resolution_clock::now();
       totalCount += std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
   }

   // print totalCount

   totalCount = 0;
   for (int i = 0; i < 1000000; i++) {
       MyFixture f;
       f.SetUp(8);
       
       auto start = std::chrono::high_resolution_clock::now();
       //algorithm 2
       auto end = std::chrono::high_resolution_clock::now();
       totalCount += std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
   }

   // print totalCount

   return 0;
}

The results from Google Benchmark and from the raw loops are completely different.

In Google Benchmark, algorithm 1 is 8 times faster than algorithm 2.

However, in raw loops, algorithm 1 is 3 times slower than algorithm 2.

What are the possible reasons for this? Which result should I trust?

I should trust the raw-loop version, right? If so, what could be wrong with the Google Benchmark version? (Or am I using Google Benchmark correctly?)

Thanks.
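
One structural difference between the two harnesses may be worth checking. The raw loop reads the clock twice per iteration and runs SetUp before every sample, while Google Benchmark times the whole `for (auto _ : st)` loop and runs the fixture's SetUp outside that loop. The sketch below contrasts the two timing styles using only the standard library. The function `work` and the globals `g_input` and `g_sink` are hypothetical stand-ins for the real algorithm, not part of either harness.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for "algorithm 1": sums the integers 0..n-1.
std::uint64_t work(int n) {
    std::uint64_t sum = 0;
    for (int i = 0; i < n; ++i) sum += static_cast<std::uint64_t>(i);
    return sum;
}

// Volatile input and output (assumed names) keep the optimizer from
// hoisting the call out of the loop or deleting it altogether.
volatile int g_input = 8;
volatile std::uint64_t g_sink = 0;

// Raw-loop style: two clock reads per iteration, so the cost of reading
// the clock is part of every sample.
double perIterationNs(int iterations) {
    using clock = std::chrono::steady_clock;
    double totalNs = 0.0;
    for (int i = 0; i < iterations; ++i) {
        auto start = clock::now();
        g_sink = work(g_input);
        auto end = clock::now();
        totalNs += std::chrono::duration<double, std::nano>(end - start).count();
    }
    return totalNs / iterations;
}

// Batched style: two clock reads around the whole loop, so the clock
// cost is spread over all iterations.
double batchedNs(int iterations) {
    using clock = std::chrono::steady_clock;
    auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        g_sink = work(g_input);
    }
    auto end = clock::now();
    return std::chrono::duration<double, std::nano>(end - start).count() / iterations;
}
```

Calling `perIterationNs(1000000)` and `batchedNs(1000000)` on the same machine gives a rough idea of how much of each raw-loop sample is clock overhead. If the algorithms run in only a few nanoseconds, that overhead can dominate the raw-loop numbers. Separately, if the algorithms modify the state built in SetUp, the Google Benchmark loop reuses that state across iterations while the raw loop starts fresh each time, so the two harnesses may not be measuring the same work.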
