Benchmark function with input that doesn't implement the Copy trait


I am running benchmarks with Criterion but am facing issues with functions that have an input that does not implement the Copy trait.

For example, I have set up the following benchmark for a function with the signature pub fn hash(vector: Vec<&str>) -> u64.

pub fn criterion_benchmark(c: &mut Criterion) {
    let s: String = String::from("Hello World!");
    let tokens: Vec<&str> = hashing::tokenize(&s);
    c.bench_function(
        "hash",
        |b| b.iter(|| {
            hashing::hash(tokens)
        }),
    );
}

However, unlike with types that implement the Copy trait, the compiler reports the following ownership error.

error[E0507]: cannot move out of `tokens`, a captured variable in an `FnMut` closure
  --> benches/benchmark.rs:17:34
   |
13 |     let tokens: Vec<&str> = hashing::tokenize(&s);
   |         ------ captured outer variable
...
17 |             hashing::hash(tokens)
   |                                  ^^^^^^ move occurs because `tokens` has type `Vec<&str>`, which does not implement the `Copy` trait

error[E0507]: cannot move out of `tokens`, a captured variable in an `FnMut` closure
  --> benches/benchmark.rs:16:20
   |
13 |     let tokens: Vec<&str> = hashing::tokenize(&s);
   |         ------ captured outer variable
...
16 |         |b| b.iter(|| {
   |                    ^^ move out of `tokens` occurs here
17 |             hashing::hash(tokens)
   |                                  ------
   |                                  |
   |                                  move occurs because `tokens` has type `Vec<&str>`, which does not implement the `Copy` trait
   |                                  move occurs due to use in closure

How can non-copyable inputs be passed to the benchmarked function without running into ownership issues?

There are 2 answers below.

bergwald (best answer):

As suggested by @Stargateur, cloning the parameter solved the ownership issue.

pub fn criterion_benchmark(c: &mut Criterion) {
    let s: String = String::from("Hello World!");
    let tokens: Vec<&str> = hashing::tokenize(&s);
    c.bench_function(
        "hash",
        |b| b.iter(|| {
            hashing::hash(tokens.clone())
        }),
    );
}
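
The reason the clone is needed can be shown without Criterion. The sketch below assumes a hypothetical `run_many` helper standing in for `b.iter` (which calls its closure many times) and a hypothetical `total_len` function standing in for `hashing::hash`; neither name comes from the original code.

```rust
/// Hypothetical stand-in for `Bencher::iter`: calls `routine` repeatedly.
fn run_many<O>(iterations: usize, mut routine: impl FnMut() -> O) -> Vec<O> {
    (0..iterations).map(|_| routine()).collect()
}

/// Hypothetical function that takes ownership of its input, like `hashing::hash`.
fn total_len(tokens: Vec<&str>) -> usize {
    tokens.iter().map(|t| t.len()).sum()
}

fn lengths_with_clone(iterations: usize) -> Vec<usize> {
    let tokens: Vec<&str> = vec!["Hello", "World!"];
    // Writing `total_len(tokens)` here fails with E0507: the first call would
    // move `tokens` out of the closure and leave nothing for the second call.
    // Cloning hands each call its own Vec and keeps the original in place.
    run_many(iterations, || total_len(tokens.clone()))
}
```

Because the closure may run any number of times, it can only hand out copies of, or references to, the values it captured.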

However, as proposed by @DenysSéguret and @Masklinn, changing the hash function to accept &[&str] avoids the ~50% overhead of cloning the vector.
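
A minimal sketch of the slice-based signature follows. The body of the original `hash` function is not shown in the question, so the `DefaultHasher`-based body here is only an assumption for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Borrows the tokens instead of taking ownership of the vector.
/// The hashing logic is a placeholder, not the original implementation.
pub fn hash(tokens: &[&str]) -> u64 {
    let mut hasher = DefaultHasher::new();
    tokens.hash(&mut hasher);
    hasher.finish()
}
```

With this signature the benchmark closure can call `hashing::hash(&tokens)` on every iteration, since a `&Vec<&str>` coerces to `&[&str]` and borrowing does not move `tokens`.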

Svetlin Zarev:

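Cloning the input inside the timed closure adds the cost of the clone to every measurement, which can significantly skew the benchmark result.

Therefore you should use iter_batched() instead of iter(). Its setup closure builds a fresh input outside the timed section, so only the routine that consumes the input is measured.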

use criterion::{black_box, criterion_group, criterion_main, BatchSize, Criterion};
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

fn criterion_benchmark(c: &mut Criterion) {
    let input_data: String = String::from("Hello World!");

    // Input is built in the setup closure, outside the timed section.
    c.bench_function("bench_function", |bencher| {
        bencher.iter_batched(
            || init_data(&input_data),
            |input| {
                let x = benchmark_me(input);
                black_box(x);
            },
            BatchSize::SmallInput,
        );
    });

    // Input is rebuilt inside the timed closure, so its cost is measured too.
    c.bench_function("bench_function+clone", |bencher| {
        bencher.iter(|| {
            let x = benchmark_me(init_data(&input_data));
            black_box(x);
        });
    });
}

fn init_data(s: &str) -> Vec<&str> {
    // it's intentionally slower than a plain copy, to make the difference more visible!
    s.split_ascii_whitespace().collect()
}

fn benchmark_me(s: Vec<&str>) -> u64 {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    hasher.finish()
}

Results:

bench_function                  time:   [99.520 ns 100.90 ns 102.23 ns]
bench_function+clone            time:   [210.41 ns 212.08 ns 213.77 ns]
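
The gap between the two results comes from what sits inside the timed region. The following std-only sketch illustrates the idea; it is not Criterion's implementation, and the names `time_with_setup` and `time_routine_only` are hypothetical.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Times setup and routine together, as happens when the input is built
/// (or cloned) inside the closure passed to `iter`.
fn time_with_setup<I, O>(
    n: usize,
    mut setup: impl FnMut() -> I,
    mut routine: impl FnMut(I) -> O,
) -> Duration {
    let start = Instant::now();
    for _ in 0..n {
        black_box(routine(setup()));
    }
    start.elapsed()
}

/// Builds every input first, then times only the routine. This is roughly
/// the idea behind `iter_batched`, greatly simplified.
fn time_routine_only<I, O>(
    n: usize,
    mut setup: impl FnMut() -> I,
    mut routine: impl FnMut(I) -> O,
) -> Duration {
    let inputs: Vec<I> = (0..n).map(|_| setup()).collect();
    let start = Instant::now();
    for input in inputs {
        black_box(routine(input));
    }
    start.elapsed()
}
```

Both helpers call `setup` and `routine` the same number of times; only the position of the clock differs, which is why the setup cost shows up in one measurement and not in the other.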