devirtualize: to change a virtual/polymorphic/indirect function call into a static (direct) function call, on the strength of some guarantee that the change is correct -- source: myself
Given a simple trait object, &dyn ToString, created with a statically known type, String:
fn main() {
let name: &dyn ToString = &String::from("Steve");
println!("{}", name.to_string());
}
Does the call to .to_string() use <String as ToString>::to_string() directly? Or only indirectly via the trait's vtable? If indirectly, would it be possible to devirtualize this call? Or is there something fundamental that hinders this optimization?
The motivating code for this question is much more complicated; it uses async trait functions and I'm wondering if returning a Box<dyn Future> can be optimized in some cases.
No.
Rust is a language; it doesn't do anything by itself, it only prescribes semantics.
In this specific case, the Rust language neither prescribes nor forbids devirtualization, so an implementation is permitted (but not required) to perform it.
At the moment, the only stable implementation is rustc with the LLVM backend -- though you can use the Cranelift backend if you feel adventurous.
You can test your code against this implementation on the playground: select "Show LLVM IR" instead of "Run", and "Release" instead of "Debug". You should then be able to check whether there is a virtual call.
A revised version of the code isolates the cast to a trait object and the dynamic call, to make the emitted IR easier to inspect:
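The answer's exact snippet is not reproduced here; a reconstruction along these lines separates the coercion to `&dyn ToString` from the dynamic call, so the relevant LLVM IR is easy to spot (the function name `call_to_string` is an assumption for illustration):

```rust
// Hypothetical reconstruction: the dynamic call lives in its own
// function, so its IR stands out from the rest of main's.
fn call_to_string(name: &dyn ToString) -> String {
    // A virtual call through the vtable -- unless the optimizer
    // inlines this function where the concrete type is known.
    name.to_string()
}

fn main() {
    let name = String::from("Steve");
    // The coercion `&String -> &dyn ToString` happens at the call site.
    println!("{}", call_to_string(&name));
}
```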
When run on the playground, the emitted LLVM IR shows, among other things, that the call to `ToString::to_string` has been replaced by a simple call to `<String as Clone>::clone`: a devirtualized call. (For `String`, the standard library's `ToString` implementation boils down to a clone.)

Unfortunately, you cannot draw any general conclusion from the above example.
Optimizations are finicky. In essence, most optimizations are akin to pattern-matching and replacing, as with regexes: differences that look benign to a human may completely throw off the pattern-matching and prevent the optimization from applying.
The only way to be certain that the optimization is applied in your case, if it matters, is to inspect the emitted assembly.
But, really, in this case, I'd be more worried about the memory allocation than about the virtual call. A virtual call costs about 5 ns of overhead -- though it also inhibits a number of optimizations -- whereas a memory allocation (and the eventual deallocation) routinely costs 20-30 ns.
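As a hedged sketch of that last point (not from the original answer): when the concrete type can be named, returning `impl Future` instead of `Box<dyn Future>` avoids the heap allocation entirely. The `block_on` helper below is a minimal hand-rolled executor, just enough to poll these trivial futures; its name and structure are assumptions for demonstration:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Heap-allocates the future and calls it through a vtable.
fn boxed(x: u32) -> Pin<Box<dyn Future<Output = u32>>> {
    Box::pin(async move { x + 1 })
}

// No allocation: the anonymous future type is returned by value.
fn unboxed(x: u32) -> impl Future<Output = u32> {
    async move { x + 1 }
}

// Minimal no-op executor, only suitable for futures that never
// actually need to be woken.
fn block_on<F: Future>(mut fut: F) -> F::Output {
    fn raw() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { raw() }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw()) };
    let mut cx = Context::from_waker(&waker);
    // SAFETY: `fut` is shadowed here and never moved afterwards.
    let mut fut = unsafe { Pin::new_unchecked(&mut fut) };
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

fn main() {
    assert_eq!(block_on(boxed(1)), 2);
    assert_eq!(block_on(unboxed(1)), 2);
    println!("ok");
}
```

The two functions compute the same thing; the difference is purely in where the future lives (heap vs. caller's stack) and whether polling goes through a vtable.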