Is static dispatch "almost always" faster than boxed dyn trait dispatch?

520 Views Asked by At

The pros/cons of using static vs dynamic dispatch have been addressed in a few related questions, e.g. 1, 2, 3. From a performance perspective, it is intuitive to assume that dynamic dispatch (via trait objects) can be slower than static dispatch, and in certain situations it is a sensible performance optimization to use statically bounded static dispatch (as e.g. suggested in this answer or the enum dispatch crate, which simplifies this pattern). The enum dispatch crate states for instance:

enum_dispatch transforms your trait objects into concrete compound types, increasing their method call speed up to 10x.

I'd assume there are basically two aspects that can make dynamic dispatch slower:

  • Dynamic dispatch requires some sort of boxing, which introduces a memory indirection. This typically affects code that uses these trait objects in collections like Vec, because a vector of pointers is less cache efficient than a vector of enums.
  • Dynamic dispatch prevents inlining functions calls.

This made me wonder if it is actually possible (or even likely?) to encounter situations where static dispatch is slower than dynamic dispatch?

For instance, what about a scenario in which the compiler wouldn't chose to inline the method call anyway, and the effect of the pointer redirection is negligible. In this case the performance in only determined by the dispatching itself. Dynamic dispatch uses a function pointer, so the complexity of the dispatching is independent of the number of possible dispatch branches. Static dispatching on the other hand uses some sort of match over an enum with N branches. How will the compiler handle that under that hood? Will it generate a potentially slow O(N) chain of comparisons to call the right method, or is it safe to assume that it can come up with optimized machine code that is still O(1)? Could using a vtable be actually faster in cases where the number of branches becomes really large?


Note: Of course, this question can and should be answered in each situation specifically by benchmarking the particular code. Nonetheless, I'd like to improve my mental model of what happens under the hood, and to develop a gut feeling whether static dispatch is slower than dynamic dispatch falls into either "almost never", "sometimes", "quite possible", ...

0

There are 0 best solutions below