Optimizing the composition of two fuctions

34 Views Asked by At

While developing my programming skills, I often run into information along the lines of "Don't chain these functions together. Instead, use the built-in function that does both much faster." To use a fake example:

Use average(X) instead of sum(X)/len(X)

Is there something that would stop a compiler from recognizing that sum(X)/len(X) is the same as average(X) and applying the same optimizations?

Or is optimizing compositions of functions analogous to Lisp macros, where you're working on another layer of complexity compared to simply optimizing functions?

1

There are 1 best solutions below

1
Mike Nakis On
  • The compiler will only optimize sum(x)/len(x) when optimizations are enabled. Optimizations are usually enabled only on release builds, whereas on debug builds we refrain from applying most optimizations so as to not render the code un-debuggable. Therefore, using average(x) will save time on your debug builds, which are, incidentally, quite often also your test builds, meaning that your tests will run faster.

  • sum()/len() will be optimized as average() only if the compiler has built-in knowledge of all three of these functions. Note that just because a function is part of the standard library it does not mean that the compiler has built-in knowledge of it. Most likely it doesn't. In the original C language the compiler had built-in knowledge of zero standard library functions. In the modern day most C compilers provide built-in ("intrinsic") implementations of some functions, but you need to check with your specific compiler.

  • For functions that the compiler has no built-in knowledge, the compiler may inline sum(), then it may inline len(), then it may recognize the emerging code pattern which permits an additional optimization, and perform that optimization, essentially producing an inlined version of average(). Note the use of the word "may"; it may happen and it may not happen.

  • However, unless the CPU you are targeting has vector instructions, any functions that contain loops may be disqualified from being inlined under certain optimization strategies. For example, it may be that when optimizing for smaller code size a compiler may avoid inlining functions that contain loops. (And optimizations that favor small code size are preferred by many because a) they include many of the optimizations that yield faster code, and b) they result in better CPU cache utilization.)

That having been said, the usual disclaimers apply:

  • Do not worry so much about micro-optimizations. The optimizations that really matter tend to be algorithmic or architectural optimizations. Micro-optimizations in the code are usually peanuts.
  • Do not optimize anything unless you have sufficient reason to believe that it represents not just overhead but an overhead bottleneck.
  • Favor readability over performance whenever possible.

etc.