I know Java has yet to add tail-call elimination optimization and intends to do so as a late stage addition to Project Loom. My question is instead: does the JIT ever optimize away recursive method calls in their entirety and convert them into an iterative form? It seems nominally possible but relatively difficult so I'm guessing if they did it would be strongly described in some doc, but I'm struggling to track anything on the subject down.
As a follow-up, if the JIT does eliminate recursive calls in some form beyond what's described in Loom, how does that appear on stack traces?
Your question “how does that appear on stack traces?” is one of the unsolved problems of tail call optimization in a JVM. There’s an expectation of manageability of the Java code, especially when code is spending a long time in a recursive algorithm (or even hanging in an endless recursion) and the developer wants to find out what’s going on.
So in short, the JVM (current version of HotSpot) has no tail call optimization. But it is capable of inlining a finite number of recursive invocations, just like it can inline other invocations.
When we use
we get values similar to those discussed in Why is the max recursion depth I can reach non-deterministic? Running with
-XX:CompileCommand=print,Recursion.recursiveCounter
-XX:+UnlockDiagnosticVMOptions
-XX:+PrintInlining
printsand
The interesting part is the
test edx,edx
instruction matching ourlimit == 0
condition. If this condition is not fulfilled, there’s acmp edx,1h
test following, effectively testing what the next recursive invocation would test and if this condition also isn’t fulfilled, the code will performadd r8d,2h
andadd edx,0fffffffeh
, incrementing thecount
by two and decrementing thelimit
by two.So we clearly see the effect of inlining one level of recursion and fusing the operations. The inlining diagnostics tells us that a limit has been crossed, so it’s worth to investigate what happens when specifying, e.g.
-XX:MaxRecursiveInlineLevel=4
:We can see that now, the compiled code has learned to add up to five at once. The application will also print a much larger limit of nested invocations. The comments behind the instruction show that the meta information about the nested invocations is still present, even if there are no individual instructions associated with an invocation level. This implies that a stack trace representing the whole original call chain could still be constructed.
Of course, this is not the same as actual tail call optimization. Regardless of how high we set the limit, it’s still inlining a finite number of invocation and it’s already recognizable that setting the limit too high would yield unreasonable code size.
So the answer to the literal question “Does the Java JIT ever optimize away recursive method calls?” is “Yes, it does”. But not the entire recursion but a finite number of calls, in other words, not in the form of tail call optimization.