JDK-8308023 - Exponential classfile blowup with nested try/finally

Mon Jul 31 09:44:03 UTC 2023

On 30/07/2023 22:24, Archie Cobbs wrote:
> In summary, JSR/RET is gone now, and the verifier also makes it 
> difficult to synthesize "subroutines". This makes the compiler's job a 
> lot harder wrt. finally blocks.

I tend to agree with this analysis. I think there are some tricks that 
could be done to reduce the overall size, or avoid combinatorial 
explosion, but, as observed in this thread, the caveat is that such 
tricks would only work with certain code shapes. There are two things 
that typically require duplication:

* the stack types might be different at the two exit points (because of 
the extra exception carried on the exceptional path)
* the target instruction we should "jump to" immediately after executing 
the finalizer might vary depending on the exit point

This means that not all duplicated finalizers we see in the generated 
bytecode are really 100% duplicates. The case Jan brings up is a little 
different, as there's nested try/finally there (so the innermost 
exceptional exit point is repeated twice). That said, all really bad 
combinatorial cases will end up with nested try/finally in some shape 
or form, so perhaps targeting this specific idiom is a pragmatic 
compromise.
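
To make the two points above concrete, here is a hand-written Java 
sketch of roughly how a single try/finally is laid out once the 
finalizer is copied to each exit point. It is an approximation written 
in source form, not javac's actual output, and the class and method 
names (FinallyDuplication, withFinally, expanded) are made up for 
illustration. Note how the two copies differ in what they do next: one 
returns, the other rethrows the pending exception.

```java
import java.util.ArrayList;
import java.util.List;

public class FinallyDuplication {
    // Records the order in which the pieces run, so both forms can be compared.
    static final List<String> log = new ArrayList<>();

    // Source-level form: the finalizer appears once.
    static int withFinally(boolean fail) {
        try {
            if (fail) {
                throw new IllegalStateException("boom");
            }
            log.add("body");
            return 1;
        } finally {
            log.add("finalizer");
        }
    }

    // Hand-expanded approximation (an assumption, not real javac output):
    // the finalizer appears twice, once per exit point.
    static int expanded(boolean fail) {
        int result;
        try {
            if (fail) {
                throw new IllegalStateException("boom");
            }
            log.add("body");
            result = 1;
        } catch (Throwable t) {
            // Copy for the exceptional exit: an extra exception value is
            // live here, and the continuation is "rethrow".
            log.add("finalizer");
            throw t;
        }
        // Copy for the normal exit: no exception is live, and the
        // continuation is "return the saved result".
        log.add("finalizer");
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withFinally(false) + " " + log);
        log.clear();
        System.out.println(expanded(false) + " " + log);
    }
}
```

Both methods behave the same for either value of the flag; only the 
second one makes the duplication visible.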

I just wanted to add a concrete note on how the shape of the generated 
try/finally blocks can affect performance. Look at this bug:

https://bugs.openjdk.org/browse/JDK-8267532

The problem here is that using try-with-resources (TWR) is slower than 
not using it. The issue is that javac emits two calls to 
Resource::close, one in the "hot path" and one in the exceptional path. 
Unfortunately, the call to "close" in the exceptional path is not 
inlined (because that path is never taken), so the JIT compiler sees a 
way for the Resource to escape the method, meaning that the resource 
object can no longer be scalarized (resulting in increased GC 
activity).
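
As a rough illustration of that shape, the sketch below puts a TWR 
statement next to a hand-expanded version with the two close() calls 
written out. The expansion is my approximation of the layout, not 
javac's exact output, and the names (TwrShape, Resource, withTwr, 
expanded, closeCalls) are hypothetical. The point is only that the 
second close() sits on a path that normally never runs.

```java
public class TwrShape {
    // Counts close() calls so the two forms can be compared.
    static int closeCalls = 0;

    // Small, short-lived resource: the kind of object one would hope
    // the JIT can scalarize when it does not escape.
    static final class Resource implements AutoCloseable {
        final int value;

        Resource(int value) {
            this.value = value;
        }

        @Override
        public void close() {
            closeCalls++;
        }
    }

    // Source-level form: close() is implicit and appears nowhere.
    static int withTwr(int v) {
        try (Resource r = new Resource(v)) {
            return r.value + 1;
        }
    }

    // Hand-expanded approximation (an assumption about the layout):
    // close() is called from two distinct places.
    static int expanded(int v) {
        Resource r = new Resource(v);
        int result;
        try {
            result = r.value + 1;
        } catch (Throwable t) {
            try {
                r.close(); // cold copy: only reached if the body throws
            } catch (Throwable suppressed) {
                t.addSuppressed(suppressed);
            }
            throw t;
        }
        r.close(); // hot copy: reached on every normal execution
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withTwr(41) + " " + expanded(41) + " " + closeCalls);
    }
}
```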

So, my general sense is that having a combinatorial number of exit 
points (which will only rarely be taken) might be detrimental to 
performance anyway - as C2 will see a lot of "cold" code paths by which 
objects can escape. That said, even if javac deduplicates the exit 
points, I'm not sure that C2 would be able to keep the code in the same 
shape as generated by javac, as C2 likes to emit one version of the code 
for the "common, hot path", and one version of the code for the 
"uncommon, cold path" (which is typically executed when some invariants 
are violated). So, reducing bytecode footprint in javac might well 
result in more work for the C2 compiler which would have to disentangle 
the nest of try/finally blocks.

Maurizio