RFR(M): 8234500: [lworld] Multiple failures with -XX:+DeoptimizeALot

Wed Dec 11 10:46:59 UTC 2019

Hi,

please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8234500
http://cr.openjdk.java.net/~thartmann/8234500/webrev.00/

This patch fixes all the places in the JITs where inline type buffering might trigger deoptimization
and re-execution of the corresponding bytecode in the interpreter is necessary for correctness.

There's still a problem with the calling convention and C1 that is hard to fix. The problem is that
when calling from C2 to C1 compiled code (i.e., no adapter in between) we need to buffer scalarized
inline type arguments in the C1 compiled entry point of the method (because C1 requires an oop). To
do that, we might need to call into the runtime and that call might trigger deoptimization of the C1
compiled caller. That's a problem because the arguments are still scalarized and neither the deopt
code nor the interpreter knows how to handle that state. It's not a problem for all the other cases
where buffering is necessary because it's either in the c2i adapter or C2 compiled code where we can
handle this already. I'll address this issue with a separate bug.

C1:
- c1_FrameMap.cpp: Account for the additional stack slot for storing the increment for stack
extension when computing the frame size.
- Need to re-execute flattened array load/store, defaultvalue, withfield and flattened putfield when
deoptimizing on buffering. To propagate this information to the debug info, I've added a
_should_reexecute field to ValueStack and IRScopeDebugInfo. We also need to keep track of the
state_before.
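
As a rough illustration of what the re-execute flag changes, here is a toy model. It is not HotSpot
code: DebugInfo, resume_bci and the one-unit-per-bytecode layout are made up for this sketch, and
only the field name should_reexecute is borrowed from the patch description. The idea is that a
bytecode whose buffer allocation may deoptimize has not produced its result yet, so the interpreter
has to start at the same bci again instead of continuing with the next bytecode.

```cpp
#include <cstddef>
#include <vector>

// Toy model, not HotSpot code: every name below is hypothetical.
enum class Op { LoadFlatElem, Add, Return };

struct DebugInfo {
    std::size_t bci;        // bytecode index recorded at the safepoint
    bool should_reexecute;  // true: the interpreter repeats the bytecode at bci
};

// Every toy bytecode is one unit long.
inline std::size_t next_bci(std::size_t bci) { return bci + 1; }

// Where the interpreter resumes after a deoptimization at this safepoint.
inline std::size_t resume_bci(const DebugInfo& info) {
    return info.should_reexecute ? info.bci : next_bci(info.bci);
}

// In this model only the flattened array load buffers its result, so only
// that bytecode must be repeated when the allocation deoptimizes.
inline bool needs_reexecute_on_buffering(Op op) {
    return op == Op::LoadFlatElem;
}

// Builds the debug info for the bytecode at bci.
inline DebugInfo describe_safepoint(const std::vector<Op>& code, std::size_t bci) {
    return DebugInfo{bci, needs_reexecute_on_buffering(code.at(bci))};
}
```

With code = {LoadFlatElem, Add, Return}, a deoptimization while buffering at bci 0 resumes at bci 0,
whereas a safepoint recorded at bci 1 resumes at bci 2.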

C2:
- callGenerator.cpp: When late inlining a method handle call that returns an inline type, we need to
allocate a buffer for the returned ValueTypeNode because the caller expects an oop return. Do this
before the method handle call in case the buffer allocation triggers deoptimization and we need to
(re-)execute the call in the interpreter.
- graphKit.cpp: Make sure the call is (re-)executed in the interpreter, if buffering of inline type
arguments triggers deoptimization. Also, make sure to re-execute a store to a non-flattened field if
buffering triggers deoptimization.
- library_call.cpp: The caller expects an oop when incrementally inlining an intrinsic that returns
an inline type. Make sure the call is re-executed if the allocation triggers a deoptimization. Also
make sure unsafe accesses that require buffering are re-executed if the allocation triggers
deoptimization.
- parse1.cpp: Returning an inline type might require buffering. Make sure we (re-)execute the return
if allocation triggers deoptimization. Also, make sure we allocate in the callee when returning from
an incrementally inlined method because the caller expects an oop.
- parse2.cpp: Re-execute flattened array load/store and acmp if buffering triggers deoptimization.
Also added missing decorators to store_flattened.
- parse3.cpp: Re-execute field store if buffering triggers deoptimization.
- parseHelper.cpp: Re-execute withfield if buffering triggers deoptimization.
- frame_x86.cpp: When extending the stack in the callee method entry to make room for unpacking of
inline type args, we keep a copy of the sender pc at the expected location in the callee frame. If
the sender pc is patched due to deoptimization (see frame::deoptimize -> frame::patch_pc), the copy
is not consistent anymore.
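
The callGenerator.cpp item above hinges on ordering, which the following toy model tries to make
concrete. It is only a sketch under the assumption that a deoptimization during buffer allocation
makes the interpreter execute the invoke bytecode from scratch. The names Order and
calls_executed_when_allocation_deopts are invented for the example and do not exist in HotSpot.

```cpp
// Toy model, not HotSpot code: every name below is hypothetical.
enum class Order { AllocateBeforeCall, AllocateAfterCall };

// Counts how often the callee runs when the buffer allocation in compiled
// code always triggers a deoptimization and the interpreter then executes
// the invoke bytecode again.
inline int calls_executed_when_allocation_deopts(Order order) {
    int calls = 0;
    if (order == Order::AllocateAfterCall) {
        ++calls;  // compiled code already performed the call before allocating
    }
    // The allocation deoptimizes here; the interpreter takes over and
    // executes the invoke bytecode itself.
    ++calls;
    return calls;
}
```

In this model, allocating before the call runs the callee once, while allocating after the call
runs it twice, so any side effect of the callee would be observed twice.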

All tests now pass with -XX:+DeoptimizeALot -XX:-TieredCompilation.

Thanks,
Tobias