RFR: 8297036: Generalize C2 stub mechanism

Thu Nov 24 15:40:23 UTC 2022

On Wed, 16 Nov 2022 15:03:07 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> Currently, we have two implementations of out-of-line stubs in C2, one for safepoint poll stubs (C2SafepointPollStubTable in output.hpp) and another for nmmethod entry barriers (C2EntryBarrierStubTable in output.hpp). I will need a few more for Lilliput: One for checking lock-stack size in method prologue, one for handling lock failures (both for fast-locking), and another one for load-klass slow-path. It would be good to generalize the mechanism and consolidate the existing uses on the new general mechanism.
> 
> Testing:
>  - [x] tier1 (x86_64, x86_32, aarch64)
>  - [x] tier2 (x86_64, x86_32, aarch64)
>  - [x] tier3 (x86_64, x86_32, aarch64)

Hi Xiaolin,

> This is a very nice refactoring, but we find this patch can interestingly break the RISC-V native fastdebug/slowdebug building process.
> 
> The root cause is the `measure_code_size()` is always a `0` value, for `PhaseOutput::init_buffer()` is too early to collect all CodeStubs emitted in the future. This issue appears to exist before this patch.
> 
> What is different is, originally `C2EntryBarrierStubTable::estimate_stub_size()` seems to return a non-zero value at all times for now (nmethod entry barriers for continuations). However, due to the pre-existing issue mentioned above, our new code using a sum to calculate the total size will always return a `0`, shrinking the `measure_code_size()` from a non-zero value to `0`. So the `stub_req` is smaller than before. Some code generated in RISC-V would just exceed the smaller threshold when emitting trampolines stubs, causing some failures.
> 
> The below print can show the zero value when `-XX:+UnlockDiagnosticVMOptions -XX:+UseNewCode` are on.
> 
> ```
> diff --git a/src/hotspot/share/opto/output.cpp b/src/hotspot/share/opto/output.cpp
> index f189a12316b..1416c196faf 100644
> --- a/src/hotspot/share/opto/output.cpp
> +++ b/src/hotspot/share/opto/output.cpp
> @@ -1244,7 +1244,11 @@ CodeBuffer* PhaseOutput::init_buffer() {
> 
>    BarrierSetC2* bs = BarrierSet::barrier_set()->barrier_set_c2();
>    stub_req += bs->estimate_stub_size();
> -  stub_req += _stub_list.measure_code_size();
> +  int newlist_size = _stub_list.measure_code_size();
> +  stub_req += newlist_size;
> +  if (UseNewCode) {
> +    tty->print_cr("the measured size is: %d", newlist_size);
> +  }
> 
>    // nmethod and CodeBuffer count stubs & constants as part of method's code.
>    // class HandlerImpl is platform-specific and defined in the *.ad files.
> ```
> 
> one hs_err file here. (the line number may not match, for I added some debugging stuff in my codebase) [hs_err_pid1282.log](https://github.com/openjdk/jdk/files/10085047/hs_err_pid1282.log)

I don't quite understand. Are you saying that in init_buffer(), we don't have any method_entry_barrier stubs, yet, and therefore we return 0 there? In this case I don't see how it could have worked before. Or are you saying that we have one method entry barrier stub, but measuring its size returns 0? In this case I don't understand why. The measurement creates a scratch buffer, and emits one stub to it, and measures the size of the generated code. C2MethodEntryBarrier::emit() doesn't have any relevant conditional in it, it should always emit something, and the size should be > 0. Or what am I missing?

Thank you,
Roman

-------------

PR: https://git.openjdk.org/jdk/pull/11188