RFR: 8319901: Recursive lightweight locking: ppc64le implementation

Martin Doerr mdoerr at openjdk.org
Wed Feb 21 08:47:57 UTC 2024


On Fri, 10 Nov 2023 13:21:58 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:

> Draft implementation of recursive lightweight JDK-8319796 for ppc64le.
> 
> Porters feel free to discard this and implement something from scratch, take this initial draft and run with it, or send patches to me and if you want me to drive the PR.
> 
> Some notes:
> ppc64le deviates from the other ports, it shares it C2 locking code with the native call wrapper. This draft unifies the behavior across the ports by using the C1/interpreter locking for native call wrapper. If it is important to have a fast path for the inflated monitor case for native call wrapper, it could be done without having it share its implementation with C2. 
> It would also be possible to change this behavior on all ports (share the C2 code), as I believe the C2 assumptions always hold for the native call wrapper, the monitor exit and monitor enter will be balanced and structured.

> > Thanks a lot! LGTM.
> > Regarding aarch64: Should we file a JBS issue for the usage of `box`?
> > https://github.com/openjdk/jdk/blob/d31fd78d963d5d103b1b1bf66ae0bdbe4be2b790/src/hotspot/cpu/aarch64/aarch64.ad#L16470
> > 
> > has no kill effect for `box`, but `C2_MacroAssembler::fast_lock_lightweight` kills it. It may be never used, but it's a trap someone might run into. x86_64 uses `USE_KILL box`.
> > I think that it currently works, but I'd prefer to use a temp register and hopefully remove the `box` completely at some point of time. It could also be cleaned up together with other changes if you prefer that.
> 
> This came up in the review as well. The issue with `USE_KILL` and `KILL` is that they require a bound register to use. So we must restrict the box to a specific real register. (Works on x86 because it box is bound to `RAX`/`EAX`).
> 
> Should create a JBS issue for this. And maybe it is to dangerous to leave it as is until we can create a C2 Node that does not require the box.
> 
> The only potential fix I see is to actually bind the register. i.e.
> 
> ```diff
> diff --git a/src/hotspot/cpu/aarch64/aarch64.ad b/src/hotspot/cpu/aarch64/aarch64.ad
> index a07ff041c48..97d7898ceba 100644
> --- a/src/hotspot/cpu/aarch64/aarch64.ad
> +++ b/src/hotspot/cpu/aarch64/aarch64.ad
> @@ -16463,14 +16463,14 @@ instruct cmpFastUnlock(rFlagsReg cr, iRegP object, iRegP box, iRegPNoSp tmp, iRe
>    ins_pipe(pipe_serial);
>  %}
>  
> -instruct cmpFastLockLightweight(rFlagsReg cr, iRegP object, iRegP box, iRegPNoSp tmp, iRegPNoSp tmp2)
> +instruct cmpFastLockLightweight(rFlagsReg cr, iRegP object, iRegP_R0 box, iRegPNoSp tmp, iRegPNoSp tmp2)
>  %{
>    predicate(LockingMode == LM_LIGHTWEIGHT);
>    match(Set cr (FastLock object box));
> -  effect(TEMP tmp, TEMP tmp2);
> +  effect(TEMP tmp, TEMP tmp2, USE_KILL box);
>  
>    ins_cost(5 * INSN_COST);
> -  format %{ "fastlock $object,$box\t! kills $tmp,$tmp2" %}
> +  format %{ "fastlock $object,$box\t! kills $tmp,$tmp2,$box" %}
>  
>    ins_encode %{
>      __ fast_lock_lightweight($object$$Register, $box$$Register, $tmp$$Register, $tmp2$$Register);
> @@ -16479,14 +16479,14 @@ instruct cmpFastLockLightweight(rFlagsReg cr, iRegP object, iRegP box, iRegPNoSp
>    ins_pipe(pipe_serial);
>  %}
>  
> -instruct cmpFastUnlockLightweight(rFlagsReg cr, iRegP object, iRegP box, iRegPNoSp tmp, iRegPNoSp tmp2)
> +instruct cmpFastUnlockLightweight(rFlagsReg cr, iRegP object, iRegP_R0 box, iRegPNoSp tmp, iRegPNoSp tmp2)
>  %{
>    predicate(LockingMode == LM_LIGHTWEIGHT);
>    match(Set cr (FastUnlock object box));
> -  effect(TEMP tmp, TEMP tmp2);
> +  effect(TEMP tmp, TEMP tmp2, USE_KILL box);
>  
>    ins_cost(5 * INSN_COST);
> -  format %{ "fastunlock $object,$box\t! kills $tmp, $tmp2" %}
> +  format %{ "fastunlock $object,$box\t! kills $tmp, $tmp2, $box" %}
>  
>    ins_encode %{
>      __ fast_unlock_lightweight($object$$Register, $box$$Register, $tmp$$Register, $tmp2$$Register);
> ```
> 
> Maybe someone with better knowledge of the ADLC knows a better solution.

That is a known limitation. I've filed https://bugs.openjdk.org/browse/JDK-8326385. We should better continue discussions there.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16611#issuecomment-1956152214


More information about the hotspot-dev mailing list