RFR: 8319799: Recursive lightweight locking: x86 implementation
Roman Kennke
rkennke at openjdk.org
Fri Nov 10 15:05:01 UTC 2023
On Fri, 10 Nov 2023 12:59:27 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:
> Implements the x86 port of JDK-8319796.
>
> There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper.
>
> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully.
>
> Only if the recursive lightweight [un]lock fails does it look at the mark word.
>
> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime.
>
> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided.
>
> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime.
>
> The x86 C2 port also has some extra oddities.
>
> The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware.
>
> The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime.
>
> The contended unlock was also moved to the code stub.
Very good stuff! I like how the C2 code-paths are re-shaped. I only have minor comments and a question.
src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp line 123:
> 121: __ movptr(Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)), _thread);
> 122:
> 123: // succsesor null check.
typo: succsesor -> successor
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 971:
> 969:
> 970: // Check if lock-stack is full.
> 971: cmpl(Address(thread, JavaThread::lock_stack_top_offset()), LockStack::end_offset() - 1);
I believe you can mov the movl(top, Address(thread, JavaThread::lock_stack_top_offset())) here, and use top in both checks.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 979:
> 977: jccb(Assembler::equal, push);
> 978:
> 979: // Check for monitor (0b10).
It baffles me a little bit that we check for the monitor only after we checked for full-lock-stack and recursive locking. This means that if the object is monitor-locked, it has to wait for 3 loads (mark-word, top-of-stack-offset and top-of-stack) and two (pointless) test-and-branches. This seems to optimise the lw-locking case at the expense of monitor-locking case. I'm not sure that this is the right trade-off. You said in the description that this scales better? Can you elaborate on that?
src/hotspot/cpu/x86/macroAssembler_x86.cpp line 9817:
> 9815:
> 9816: // Check if the lock-stack is full.
> 9817: cmpl(Address(thread, JavaThread::lock_stack_top_offset()), LockStack::end_offset());
I believe you can mov the movl(top, Address(thread, JavaThread::lock_stack_top_offset())) here, and use top in both checks.
-------------
Changes requested by rkennke (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16607#pullrequestreview-1724936988
PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1389503804
PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1389490845
PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1389500182
PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1389489016
More information about the hotspot-dev
mailing list