RFR: 8319799: Recursive lightweight locking: x86 implementation [v13]

Roman Kennke rkennke at openjdk.org
Thu Jan 25 13:28:38 UTC 2024


On Thu, 25 Jan 2024 09:16:43 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:

>> Implements the x86 port of JDK-8319796.
>> 
>> The port implementation has two major parts: the C2 part, and the part shared by the interpreter, C1 and the native call wrapper.
>> 
>> The biggest change for both parts is that we check the lock stack first; if it is a recursive lightweight [un]lock, we simply pop/push and finish successfully.
>> 
>> Only if the recursive lightweight [un]lock fails does it look at the mark word. 
>> 
>> The shared part calls into the runtime if it is an unstructured exit, the monitor is inflated, or the mark word transition fails.
>> 
>> C2 operates under a few more assumptions, namely that the locking is structured and balanced. This means that some checks can be elided.
>> 
>> First, this means that in C2 unlock, if the obj is not on the top of the lock stack, it must be inflated. Conversely, if we reach the inflated C2 unlock, the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an uncontended unlock by writing null to the owner or, if contention happens, simply write the thread to the owner and jump to the runtime.
>> 
>> The x86 C2 port also has some extra oddities. 
>> 
>> The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain Intel hardware, and no noticeable downside on other tested x86 hardware.
>> 
>> The fast path is written to avoid going through conditional branches. Because of this, and to keep the ZF output correct, the code does some actions eagerly (decrementing the held monitor count, popping from the lock stack) and jumps to a code stub if a slow path is required. The stub restores the thread-local state to a correct state before jumping to the runtime.
>> 
>> The contended unlock was also moved to the code stub.
>
> Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Update variable names in ad files
>  - Preload markWord unconditionally

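For readers following the quoted description, the "check the lock stack first" idea can be sketched roughly as below. This is only an illustration under my own assumptions, not the code in the PR: the type LockStackSketch, its fixed capacity, and the member names are hypothetical, and the real fast path is emitted as x86 assembly and also has to deal with the mark word, which is left out here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for a per-thread lock stack.
// Names and capacity are assumptions made for this sketch only.
struct LockStackSketch {
    static constexpr std::size_t kCapacity = 8;
    std::array<const void*, kCapacity> slots{};
    std::size_t top = 0;  // number of entries currently on the stack

    bool is_full() const { return top == kCapacity; }

    // First (non-recursive) lightweight lock of obj: just push it.
    void push(const void* obj) {
        assert(!is_full());
        slots[top++] = obj;
    }

    // Recursive lock: succeeds only if obj is already the top entry and
    // there is room to push it again. On failure the caller would fall
    // back to inspecting the mark word.
    bool try_recursive_enter(const void* obj) {
        if (is_full() || top == 0 || slots[top - 1] != obj) {
            return false;
        }
        slots[top++] = obj;
        return true;
    }

    // Recursive unlock: succeeds only if the two topmost entries are
    // both obj, in which case one of them is popped. Otherwise the
    // caller would again fall back to the mark word.
    bool try_recursive_exit(const void* obj) {
        if (top < 2 || slots[top - 1] != obj || slots[top - 2] != obj) {
            return false;
        }
        --top;
        return true;
    }
};
```

In this sketch a failed try_recursive_enter or try_recursive_exit leaves the stack untouched, so the caller can continue with the mark word path (or the runtime) without any cleanup.
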
A few (relatively minor) comments, still.

src/hotspot/cpu/x86/x86_32.ad line 13807:

> 13805:   predicate(LockingMode == LM_LIGHTWEIGHT);
> 13806:   match(Set cr (FastLock object box));
> 13807:   effect(TEMP eax_reg, TEMP tmp, USE_KILL box, TEMP thread);

Consider changing USE_KILL box to TEMP box. Same overall considerations (long-term, in a follow-up) as in aarch64.

src/hotspot/cpu/x86/x86_32.ad line 13820:

> 13818:   predicate(LockingMode == LM_LIGHTWEIGHT);
> 13819:   match(Set cr (FastUnlock object eax_reg));
> 13820:   effect(TEMP tmp, USE_KILL eax_reg, TEMP thread);

I think USE_KILL eax_reg can also be changed to just TEMP; we're not really using an input here, right?

src/hotspot/cpu/x86/x86_64.ad line 12434:

> 12432:   predicate(LockingMode == LM_LIGHTWEIGHT);
> 12433:   match(Set cr (FastLock object box));
> 12434:   effect(TEMP rax_reg, TEMP tmp, USE_KILL box);

Same here.

src/hotspot/cpu/x86/x86_64.ad line 12446:

> 12444:   predicate(LockingMode == LM_LIGHTWEIGHT);
> 12445:   match(Set cr (FastUnlock object rax_reg));
> 12446:   effect(TEMP tmp, USE_KILL rax_reg);

And here.

-------------

Changes requested by rkennke (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16607#pullrequestreview-1843738658
PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466374300
PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466375341
PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466375763
PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1466375954