RFR: 8319801: Recursive lightweight locking: aarch64 implementation [v10]

Roman Kennke rkennke at openjdk.org
Thu Jan 25 13:16:38 UTC 2024


On Thu, 25 Jan 2024 11:50:48 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:

>> Implements the aarch64 port of JDK-8319796.
>> 
>> There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper.
>> 
>> The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully.
>> 
>> Only if the recursive lightweight [un]lock fails does it look at the mark word. 
>> 
>> For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime.
>> 
>> The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. 
>> 
>> First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. 
>> 
>> The aarch64 C2 port tries to avoid stronger memory semantics where ever possible. In C2 lock it first does a relaxed load of the mark word to check for inflation. Both lock and unlock uses a load/store exclusive register pair to transition the mark word.
>
> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Drop memory order comments

I think it's good now. I only have a comment in aarch64.ad, please decide for yourself if you want to do anything about it. ;-)

src/hotspot/cpu/aarch64/aarch64.ad line 16477:

> 16475:   predicate(LockingMode == LM_LIGHTWEIGHT);
> 16476:   match(Set cr (FastLock object box));
> 16477:   effect(TEMP tmp, TEMP tmp2);

I believe you should declare box as TEMP here, because that is how it is used. It probably works by accident because box is only used by us and has no meaning outside of the locking code. Also, I think there is no need to match the box register to the 2nd FastLock input. If you change that, you might need to add a match_edge() in FastLockNode and FastUnlockNode in locknode.hpp that excludes the box argument when LW locking is on. I believe the purpuse of the box node/register is to track the stack-location of the stack-lock in stack-locking, and ensure that unlock is getting the same location there as the corresponding lock.
Thinking about this... this should be done as a follow-up or else it becomes too intrusive for this change.

-------------

Marked as reviewed by rkennke (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16608#pullrequestreview-1843709973
PR Review Comment: https://git.openjdk.org/jdk/pull/16608#discussion_r1466356507


More information about the hotspot-dev mailing list