RFR: 8319947: Recursive lightweight locking: s390x implementation

Amit Kumar amitkumar at openjdk.org
Tue May 21 12:58:01 UTC 2024


On Tue, 21 May 2024 11:34:52 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:

>> s390x port for recursive locking. 
>> 
>> testing:
>> - [x] build fastdebug-vm
>> - [x] build slowdebug-vm
>> - [x] build release-vm
>> - [x] build optimized-vm
>> - [x] ./test/jdk/java/util/concurrent (fastdebug-vm)
>>   - [x] with C1
>>   - [x] with C2
>>   - [x] with interpreter 
>> - [x] ./test/jdk/java/util/concurrent (release-vm)
>>   - [x] with C1
>>   - [x] with C2
>>   - [x] with interpreter 
>> - [x] ./test/jdk/java/util/concurrent (slowdebug-vm)
>>   - [x] with C1
>>   - [x] with C2
>>   - [x] with interpreter
>> - [x] tier1 with fastdebug-vm 
>> - [x] tier1 with slowdebug-vm
>> - [x] tier1 with release-vm
>> 
>> *BenchMarks*:
>> 
>> Results from Performance LPARs : 
>> 
>> 
>> Locking Mode = 1 (without Patch)
>> 
>> Benchmark                                (innerCount)  Mode  Cnt     Score    Error  Units
>> LockUnlock.testContendedLock                      100  avgt   12     5.144 ±  0.035  ns/op
>> LockUnlock.testRecursiveLockUnlock                100  avgt   12  3824.742 ± 89.475  ns/op
>> LockUnlock.testRecursiveSynchronization           100  avgt   12    25.348 ±  0.559  ns/op
>> LockUnlock.testSerialLockUnlock                   100  avgt   12   466.629 ±  3.036  ns/op
>> LockUnlock.testSimpleLockUnlock                   100  avgt   12   468.532 ±  1.793  ns/op
>> Finished running test 'micro:vm.lang.LockUnlock'
>> 
>> Locking Mode = 1 (with patch)
>> 
>> Benchmark                                (innerCount)  Mode  Cnt     Score    Error  Units
>> LockUnlock.testContendedLock                      100  avgt   12     5.146 ±  0.027  ns/op
>> LockUnlock.testRecursiveLockUnlock                100  avgt   12  3833.175 ± 75.863  ns/op
>> LockUnlock.testRecursiveSynchronization           100  avgt   12    25.206 ±  0.519  ns/op
>> LockUnlock.testSerialLockUnlock                   100  avgt   12   473.973 ±  2.103  ns/op
>> LockUnlock.testSimpleLockUnlock                   100  avgt   12   470.749 ±  2.229  ns/op
>> Finished running test 'micro:vm.lang.LockUnlock'
>> 
>> 
>> 
>> 
>> Locking Mode = 2  (without Patch)
>> 
>> Benchmark                                (innerCount)  Mode  Cnt      Score    Error  Units
>> LockUnlock.testContendedLock                      100  avgt   12      4.688 ±  0.051  ns/op
>> LockUnlock.testRecursiveLockUnlock                100  avgt   12  12800.544 ± 92.265  ns/op
>> LockUnlock.testRecursiveSynchronization           100  avgt   12     26.486 ±  2.229  ns/op
>> LockUnlock.testSerialLockUnlock                   100  avgt   12    424.499 ±  0.416  ns/op
>> LockUnlock.te...
>
> src/hotspot/cpu/s390/macroAssembler_s390.cpp line 6015:
> 
>> 6013:     // we will encounter a loop while handling the inflated monitor case
>> 6014:     // so, we need to make sure, when we reach there only top one object is removed.
>> 6015:     // if we load top there then it could result into infinite loop, So preserving top is a Must here;
> 
> I assume this refers to the assert/debug code that checks that the lock stack does not contain the object when doing inflated unlocking. The debug code could just unconditionally reload top from the thread.

probably we can do something like this: 

diff --git a/src/hotspot/cpu/s390/macroAssembler_s390.cpp b/src/hotspot/cpu/s390/macroAssembler_s390.cpp
index 5a77e0d49f2..cfde21c84f0 100644
--- a/src/hotspot/cpu/s390/macroAssembler_s390.cpp
+++ b/src/hotspot/cpu/s390/macroAssembler_s390.cpp
@@ -5977,7 +5977,7 @@ void MacroAssembler::compiler_fast_unlock_lightweight_object(Register obj, Regis
   assert_different_registers(obj, tmp1, tmp2);
 
   // Handle inflated monitor.
-  NearLabel inflated, inflated_load_monitor;
+  NearLabel inflated, inflated_load_monitor, inflated_intermediate ;
   // Finish fast unlock successfully. MUST reach to with flag == EQ.
   NearLabel unlocked;
   // Finish fast unlock unsuccessfully. MUST branch to with flag == NE.
@@ -6021,7 +6021,7 @@ void MacroAssembler::compiler_fast_unlock_lightweight_object(Register obj, Regis
     // Check for monitor (0b10).
     z_lg(mark, Address(obj, oopDesc::mark_offset_in_bytes()));
     z_tmll(mark, markWord::monitor_value);
-    z_brnaz(inflated);
+    z_brnaz(inflated_intermediate);
 
 #ifdef ASSERT
     // Check header not unlocked (0b01).
@@ -6063,6 +6063,8 @@ void MacroAssembler::compiler_fast_unlock_lightweight_object(Register obj, Regis
     stop("Fast Unlock not monitor");
 #endif // ASSERT
 
+    bind(inflated_intermediate);
+    z_lgf(top, Address(Z_thread, JavaThread::lock_stack_top_offset()));
     bind(inflated);
 
 #ifdef ASSERT


But instead I kept the code a bit similar to other architectures and for future just added the comment as a warning to be careful which tweaking the code. I guess except this comment everything is fine ?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18878#discussion_r1608279960


More information about the hotspot-dev mailing list