RFR: 8319947: Recursive lightweight locking: s390x implementation
Amit Kumar
amitkumar at openjdk.org
Tue May 21 12:58:01 UTC 2024
On Tue, 21 May 2024 11:34:52 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:
>> s390x port for recursive locking.
>>
>> testing:
>> - [x] build fastdebug-vm
>> - [x] build slowdebug-vm
>> - [x] build release-vm
>> - [x] build optimized-vm
>> - [x] ./test/jdk/java/util/concurrent (fastdebug-vm)
>> - [x] with C1
>> - [x] with C2
>> - [x] with interpreter
>> - [x] ./test/jdk/java/util/concurrent (release-vm)
>> - [x] with C1
>> - [x] with C2
>> - [x] with interpreter
>> - [x] ./test/jdk/java/util/concurrent (slowdebug-vm)
>> - [x] with C1
>> - [x] with C2
>> - [x] with interpreter
>> - [x] tier1 with fastdebug-vm
>> - [x] tier1 with slowdebug-vm
>> - [x] tier1 with release-vm
>>
>> *BenchMarks*:
>>
>> Results from Performance LPARs :
>>
>>
>> Locking Mode = 1 (without Patch)
>>
>> Benchmark (innerCount) Mode Cnt Score Error Units
>> LockUnlock.testContendedLock 100 avgt 12 5.144 ± 0.035 ns/op
>> LockUnlock.testRecursiveLockUnlock 100 avgt 12 3824.742 ± 89.475 ns/op
>> LockUnlock.testRecursiveSynchronization 100 avgt 12 25.348 ± 0.559 ns/op
>> LockUnlock.testSerialLockUnlock 100 avgt 12 466.629 ± 3.036 ns/op
>> LockUnlock.testSimpleLockUnlock 100 avgt 12 468.532 ± 1.793 ns/op
>> Finished running test 'micro:vm.lang.LockUnlock'
>>
>> Locking Mode = 1 (with patch)
>>
>> Benchmark (innerCount) Mode Cnt Score Error Units
>> LockUnlock.testContendedLock 100 avgt 12 5.146 ± 0.027 ns/op
>> LockUnlock.testRecursiveLockUnlock 100 avgt 12 3833.175 ± 75.863 ns/op
>> LockUnlock.testRecursiveSynchronization 100 avgt 12 25.206 ± 0.519 ns/op
>> LockUnlock.testSerialLockUnlock 100 avgt 12 473.973 ± 2.103 ns/op
>> LockUnlock.testSimpleLockUnlock 100 avgt 12 470.749 ± 2.229 ns/op
>> Finished running test 'micro:vm.lang.LockUnlock'
>>
>>
>>
>>
>> Locking Mode = 2 (without Patch)
>>
>> Benchmark (innerCount) Mode Cnt Score Error Units
>> LockUnlock.testContendedLock 100 avgt 12 4.688 ± 0.051 ns/op
>> LockUnlock.testRecursiveLockUnlock 100 avgt 12 12800.544 ± 92.265 ns/op
>> LockUnlock.testRecursiveSynchronization 100 avgt 12 26.486 ± 2.229 ns/op
>> LockUnlock.testSerialLockUnlock 100 avgt 12 424.499 ± 0.416 ns/op
>> LockUnlock.te...
>
> src/hotspot/cpu/s390/macroAssembler_s390.cpp line 6015:
>
>> 6013: // we will encounter a loop while handling the inflated monitor case
>> 6014: // so, we need to make sure, when we reach there only top one object is removed.
>> 6015: // if we load top there then it could result into infinite loop, So preserving top is a Must here;
>
> I assume this refers to the assert/debug code that checks that the lock stack does not contain the object when doing inflated unlocking. The debug code could just unconditionally reload top from the thread.
probably we can do something like this:
diff --git a/src/hotspot/cpu/s390/macroAssembler_s390.cpp b/src/hotspot/cpu/s390/macroAssembler_s390.cpp
index 5a77e0d49f2..cfde21c84f0 100644
--- a/src/hotspot/cpu/s390/macroAssembler_s390.cpp
+++ b/src/hotspot/cpu/s390/macroAssembler_s390.cpp
@@ -5977,7 +5977,7 @@ void MacroAssembler::compiler_fast_unlock_lightweight_object(Register obj, Regis
assert_different_registers(obj, tmp1, tmp2);
// Handle inflated monitor.
- NearLabel inflated, inflated_load_monitor;
+ NearLabel inflated, inflated_load_monitor, inflated_intermediate ;
// Finish fast unlock successfully. MUST reach to with flag == EQ.
NearLabel unlocked;
// Finish fast unlock unsuccessfully. MUST branch to with flag == NE.
@@ -6021,7 +6021,7 @@ void MacroAssembler::compiler_fast_unlock_lightweight_object(Register obj, Regis
// Check for monitor (0b10).
z_lg(mark, Address(obj, oopDesc::mark_offset_in_bytes()));
z_tmll(mark, markWord::monitor_value);
- z_brnaz(inflated);
+ z_brnaz(inflated_intermediate);
#ifdef ASSERT
// Check header not unlocked (0b01).
@@ -6063,6 +6063,8 @@ void MacroAssembler::compiler_fast_unlock_lightweight_object(Register obj, Regis
stop("Fast Unlock not monitor");
#endif // ASSERT
+ bind(inflated_intermediate);
+ z_lgf(top, Address(Z_thread, JavaThread::lock_stack_top_offset()));
bind(inflated);
#ifdef ASSERT
But instead I kept the code a bit similar to other architectures and for future just added the comment as a warning to be careful which tweaking the code. I guess except this comment everything is fine ?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18878#discussion_r1608279960
More information about the hotspot-dev
mailing list