RFR: 8319799: Recursive lightweight locking: x86 implementation [v9]

Coleen Phillimore coleenp at openjdk.org
Tue Jan 23 13:56:30 UTC 2024


On Mon, 22 Jan 2024 15:56:45 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> I ran through tier1-tier7 with the even stronger invariant that the owner is the thread. 
>> The current comment (in the code) is outdated. 
>> 
>> I plan to push a fixed version, just wanted to figure out what makes the most sense. 
>> Given that the monitor check is elided now, the fixing the owner field should be removed. 
>> Alternatively let the slow path check for monitor after the CAS failed, and jump back to the
>> inflated case. In this case the comment and the fixing of the owner field would be important. 
>> 
>> It is not immediately obvious to me that the alternative is worth it. Because we removed hashcode causing inflation the only point where this code would avoid going into the runtime by checking for inflated after a CAS fail is if another thread is in entering on the object monitor in parallel and we can do a direct handoff. Otherwise it would be on the entry / cxq queue and going into the runtime is required. 
>> The pragmatic solution would be to remove the store and add the assert for this PR. And create an RFE to evaluate this. 
>> 
>> Ran through with this code.
>> ```c++
>> #ifdef ASSERT
>>     Label correct_owner;
>>     __ movptr(_t, Address(monitor, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner)));
>>     __ cmpptr(_t, _thread);
>>     __ jccb(Assembler::equal, correct_owner);
>>     __ testptr(_t, Address(_t)); // Crash on bad address
>>     __ stop("Bad owner");
>>     __ bind (correct_owner);
>> #endif
>> 
>> 
>> I will rebase and push the changes tomorrow.
>
> I like how with your patch all the knowledge about the anonymous owner is in synchronizer.cpp, which is why this comment and code would be nicer to be absent.

Ok now I see.  We've popped off the only entry of the lock stack (not recursive), but if some other thread has inflated the monitor due to contention, we'll fail this CAS.


    // Try to unlock. Transition lock bits 0b00 => 0b01
    movptr(reg_rax, mark);
    andptr(reg_rax, ~(int32_t)markWord::lock_mask);
    orptr(mark, markWord::unlocked_value);
    lock(); cmpxchgptr(mark, Address(obj, oopDesc::mark_offset_in_bytes()));
    jcc(Assembler::notEqual, push_and_slow_path);
    jmp(unlocked);


The alternative you're suggesting is to instead of pushing back on the lock-stack, go to check_successor (which will be non-null due to the contention) and the handoff will be faster that way.  Then we have to restore the owner thread. Is the lock definitely never inflated due to hash code in this case now?

Let's explain this in another RFE and we can see if benchmark results support this optimization.  Pushing back on the lock stack and going slow path is simpler (less assembly code) and matches the interpreter/c1 case. But let's see after this is checked in. Your assert with some adjustment to the comment looks good.  I was trying to remove the concept of anonymous owner from the assembly code because of another change I was working on.  Thanks.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16607#discussion_r1463316791


More information about the hotspot-dev mailing list