RFR: 8210832: Remove sneaky locking in class Monitor

Fri Feb 1 18:01:03 UTC 2019

Hi Dan,

On 2/1/19 11:10 AM, Daniel D. Daugherty wrote:
> > Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v04/inc/
>
> src/hotspot/share/logging/logTag.hpp
>     No comments.
>
> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>     L295: // - When transitioning in (constructor), it checks for 
> safepoints without having to block, i.e calls
>         Perhaps:
>           // - When transitioning in (constructor), it checks for 
> safepoints without blocking, i.e., calls
>
>     L298: //   the monitor that is in the process of being acquired 
> (only setting the owner is missing).
>         Perhaps:
>           //   the monitor that is only partially acquired.
Done.
> src/hotspot/share/runtime/mutex.cpp
>     L59: log_trace(vmmonitor)("JavaThread " INTPTR_FORMAT " on %d 
> attempt trying to acquire native monitor %s", p2i(self), retry_cnt, 
> _name);
>         nit - s/native monitor/vmmonitor/  - for consistency.
Done.

> src/hotspot/share/runtime/mutex.hpp
>     No comments.
>
> src/hotspot/share/runtime/safepointMechanism.inline.hpp
>     L83:   if(!uses_thread_local_poll() || local_poll_armed(thread)) {
>     L84:     if (global_poll()) {
>         I'm having trouble with this construct:
>
>         - if uses_thread_local_poll() is false, then we check 
> global_poll()
>           and call SafepointSynchronize::block() if global_poll() == 
> true.
>           That makes sense since we're not using local polling.
>
>         - if uses_thread_local_poll() is true, then we check
>           if local_poll_armed(thread) == true and then we check 
> global_poll()
>           and call SafepointSynchronize::block() if global_poll() == 
> true.
>
>           Huh?
>
>         So we only call SafepointSynchronize::block() if global_poll() 
> == true
>         no matter what. It seems like the checks for local polling 
> don't really
>         do anything here.
>
>         What am I missing?
A couple of observations before going straight to the question. The 
check for local poll in SafepointMechanism::should_block() (TBIVMWDP 
destructor) is needed because we need to check for both, pending 
safepoints or pending handshakes. In other words once we transition back 
from the _thread_blocked state, there could be either a safepoint or a 
handshake for that thread being processed. If 
SafepointMechanism::should_block() returns true, then 
SafepointMechanism::callback_if_safepoint() only checks for pending 
safepoints. After that I do the check "if (_thread->has_handshake())" 
for pending handshakes. I couldn't use 
SafepointMechanism::block_if_requested() which checks for both like in 
the normal TBIVM, because of two reasons:
- On one hand, I need to pass the "false" parameter to 
SafepointSynchronize::block() to avoid blocking when transitioning in, 
and avoid failing the test that we have a ThreadSafepointState of 
_call_back when transitioing back. Without introducing 
SafepointMechanism::callback_if_safepoint() I would have to modify a 
couple of methods in SafepointMechanism to receive that extra argument. 
I think this was cleaner.
-But even if I did add the extra argument to those functions and use 
SafepointMechanism::block_if_requested(), I would need also to avoid 
checking for handshakes in the TBIVMWDP constructor. The processing of 
handshakes uses its own ThreadInVMForHandshake jacket which internally 
calls SafepointMechanism::block_if_requested(). Since in the constructor 
we do not block for safepoints, executing again 
SafepointMechanism::block_if_requested() in a handshake would execute 
SS::block twice.
So to solve that I just introduced 
SafepointMechanism::callback_if_safepoint() which only checks for 
safepoints and uses the extra argument in SS::block.

Now, as to why I can't just check for global polls in 
SafepointMechanism::callback_if_safepoint() (since I only want to 
check/stop for safepoints), this has to do with the race found by 
Robbin. Since the thread local poll is armed after changing _state to 
_synchronizing, if I only do a global poll, 
SafepointMechanism::callback_if_safepoint(in the TBIVMWDP constructor) 
could detect there is a safepoint in progress and callback but when 
coming back in the destructor SafepointMechanism::should_block() could 
miss detecting the safepoint in progress since that method checks for 
local polls first (it cares about both safepoints and handshakes). So I 
need to match the order of checks done by 
SafepointMechanism::should_block().

Any of this makes sense?

Thanks,
Patricio

> Dan
>
> On 2/1/19 12:05 AM, Patricio Chilano wrote:
>> Hi David,
>>
>> On 1/31/19 12:54 AM, David Holmes wrote:
>>> On 31/01/2019 7:37 am, Patricio Chilano wrote:
>>>> Hi David,
>>>>
>>>> On 1/30/19 2:29 AM, David Holmes wrote:
>>>>> Hi Patricio,
>>>>>
>>>>> <trimming>
>>>>>
>>>>> First, thanks for all the many weeks of work you've put into this, 
>>>>> pulling together a number of ideas from different people to make 
>>>>> it all work!
>>>> Thanks! Credit to you for the PlatformMonitor implementation : )
>>>
>>> :) Nothing innovative there.
>>>
>>>>> I've only got a few minor comments/suggestions.
>>>>>
>>>>> On 30/01/2019 10:24 am, Patricio Chilano wrote:
>>>>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/webrev/
>>>>>> Inc: 
>>>>>> http://cr.openjdk.java.net/~pchilanomate/8210832/v03/inc/webrev/
>>>>>>
>>>>>
>>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>>>>
>>>>> I'm very unclear how ThreadLockBlockInVM differs from 
>>>>> ThreadBlockInVM. You've duplicated a lot of complex code which is 
>>>>> masking the actual difference between the two wrappers to me. It 
>>>>> seems to me that an extra arg to transition_and_fence should allow 
>>>>> you to handle the new behaviour without having to duplicate so 
>>>>> much of this code. In any case the semantics of 
>>>>> ThreadLockBlockInVM needs to be described.
>>>> I could do it with one extra argument, but I would need to add two 
>>>> extra branches in transition_and_fence(), one to decide if I'm in 
>>>> the Monitor case to avoid calling 
>>>> SafepointMechanism::block_if_requested() directly and another one 
>>>> to actually decide if I'm transitioning in or out, since the 
>>>> actions to perform are different. I think it is easier to read 
>>>> without adding new conditionals, and also we will save those extra 
>>>> branches, but if you think it's better this way I can change it.
>>>
>>> I would like something that tells me more clearly how this new 
>>> transition helper differs from the existing TBIVM. Sharing the code 
>>> between them and using different args would be one way. Documenting 
>>> the difference in comments would be another. Your choice.
>> Ok, I added a description on top of TLBIVM.
>>
>>>>> Also I'm unclear what the "Lock" in ThreadLockBlockInVM actually 
>>>>> refers to. I find the name quite jarring to read.
>>>> What about changing it to ThreadBlockinMonitor?
>>>
>>> That's not quite conveying the semantics. The problem is that the 
>>> semantics we are changing compared to TBIVM are not evident in the 
>>> TBIVM name. If TBIVM was actually 
>>> ThreadBlockInVMWithSafepointBlocking, then this new transition would 
>>> obviously be ThreadBlockInVMWithoutSafepointBlocking - but perhaps 
>>> that lengthy, but clear name would be okay anyway?
>> Not convinced on that name since we are blocking at safepoints in the 
>> destructor. Based on the comments I added how about 
>> ThreadBlockinVMWithDeadlockPrevention ? or 
>> ThreadBlockinVMWithPreemption? (as in eliminate one of the conditions 
>> for deadlocks).
>>
>>>>> On the subject of naming, do_preempt and preempt_by_safepoint 
>>>>> don't really convey to me what happens - what is being "preempted" 
>>>>> here? I would suggest a more direct Monitor::release_for_safepoint
>>>> Changed.
>>>>
>>>>> ---
>>>>>
>>>>> Logging: why "nativemonitor"? The logging in mutex.cpp doesn't 
>>>>> relate to a "native" monitor?? Actually I'm not even sure if we 
>>>>> need bother at all with the one logging statement that is present.
>>>> I added it to eventually track unbounded try locks. Not sure I 
>>>> follow you with the name, isn't that how we name this monitors? I 
>>>> tried to differentiate them from Java monitors. What about just 
>>>> "monitor"?
>>>
>>> How about vmmonitor ?
>> Ok, changed.
>>
>>>>> ---
>>>>>
>>>>> src/hotspot/share/runtime/mutex.cpp
>>>>>
>>>>> void Monitor::lock_without_safepoint_check(Thread * self) {
>>>>>   // Ensure that the Monitor does not require or allow safepoint 
>>>>> checks.
>>>>>
>>>>> The comment there should only say "not require".
>>>> Done.
>>>>
>>>>> void Monitor::preempt_by_safepoint() {
>>>>>   _lock.unlock();
>>>>> }
>>>>>
>>>>> Apart from renaming this as suggested above, aren't there any 
>>>>> suitable assertions we should have here? safepoint-in-progress or 
>>>>> handshake-in-progress? _owner == Thread::current?
>>>> Ok, I added an assertion that owner should be NULL. Asserting 
>>>> safepoint-in-progress does not really work because _state could 
>>>> change to _not_synchronized right after you checked for it in TLBIVM.
>>>
>>> Okay.
>>>
>>>>> Nit:
>>>>>
>>>>> assert(_owner == Thread::current(), "should be equal: owner=" 
>>>>> INTPTR_FORMAT
>>>>>                    ", self=" INTPTR_FORMAT, p2i(_owner), 
>>>>> p2i(Thread::current()));
>>>>>
>>>>> with Dan's enhanced assertions there's an indentation issue. The 
>>>>> second line should indent to the first comma, but that will make 
>>>>> the second line extend way past 80 columns.
>>>>>
>>>>> Also you could factor that assertion for _owner==Thread::current() 
>>>>> into its own function or macro to avoid the repetition.
>>>> Corrected indentation based on Dan's reply to align with _owner.
>>>
>>> I though it should indent to the comma because it is a continuation 
>>> of the same argument being passed to the assert "function". But I'm 
>>> okay with Dan's suggestion.
>>>
>>> Factoring it into its own little function or macro would still be 
>>> good to avoid the repetition.
>> Ok, added new function assert_owned_by_self(). I could change it to 
>> assert_owner(Thread*) and use it for the NULL case too unifying the 
>> printed messages to something like "invalid owner: owner=" 
>> INTPTR_FORMAT ", should be:" INTPTR_FORMAT. What do you think?
>>
>>>>>  OSThreadWaitState osts(self->osthread(), false /* not 
>>>>> Object.wait() */);
>>>>>
>>>>> This needs to be returned to its original place as per Dan's 
>>>>> comments.
>>>> Done.
>>>>
>>>>>     } else {
>>>>>       Monitor::lock(self);
>>>>>     }
>>>>>
>>>>> You don't need Monitor:: here
>>>> Removed.
>>>>
>>>>> // Temporary JVM_RawMonitor* support. A raw monitor can just be a 
>>>>> PlatformMonitor now.
>>>>>
>>>>> This needs to be resolved before committing. Some of the existing 
>>>>> commentary on what raw monitors are needs to be retained. Not 
>>>>> clear if we need to set the _owner field or can just skip it.
>>>> Is it okay if I keep the following comments?
>>>>
>>>> // Yet another degenerate version of Monitor::lock() or 
>>>> lock_without_safepoint_check()
>>>> // jvm_raw_lock() and _unlock() can be called by non-Java threads 
>>>> via JVM_RawMonitorEnter.
>>>> //
>>>> // There's no expectation that JVM_RawMonitors will interoperate 
>>>> properly with the native
>>>> // Mutex-Monitor constructs.  We happen to implement 
>>>> JVM_RawMonitors in terms of
>>>> // native Mutex-Monitors simply as a matter of convenience.
>>>
>>> Yep that's perfect. And as a future RFE we can replace them with 
>>> direct use of PlatformMonitor (or PlatformMutex).
>>>
>>>>
>>>> I could keep setting the owner as _owner = 
>>>> Thread::current_or_null() in jvm_raw_lock(), at least it wouldn't 
>>>> hurt.
>>>
>>> It's useful for checking usage errors, but we won't have that if we 
>>> replace with PlatformMonitor, so may as well drop it now IMO.
>> Ok, I added asserts that _owner should be NULL after acquiring it and 
>> before releasing it though.
>>
>>>>> Monitor::~Monitor() {
>>>>>   assert(_owner == NULL, "should be NULL: owner=" INTPTR_FORMAT, 
>>>>> p2i(_owner));
>>>>> }
>>>>>
>>>>> Will this automatically result in the PlatformMonitor destructor 
>>>>> being called?
>>>> Yes, should I add a comment to make it clear that 
>>>> ~PlatformMonitor() will be executed?
>>>
>>> No need - assume other people have a better understanding of C++ 
>>> than I do :)
>>
>> Below is version v04, which also contains a correction pointed out 
>> off-list by Robbin to do a local poll first in 
>> SafepointMechanism::callback_if_safepoint() if we are using local 
>> polls. Since the thread local poll is armed after changing _state to 
>> _synchronizing, if we only do a global poll in the TLBIVM constructor 
>> we could detect there is a safepoint in progress and callback but 
>> when coming back in the destructor SafepointMechanism::should_block() 
>> could miss detecting the safepoint in progress since that method 
>> checks first for local polls.
>>
>>
>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v04/
>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v04/inc/
>>
>>
>> Thanks,
>> Patricio
>>
>>> Thanks,
>>> David
>>>
>>>>
>>>>
>>>> Thanks for looking into this! Waiting on your comments to send v04.
>>>>
>>>> Thanks,
>>>> Patricio
>>>>> ---
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>
>>
>