RFR: 8210832: Remove sneaky locking in class Monitor

Fri Feb 1 22:55:36 UTC 2019

On 2/1/19 3:20 PM, Patricio Chilano wrote:
> Hi Dan,
>
>
> On 2/1/19 2:22 PM, Daniel D. Daugherty wrote:
>> On 2/1/19 1:01 PM, Patricio Chilano wrote:
>>> Hi Dan,
>>>
>>> On 2/1/19 11:10 AM, Daniel D. Daugherty wrote:
>>>> > Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v04/inc/
>>>>
>>>> src/hotspot/share/logging/logTag.hpp
>>>>     No comments.
>>>>
>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>>>     L295: // - When transitioning in (constructor), it checks for 
>>>> safepoints without having to block, i.e calls
>>>>         Perhaps:
>>>>           // - When transitioning in (constructor), it checks for 
>>>> safepoints without blocking, i.e., calls
>>>>
>>>>     L298: //   the monitor that is in the process of being acquired 
>>>> (only setting the owner is missing).
>>>>         Perhaps:
>>>>           //   the monitor that is only partially acquired.
>>> Done.
>>>> src/hotspot/share/runtime/mutex.cpp
>>>>     L59: log_trace(vmmonitor)("JavaThread " INTPTR_FORMAT " on %d 
>>>> attempt trying to acquire native monitor %s", p2i(self), retry_cnt, 
>>>> _name);
>>>>         nit - s/native monitor/vmmonitor/  - for consistency.
>>> Done.
>>>
>>>> src/hotspot/share/runtime/mutex.hpp
>>>>     No comments.
>>>>
>>>> src/hotspot/share/runtime/safepointMechanism.inline.hpp
>>>>     L83:   if(!uses_thread_local_poll() || local_poll_armed(thread)) {
>>>>     L84:     if (global_poll()) {
>>>>         I'm having trouble with this construct:
>>>>
>>>>         - if uses_thread_local_poll() is false, then we check 
>>>> global_poll()
>>>>           and call SafepointSynchronize::block() if global_poll() 
>>>> == true.
>>>>           That makes sense since we're not using local polling.
>>>>
>>>>         - if uses_thread_local_poll() is true, then we check
>>>>           if local_poll_armed(thread) == true and then we check 
>>>> global_poll()
>>>>           and call SafepointSynchronize::block() if global_poll() 
>>>> == true.
>>>>
>>>>           Huh?
>>>>
>>>>         So we only call SafepointSynchronize::block() if 
>>>> global_poll() == true
>>>>         no matter what. It seems like the checks for local polling 
>>>> don't really
>>>>         do anything here.
>>>>
>>>>         What am I missing?
>>> A couple of observations before going straight to the question. The 
>>> check for local poll in SafepointMechanism::should_block() (TBIVMWDP 
>>> destructor) is needed because we need to check for both, pending 
>>> safepoints or pending handshakes. In other words once we transition 
>>> back from the _thread_blocked state, there could be either a 
>>> safepoint or a handshake for that thread being processed. If 
>>> SafepointMechanism::should_block() returns true, then 
>>> SafepointMechanism::callback_if_safepoint() only checks for pending 
>>> safepoints. After that I do the check "if 
>>> (_thread->has_handshake())" for pending handshakes. I couldn't use 
>>> SafepointMechanism::block_if_requested() which checks for both like 
>>> in the normal TBIVM, because of two reasons:
>>> - On one hand, I need to pass the "false" parameter to 
>>> SafepointSynchronize::block() to avoid blocking when transitioning 
>>> in, and avoid failing the test that we have a ThreadSafepointState 
>>> of _call_back when transitioing back. Without introducing 
>>> SafepointMechanism::callback_if_safepoint() I would have to modify a 
>>> couple of methods in SafepointMechanism to receive that extra 
>>> argument. I think this was cleaner.
>>> -But even if I did add the extra argument to those functions and use 
>>> SafepointMechanism::block_if_requested(), I would need also to avoid 
>>> checking for handshakes in the TBIVMWDP constructor. The processing 
>>> of handshakes uses its own ThreadInVMForHandshake jacket which 
>>> internally calls SafepointMechanism::block_if_requested(). Since in 
>>> the constructor we do not block for safepoints, executing again 
>>> SafepointMechanism::block_if_requested() in a handshake would 
>>> execute SS::block twice.
>>> So to solve that I just introduced 
>>> SafepointMechanism::callback_if_safepoint() which only checks for 
>>> safepoints and uses the extra argument in SS::block.
>>>
>>> Now, as to why I can't just check for global polls in 
>>> SafepointMechanism::callback_if_safepoint() (since I only want to 
>>> check/stop for safepoints), this has to do with the race found by 
>>> Robbin. Since the thread local poll is armed after changing _state 
>>> to _synchronizing, if I only do a global poll, 
>>> SafepointMechanism::callback_if_safepoint(in the TBIVMWDP 
>>> constructor) could detect there is a safepoint in progress and 
>>> callback but when coming back in the destructor 
>>> SafepointMechanism::should_block() could miss detecting the 
>>> safepoint in progress since that method checks for local polls first 
>>> (it cares about both safepoints and handshakes). So I need to match 
>>> the order of checks done by SafepointMechanism::should_block().
>>>
>>> Any of this makes sense?
>>
>> Not really, but now I have a better way to ask about this.
>>
>> Here's should_block():
>>
>>   50 bool SafepointMechanism::should_block(Thread* thread) {
>>   51   if (uses_thread_local_poll()) {
>>   52     return local_poll(thread);
>>   53   } else {
>>   54     return global_poll();
>>   55   }
>>   56 }
>>
>> If uses_thread_local_poll() == true, then we return the result
>> of local_poll(thread) and skip calling global_poll().
>>
>> Here's callback_if_safepoint():
>>
>>   82 void SafepointMechanism::callback_if_safepoint(JavaThread* 
>> thread) {
>>   83   if(!uses_thread_local_poll() || local_poll_armed(thread)) {
>>   84     if (global_poll()) {
>>   85       SafepointSynchronize::block(thread, false);
>>   86     }
>>   87   }
>>   88 }
>>
>> If uses_thread_local_poll() == false and global_poll() == true,
>> then we call block().
>> If uses_thread_local_poll() == true and local_poll_armed() == true
>> and global_poll() == true, then we call block().
>>
>> Since we always check global_poll() == true before calling block(),
>> then preceding expressions are moot. They do not affect whether we
>> call block() at all.
> It's true that those checks for local poll are not actually needed if 
> you get to execute SafepointMechanism::callback_if_safepoint() in the 
> TBIVMWDP destructor. At that point I could just check for 
> global_poll(). But because I'm previously executing 
> SafepointMechanism::should_block() to first decide whether or not I 
> need to block for a safepoint or handshake, that forces me to add the 
> local polls checks when executing 
> SafepointMechanism::callback_if_safepoint() in the TBIVMWDP 
> constructor. Otherwise let's say I removed them:
>
> 1- VMThread in SafepointSynchronize::begin() executes  "_state = 
> _synchronizing;" and gets context switched by the OS.
> 2 - JavaThread executes SafepointMechanism::callback_if_safepoint() in 
> TBIVMWDP constructor, checks global_poll() which returns true, and 
> callsback for safepoint without blocking.
> 3- JavaThread goes into the _thread_blocked state and blocks, let's 
> say in "_lock.lock();" in Monitor::lock().
> 4- JavaThread acquires the lock and transitions back. In the TBIVMWDP 
> destructor calls SafepointMechanism::should_block(). If we are using 
> local polls, should_block() will check local_poll_armed(). That will 
> return false since the VMThread didn't armed the local poll yet.
> 5- JavaThread continues to execute in VM.
> 6- VMThread is scheduled again, and arms local polls. At this point 
> you can have serveral problems, either the JavaThread could try to 
> callback again(i.e twice) at some point later, or not and the 
> safepoint will start while that JavaThread is executing in the VM.

Making sure I understand why you think you need L83:

At step #2, if you put this check back:

   L83   if(!uses_thread_local_poll() || local_poll_armed(thread)) {

then the change in behavior is that you don't call

   L84     if (global_poll()) {
   L85       SafepointSynchronize::block(thread, false);

because the VMThread hasn't armed local polls yet right? So L83
keeps the target thread from calling block() too early? And then in
step 4, in should_block(), uses_thread_local_poll() will return
false because the local poll still isn't armed yet and the target
will call global_poll() which will return true because of _state
== _synchronizing. The caller of should_block() in the TBIVMWDP
destructor realizes that we need to block for a safepoint so it
will release the lock that the VMThread _might_ want and block()
for the safepoint.

Okay, I think I have understood what you're arguing.

Here's a minor correction that I see in the above:

> 4- JavaThread acquires the lock and transitions back. In the TBIVMWDP 
> destructor calls SafepointMechanism::should_block(). If we are using 
> local polls, should_block() will check local_poll_armed(). That will 
> return false since the VMThread didn't armed the local poll yet. 

In Step 4, you're missing that should_block() will call
global_poll() and global_poll() is:

   37 bool SafepointMechanism::global_poll() {
   38   return SafepointSynchronize::do_call_back();
   39 }

which calls:

  141   inline static bool do_call_back() {
  142     return (_state != _not_synchronized);
  143   }

and the _state == _synchronizing still so should_block()
will return true. The TBIVMWDP destructor realizes that
we need to block for a safepoint so it will release the
lock that the VMThread _might_ want and block() for the
safepoint. So we do get the second block() call for the
same safepoint from the save thread.

So for this function, consider adding this comment:

   L82: void SafepointMechanism::callback_if_safepoint(JavaThread* thread) {
   L83:   if (!uses_thread_local_poll() || local_poll_armed(thread)) {
            // If using thread local polls, then we should not check the
            // global_poll() and callback via block() yet if the VMThread
            // has not armed the local poll yet. Otherwise, we'll callback
            // twice for the same thread.
   L84:     if (global_poll()) {
   L85:       SafepointSynchronize::block(thread, false);
   L86:     }
   L87:   }
   L88: }

Maybe I'm the only one who isn't grok'ing this code. Dunno. The above
comment clears it up for me, but others may say "That's obvious!"

Dan

>
> I could add another argument in 
> SafepointMechanism::callback_if_safepoint() to skip checking local 
> polls in the destructor if that's what you mean.
>
>
> Patricio
>
>> Dan
>>
>>
>>>
>>>
>>> Thanks,
>>> Patricio
>>>
>>>> Dan
>>>>
>>>> On 2/1/19 12:05 AM, Patricio Chilano wrote:
>>>>> Hi David,
>>>>>
>>>>> On 1/31/19 12:54 AM, David Holmes wrote:
>>>>>> On 31/01/2019 7:37 am, Patricio Chilano wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> On 1/30/19 2:29 AM, David Holmes wrote:
>>>>>>>> Hi Patricio,
>>>>>>>>
>>>>>>>> <trimming>
>>>>>>>>
>>>>>>>> First, thanks for all the many weeks of work you've put into 
>>>>>>>> this, pulling together a number of ideas from different people 
>>>>>>>> to make it all work!
>>>>>>> Thanks! Credit to you for the PlatformMonitor implementation : )
>>>>>>
>>>>>> :) Nothing innovative there.
>>>>>>
>>>>>>>> I've only got a few minor comments/suggestions.
>>>>>>>>
>>>>>>>> On 30/01/2019 10:24 am, Patricio Chilano wrote:
>>>>>>>>> Full: 
>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8210832/v03/webrev/
>>>>>>>>> Inc: 
>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8210832/v03/inc/webrev/
>>>>>>>>>
>>>>>>>>
>>>>>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>>>>>>>
>>>>>>>> I'm very unclear how ThreadLockBlockInVM differs from 
>>>>>>>> ThreadBlockInVM. You've duplicated a lot of complex code which 
>>>>>>>> is masking the actual difference between the two wrappers to 
>>>>>>>> me. It seems to me that an extra arg to transition_and_fence 
>>>>>>>> should allow you to handle the new behaviour without having to 
>>>>>>>> duplicate so much of this code. In any case the semantics of 
>>>>>>>> ThreadLockBlockInVM needs to be described.
>>>>>>> I could do it with one extra argument, but I would need to add 
>>>>>>> two extra branches in transition_and_fence(), one to decide if 
>>>>>>> I'm in the Monitor case to avoid calling 
>>>>>>> SafepointMechanism::block_if_requested() directly and another 
>>>>>>> one to actually decide if I'm transitioning in or out, since the 
>>>>>>> actions to perform are different. I think it is easier to read 
>>>>>>> without adding new conditionals, and also we will save those 
>>>>>>> extra branches, but if you think it's better this way I can 
>>>>>>> change it.
>>>>>>
>>>>>> I would like something that tells me more clearly how this new 
>>>>>> transition helper differs from the existing TBIVM. Sharing the 
>>>>>> code between them and using different args would be one way. 
>>>>>> Documenting the difference in comments would be another. Your 
>>>>>> choice.
>>>>> Ok, I added a description on top of TLBIVM.
>>>>>
>>>>>>>> Also I'm unclear what the "Lock" in ThreadLockBlockInVM 
>>>>>>>> actually refers to. I find the name quite jarring to read.
>>>>>>> What about changing it to ThreadBlockinMonitor?
>>>>>>
>>>>>> That's not quite conveying the semantics. The problem is that the 
>>>>>> semantics we are changing compared to TBIVM are not evident in 
>>>>>> the TBIVM name. If TBIVM was actually 
>>>>>> ThreadBlockInVMWithSafepointBlocking, then this new transition 
>>>>>> would obviously be ThreadBlockInVMWithoutSafepointBlocking - but 
>>>>>> perhaps that lengthy, but clear name would be okay anyway?
>>>>> Not convinced on that name since we are blocking at safepoints in 
>>>>> the destructor. Based on the comments I added how about 
>>>>> ThreadBlockinVMWithDeadlockPrevention ? or 
>>>>> ThreadBlockinVMWithPreemption? (as in eliminate one of the 
>>>>> conditions for deadlocks).
>>>>>
>>>>>>>> On the subject of naming, do_preempt and preempt_by_safepoint 
>>>>>>>> don't really convey to me what happens - what is being 
>>>>>>>> "preempted" here? I would suggest a more direct 
>>>>>>>> Monitor::release_for_safepoint
>>>>>>> Changed.
>>>>>>>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> Logging: why "nativemonitor"? The logging in mutex.cpp doesn't 
>>>>>>>> relate to a "native" monitor?? Actually I'm not even sure if we 
>>>>>>>> need bother at all with the one logging statement that is present.
>>>>>>> I added it to eventually track unbounded try locks. Not sure I 
>>>>>>> follow you with the name, isn't that how we name this monitors? 
>>>>>>> I tried to differentiate them from Java monitors. What about 
>>>>>>> just "monitor"?
>>>>>>
>>>>>> How about vmmonitor ?
>>>>> Ok, changed.
>>>>>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> src/hotspot/share/runtime/mutex.cpp
>>>>>>>>
>>>>>>>> void Monitor::lock_without_safepoint_check(Thread * self) {
>>>>>>>>   // Ensure that the Monitor does not require or allow 
>>>>>>>> safepoint checks.
>>>>>>>>
>>>>>>>> The comment there should only say "not require".
>>>>>>> Done.
>>>>>>>
>>>>>>>> void Monitor::preempt_by_safepoint() {
>>>>>>>>   _lock.unlock();
>>>>>>>> }
>>>>>>>>
>>>>>>>> Apart from renaming this as suggested above, aren't there any 
>>>>>>>> suitable assertions we should have here? safepoint-in-progress 
>>>>>>>> or handshake-in-progress? _owner == Thread::current?
>>>>>>> Ok, I added an assertion that owner should be NULL. Asserting 
>>>>>>> safepoint-in-progress does not really work because _state could 
>>>>>>> change to _not_synchronized right after you checked for it in 
>>>>>>> TLBIVM.
>>>>>>
>>>>>> Okay.
>>>>>>
>>>>>>>> Nit:
>>>>>>>>
>>>>>>>> assert(_owner == Thread::current(), "should be equal: owner=" 
>>>>>>>> INTPTR_FORMAT
>>>>>>>>                    ", self=" INTPTR_FORMAT, p2i(_owner), 
>>>>>>>> p2i(Thread::current()));
>>>>>>>>
>>>>>>>> with Dan's enhanced assertions there's an indentation issue. 
>>>>>>>> The second line should indent to the first comma, but that will 
>>>>>>>> make the second line extend way past 80 columns.
>>>>>>>>
>>>>>>>> Also you could factor that assertion for 
>>>>>>>> _owner==Thread::current() into its own function or macro to 
>>>>>>>> avoid the repetition.
>>>>>>> Corrected indentation based on Dan's reply to align with _owner.
>>>>>>
>>>>>> I though it should indent to the comma because it is a 
>>>>>> continuation of the same argument being passed to the assert 
>>>>>> "function". But I'm okay with Dan's suggestion.
>>>>>>
>>>>>> Factoring it into its own little function or macro would still be 
>>>>>> good to avoid the repetition.
>>>>> Ok, added new function assert_owned_by_self(). I could change it 
>>>>> to assert_owner(Thread*) and use it for the NULL case too unifying 
>>>>> the printed messages to something like "invalid owner: owner=" 
>>>>> INTPTR_FORMAT ", should be:" INTPTR_FORMAT. What do you think?
>>>>>
>>>>>>>>  OSThreadWaitState osts(self->osthread(), false /* not 
>>>>>>>> Object.wait() */);
>>>>>>>>
>>>>>>>> This needs to be returned to its original place as per Dan's 
>>>>>>>> comments.
>>>>>>> Done.
>>>>>>>
>>>>>>>>     } else {
>>>>>>>>       Monitor::lock(self);
>>>>>>>>     }
>>>>>>>>
>>>>>>>> You don't need Monitor:: here
>>>>>>> Removed.
>>>>>>>
>>>>>>>> // Temporary JVM_RawMonitor* support. A raw monitor can just be 
>>>>>>>> a PlatformMonitor now.
>>>>>>>>
>>>>>>>> This needs to be resolved before committing. Some of the 
>>>>>>>> existing commentary on what raw monitors are needs to be 
>>>>>>>> retained. Not clear if we need to set the _owner field or can 
>>>>>>>> just skip it.
>>>>>>> Is it okay if I keep the following comments?
>>>>>>>
>>>>>>> // Yet another degenerate version of Monitor::lock() or 
>>>>>>> lock_without_safepoint_check()
>>>>>>> // jvm_raw_lock() and _unlock() can be called by non-Java 
>>>>>>> threads via JVM_RawMonitorEnter.
>>>>>>> //
>>>>>>> // There's no expectation that JVM_RawMonitors will interoperate 
>>>>>>> properly with the native
>>>>>>> // Mutex-Monitor constructs.  We happen to implement 
>>>>>>> JVM_RawMonitors in terms of
>>>>>>> // native Mutex-Monitors simply as a matter of convenience.
>>>>>>
>>>>>> Yep that's perfect. And as a future RFE we can replace them with 
>>>>>> direct use of PlatformMonitor (or PlatformMutex).
>>>>>>
>>>>>>>
>>>>>>> I could keep setting the owner as _owner = 
>>>>>>> Thread::current_or_null() in jvm_raw_lock(), at least it 
>>>>>>> wouldn't hurt.
>>>>>>
>>>>>> It's useful for checking usage errors, but we won't have that if 
>>>>>> we replace with PlatformMonitor, so may as well drop it now IMO.
>>>>> Ok, I added asserts that _owner should be NULL after acquiring it 
>>>>> and before releasing it though.
>>>>>
>>>>>>>> Monitor::~Monitor() {
>>>>>>>>   assert(_owner == NULL, "should be NULL: owner=" 
>>>>>>>> INTPTR_FORMAT, p2i(_owner));
>>>>>>>> }
>>>>>>>>
>>>>>>>> Will this automatically result in the PlatformMonitor 
>>>>>>>> destructor being called?
>>>>>>> Yes, should I add a comment to make it clear that 
>>>>>>> ~PlatformMonitor() will be executed?
>>>>>>
>>>>>> No need - assume other people have a better understanding of C++ 
>>>>>> than I do :)
>>>>>
>>>>> Below is version v04, which also contains a correction pointed out 
>>>>> off-list by Robbin to do a local poll first in 
>>>>> SafepointMechanism::callback_if_safepoint() if we are using local 
>>>>> polls. Since the thread local poll is armed after changing _state 
>>>>> to _synchronizing, if we only do a global poll in the TLBIVM 
>>>>> constructor we could detect there is a safepoint in progress and 
>>>>> callback but when coming back in the destructor 
>>>>> SafepointMechanism::should_block() could miss detecting the 
>>>>> safepoint in progress since that method checks first for local polls.
>>>>>
>>>>>
>>>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v04/
>>>>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v04/inc/
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Patricio
>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks for looking into this! Waiting on your comments to send v04.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Patricio
>>>>>>>> ---
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>