RFR: 8210832: Remove sneaky locking in class Monitor

Fri Feb 1 19:22:32 UTC 2019

On 2/1/19 1:01 PM, Patricio Chilano wrote:
> Hi Dan,
>
> On 2/1/19 11:10 AM, Daniel D. Daugherty wrote:
>> > Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v04/inc/
>>
>> src/hotspot/share/logging/logTag.hpp
>>     No comments.
>>
>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>     L295: // - When transitioning in (constructor), it checks for 
>> safepoints without having to block, i.e calls
>>         Perhaps:
>>           // - When transitioning in (constructor), it checks for 
>> safepoints without blocking, i.e., calls
>>
>>     L298: //   the monitor that is in the process of being acquired 
>> (only setting the owner is missing).
>>         Perhaps:
>>           //   the monitor that is only partially acquired.
> Done.
>> src/hotspot/share/runtime/mutex.cpp
>>     L59: log_trace(vmmonitor)("JavaThread " INTPTR_FORMAT " on %d 
>> attempt trying to acquire native monitor %s", p2i(self), retry_cnt, 
>> _name);
>>         nit - s/native monitor/vmmonitor/  - for consistency.
> Done.
>
>> src/hotspot/share/runtime/mutex.hpp
>>     No comments.
>>
>> src/hotspot/share/runtime/safepointMechanism.inline.hpp
>>     L83:   if(!uses_thread_local_poll() || local_poll_armed(thread)) {
>>     L84:     if (global_poll()) {
>>         I'm having trouble with this construct:
>>
>>         - if uses_thread_local_poll() is false, then we check 
>> global_poll()
>>           and call SafepointSynchronize::block() if global_poll() == 
>> true.
>>           That makes sense since we're not using local polling.
>>
>>         - if uses_thread_local_poll() is true, then we check
>>           if local_poll_armed(thread) == true and then we check 
>> global_poll()
>>           and call SafepointSynchronize::block() if global_poll() == 
>> true.
>>
>>           Huh?
>>
>>         So we only call SafepointSynchronize::block() if 
>> global_poll() == true
>>         no matter what. It seems like the checks for local polling 
>> don't really
>>         do anything here.
>>
>>         What am I missing?
> A couple of observations before going straight to the question. The 
> check for local poll in SafepointMechanism::should_block() (TBIVMWDP 
> destructor) is needed because we need to check for both, pending 
> safepoints or pending handshakes. In other words once we transition 
> back from the _thread_blocked state, there could be either a safepoint 
> or a handshake for that thread being processed. If 
> SafepointMechanism::should_block() returns true, then 
> SafepointMechanism::callback_if_safepoint() only checks for pending 
> safepoints. After that I do the check "if (_thread->has_handshake())" 
> for pending handshakes. I couldn't use 
> SafepointMechanism::block_if_requested() which checks for both like in 
> the normal TBIVM, because of two reasons:
> - On one hand, I need to pass the "false" parameter to 
> SafepointSynchronize::block() to avoid blocking when transitioning in, 
> and avoid failing the test that we have a ThreadSafepointState of 
> _call_back when transitioing back. Without introducing 
> SafepointMechanism::callback_if_safepoint() I would have to modify a 
> couple of methods in SafepointMechanism to receive that extra 
> argument. I think this was cleaner.
> -But even if I did add the extra argument to those functions and use 
> SafepointMechanism::block_if_requested(), I would need also to avoid 
> checking for handshakes in the TBIVMWDP constructor. The processing of 
> handshakes uses its own ThreadInVMForHandshake jacket which internally 
> calls SafepointMechanism::block_if_requested(). Since in the 
> constructor we do not block for safepoints, executing again 
> SafepointMechanism::block_if_requested() in a handshake would execute 
> SS::block twice.
> So to solve that I just introduced 
> SafepointMechanism::callback_if_safepoint() which only checks for 
> safepoints and uses the extra argument in SS::block.
>
> Now, as to why I can't just check for global polls in 
> SafepointMechanism::callback_if_safepoint() (since I only want to 
> check/stop for safepoints), this has to do with the race found by 
> Robbin. Since the thread local poll is armed after changing _state to 
> _synchronizing, if I only do a global poll, 
> SafepointMechanism::callback_if_safepoint(in the TBIVMWDP constructor) 
> could detect there is a safepoint in progress and callback but when 
> coming back in the destructor SafepointMechanism::should_block() could 
> miss detecting the safepoint in progress since that method checks for 
> local polls first (it cares about both safepoints and handshakes). So 
> I need to match the order of checks done by 
> SafepointMechanism::should_block().
>
> Any of this makes sense?

Not really, but now I have a better way to ask about this.

Here's should_block():

   50 bool SafepointMechanism::should_block(Thread* thread) {
   51   if (uses_thread_local_poll()) {
   52     return local_poll(thread);
   53   } else {
   54     return global_poll();
   55   }
   56 }

If uses_thread_local_poll() == true, then we return the result
of local_poll(thread) and skip calling global_poll().

Here's callback_if_safepoint():

   82 void SafepointMechanism::callback_if_safepoint(JavaThread* thread) {
   83   if(!uses_thread_local_poll() || local_poll_armed(thread)) {
   84     if (global_poll()) {
   85       SafepointSynchronize::block(thread, false);
   86     }
   87   }
   88 }

If uses_thread_local_poll() == false and global_poll() == true,
then we call block().
If uses_thread_local_poll() == true and local_poll_armed() == true
and global_poll() == true, then we call block().

Since we always check global_poll() == true before calling block(),
then preceding expressions are moot. They do not affect whether we
call block() at all.

Dan

>
>
> Thanks,
> Patricio
>
>> Dan
>>
>> On 2/1/19 12:05 AM, Patricio Chilano wrote:
>>> Hi David,
>>>
>>> On 1/31/19 12:54 AM, David Holmes wrote:
>>>> On 31/01/2019 7:37 am, Patricio Chilano wrote:
>>>>> Hi David,
>>>>>
>>>>> On 1/30/19 2:29 AM, David Holmes wrote:
>>>>>> Hi Patricio,
>>>>>>
>>>>>> <trimming>
>>>>>>
>>>>>> First, thanks for all the many weeks of work you've put into 
>>>>>> this, pulling together a number of ideas from different people to 
>>>>>> make it all work!
>>>>> Thanks! Credit to you for the PlatformMonitor implementation : )
>>>>
>>>> :) Nothing innovative there.
>>>>
>>>>>> I've only got a few minor comments/suggestions.
>>>>>>
>>>>>> On 30/01/2019 10:24 am, Patricio Chilano wrote:
>>>>>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/webrev/
>>>>>>> Inc: 
>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8210832/v03/inc/webrev/
>>>>>>>
>>>>>>
>>>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>>>>>
>>>>>> I'm very unclear how ThreadLockBlockInVM differs from 
>>>>>> ThreadBlockInVM. You've duplicated a lot of complex code which is 
>>>>>> masking the actual difference between the two wrappers to me. It 
>>>>>> seems to me that an extra arg to transition_and_fence should 
>>>>>> allow you to handle the new behaviour without having to duplicate 
>>>>>> so much of this code. In any case the semantics of 
>>>>>> ThreadLockBlockInVM needs to be described.
>>>>> I could do it with one extra argument, but I would need to add two 
>>>>> extra branches in transition_and_fence(), one to decide if I'm in 
>>>>> the Monitor case to avoid calling 
>>>>> SafepointMechanism::block_if_requested() directly and another one 
>>>>> to actually decide if I'm transitioning in or out, since the 
>>>>> actions to perform are different. I think it is easier to read 
>>>>> without adding new conditionals, and also we will save those extra 
>>>>> branches, but if you think it's better this way I can change it.
>>>>
>>>> I would like something that tells me more clearly how this new 
>>>> transition helper differs from the existing TBIVM. Sharing the code 
>>>> between them and using different args would be one way. Documenting 
>>>> the difference in comments would be another. Your choice.
>>> Ok, I added a description on top of TLBIVM.
>>>
>>>>>> Also I'm unclear what the "Lock" in ThreadLockBlockInVM actually 
>>>>>> refers to. I find the name quite jarring to read.
>>>>> What about changing it to ThreadBlockinMonitor?
>>>>
>>>> That's not quite conveying the semantics. The problem is that the 
>>>> semantics we are changing compared to TBIVM are not evident in the 
>>>> TBIVM name. If TBIVM was actually 
>>>> ThreadBlockInVMWithSafepointBlocking, then this new transition 
>>>> would obviously be ThreadBlockInVMWithoutSafepointBlocking - but 
>>>> perhaps that lengthy, but clear name would be okay anyway?
>>> Not convinced on that name since we are blocking at safepoints in 
>>> the destructor. Based on the comments I added how about 
>>> ThreadBlockinVMWithDeadlockPrevention ? or 
>>> ThreadBlockinVMWithPreemption? (as in eliminate one of the 
>>> conditions for deadlocks).
>>>
>>>>>> On the subject of naming, do_preempt and preempt_by_safepoint 
>>>>>> don't really convey to me what happens - what is being 
>>>>>> "preempted" here? I would suggest a more direct 
>>>>>> Monitor::release_for_safepoint
>>>>> Changed.
>>>>>
>>>>>> ---
>>>>>>
>>>>>> Logging: why "nativemonitor"? The logging in mutex.cpp doesn't 
>>>>>> relate to a "native" monitor?? Actually I'm not even sure if we 
>>>>>> need bother at all with the one logging statement that is present.
>>>>> I added it to eventually track unbounded try locks. Not sure I 
>>>>> follow you with the name, isn't that how we name this monitors? I 
>>>>> tried to differentiate them from Java monitors. What about just 
>>>>> "monitor"?
>>>>
>>>> How about vmmonitor ?
>>> Ok, changed.
>>>
>>>>>> ---
>>>>>>
>>>>>> src/hotspot/share/runtime/mutex.cpp
>>>>>>
>>>>>> void Monitor::lock_without_safepoint_check(Thread * self) {
>>>>>>   // Ensure that the Monitor does not require or allow safepoint 
>>>>>> checks.
>>>>>>
>>>>>> The comment there should only say "not require".
>>>>> Done.
>>>>>
>>>>>> void Monitor::preempt_by_safepoint() {
>>>>>>   _lock.unlock();
>>>>>> }
>>>>>>
>>>>>> Apart from renaming this as suggested above, aren't there any 
>>>>>> suitable assertions we should have here? safepoint-in-progress or 
>>>>>> handshake-in-progress? _owner == Thread::current?
>>>>> Ok, I added an assertion that owner should be NULL. Asserting 
>>>>> safepoint-in-progress does not really work because _state could 
>>>>> change to _not_synchronized right after you checked for it in TLBIVM.
>>>>
>>>> Okay.
>>>>
>>>>>> Nit:
>>>>>>
>>>>>> assert(_owner == Thread::current(), "should be equal: owner=" 
>>>>>> INTPTR_FORMAT
>>>>>>                    ", self=" INTPTR_FORMAT, p2i(_owner), 
>>>>>> p2i(Thread::current()));
>>>>>>
>>>>>> with Dan's enhanced assertions there's an indentation issue. The 
>>>>>> second line should indent to the first comma, but that will make 
>>>>>> the second line extend way past 80 columns.
>>>>>>
>>>>>> Also you could factor that assertion for 
>>>>>> _owner==Thread::current() into its own function or macro to avoid 
>>>>>> the repetition.
>>>>> Corrected indentation based on Dan's reply to align with _owner.
>>>>
>>>> I though it should indent to the comma because it is a continuation 
>>>> of the same argument being passed to the assert "function". But I'm 
>>>> okay with Dan's suggestion.
>>>>
>>>> Factoring it into its own little function or macro would still be 
>>>> good to avoid the repetition.
>>> Ok, added new function assert_owned_by_self(). I could change it to 
>>> assert_owner(Thread*) and use it for the NULL case too unifying the 
>>> printed messages to something like "invalid owner: owner=" 
>>> INTPTR_FORMAT ", should be:" INTPTR_FORMAT. What do you think?
>>>
>>>>>>  OSThreadWaitState osts(self->osthread(), false /* not 
>>>>>> Object.wait() */);
>>>>>>
>>>>>> This needs to be returned to its original place as per Dan's 
>>>>>> comments.
>>>>> Done.
>>>>>
>>>>>>     } else {
>>>>>>       Monitor::lock(self);
>>>>>>     }
>>>>>>
>>>>>> You don't need Monitor:: here
>>>>> Removed.
>>>>>
>>>>>> // Temporary JVM_RawMonitor* support. A raw monitor can just be a 
>>>>>> PlatformMonitor now.
>>>>>>
>>>>>> This needs to be resolved before committing. Some of the existing 
>>>>>> commentary on what raw monitors are needs to be retained. Not 
>>>>>> clear if we need to set the _owner field or can just skip it.
>>>>> Is it okay if I keep the following comments?
>>>>>
>>>>> // Yet another degenerate version of Monitor::lock() or 
>>>>> lock_without_safepoint_check()
>>>>> // jvm_raw_lock() and _unlock() can be called by non-Java threads 
>>>>> via JVM_RawMonitorEnter.
>>>>> //
>>>>> // There's no expectation that JVM_RawMonitors will interoperate 
>>>>> properly with the native
>>>>> // Mutex-Monitor constructs.  We happen to implement 
>>>>> JVM_RawMonitors in terms of
>>>>> // native Mutex-Monitors simply as a matter of convenience.
>>>>
>>>> Yep that's perfect. And as a future RFE we can replace them with 
>>>> direct use of PlatformMonitor (or PlatformMutex).
>>>>
>>>>>
>>>>> I could keep setting the owner as _owner = 
>>>>> Thread::current_or_null() in jvm_raw_lock(), at least it wouldn't 
>>>>> hurt.
>>>>
>>>> It's useful for checking usage errors, but we won't have that if we 
>>>> replace with PlatformMonitor, so may as well drop it now IMO.
>>> Ok, I added asserts that _owner should be NULL after acquiring it 
>>> and before releasing it though.
>>>
>>>>>> Monitor::~Monitor() {
>>>>>>   assert(_owner == NULL, "should be NULL: owner=" INTPTR_FORMAT, 
>>>>>> p2i(_owner));
>>>>>> }
>>>>>>
>>>>>> Will this automatically result in the PlatformMonitor destructor 
>>>>>> being called?
>>>>> Yes, should I add a comment to make it clear that 
>>>>> ~PlatformMonitor() will be executed?
>>>>
>>>> No need - assume other people have a better understanding of C++ 
>>>> than I do :)
>>>
>>> Below is version v04, which also contains a correction pointed out 
>>> off-list by Robbin to do a local poll first in 
>>> SafepointMechanism::callback_if_safepoint() if we are using local 
>>> polls. Since the thread local poll is armed after changing _state to 
>>> _synchronizing, if we only do a global poll in the TLBIVM 
>>> constructor we could detect there is a safepoint in progress and 
>>> callback but when coming back in the destructor 
>>> SafepointMechanism::should_block() could miss detecting the 
>>> safepoint in progress since that method checks first for local polls.
>>>
>>>
>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v04/
>>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v04/inc/
>>>
>>>
>>> Thanks,
>>> Patricio
>>>
>>>> Thanks,
>>>> David
>>>>
>>>>>
>>>>>
>>>>> Thanks for looking into this! Waiting on your comments to send v04.
>>>>>
>>>>> Thanks,
>>>>> Patricio
>>>>>> ---
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>
>>>
>>
>