RFR: 8210832: Remove sneaky locking in class Monitor
Per Liden
per.liden at oracle.com
Tue Jan 29 13:36:39 UTC 2019
On 01/29/2019 01:13 PM, David Holmes wrote:
> On 29/01/2019 8:26 pm, Per Liden wrote:
>> On 01/29/2019 10:53 AM, David Holmes wrote:
>>> Hi Per,
>>>
>>> If I may jump in on one thing you suggest ... destructors. Do we ever
>>> actually destroy Mutex or Monitor instances? There are inherent races
>>> that can make it very dangerous to try and actually delete the
>>> low-level PlatformMonitor and destroy the pthread_mutex or
>>> pthread_cond, and even release the memory used. The related
>>> PlatformEvent and PlatformParker are expected to be immortal and I
>>> think that is the same for PlatformMonitor.
>>
>> There are examples of where we do destroy Mutex instances today (like
>> JVM_RawMonitorDestroy), and I don't think that's something the API
>> should forbid.
>
> RawMonitors are used very rarely and I would not have much confidence
> that their destruction is actually done in a safe manner.
That was just one example, there are many other places where do to this
too. Many locks don't have the same lifecycle of the VM.
>
>>
>> As Robbin hinted, I'm hoping ZLock (which is just a plain
>> pthread_mutex) in ZGC can be converted to be a plain mutex using
>> PlatforMutex in the future. ZLocks require dynamic
>> creation/destruction to be supported as they are e.g. attached to
>> nmethods which can come and go.
>
> I think this is something that requires some detailed consideration. You
Destroying a mutex is no different from destroying any other shared data
structure, and we do that all the time in hotspot.
> need to have very robust lifecycle management to be able to delete the
> low-level implementation objects safely. (It's not sufficient for
> example to test the ability to destroy a pthread_mutex by checking if
> you can first acquire it (and release it again) then destroy it and
> deallocate it, as the previous owner could still be executing inside
> pthread_mutex_unlock!).
Well, that would be an absurd and completely broken scheme, to say the
least ;)
cheers,
Per
>
> In the context of the current changes though we do have an inconsistency
> between the high-level Mutex/Monitor and the PlatformMonitor. This was
> an oversight on my part when doing some earlier work on this.
>
> David
> -----
>
>>>
>>> Aside: I don't think distinct PlatformMutex and PlatforMonitor is
>>> worth the effort unless we also rework the Mutex/Monitor relationship
>>> as well.
>>
>> I completely agree, Mutex/Monitor would also need to be reworked a bit.
>>
>> cheers,
>> Per
>>
>>>
>>> Cheers,
>>> David
>>>
>>> On 29/01/2019 7:22 pm, Per Liden wrote:
>>>> Hi Patricio,
>>>>
>>>> On 01/28/2019 08:18 PM, Patricio Chilano wrote:
>>>>> Hi Robbin,
>>>>>
>>>>> Thanks for reviewing this! Removing the block_in_safepoint_check
>>>>> thread local attribute is a great idea, here is v02:
>>>>>
>>>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev
>>>>
>>>> I really like that we're ditching our old locking code in favor of
>>>> using pthread_mutex, et al. Nice work!
>>>>
>>>>
>>>> General comment
>>>> ----------------
>>>> I think Mutex to be a plain mutex and not come with the baggage of
>>>> having a conditional variable. With this new code, it seems we're in
>>>> a really good position to make that happen. I.e. something like this:
>>>>
>>>> class PlatformMutex {
>>>> protected:
>>>> pthread_mutex_t _mutex;
>>>>
>>>> public:
>>>> PlatformMutex();
>>>> ~PlatformMutex();
>>>>
>>>> void lock();
>>>> void unlock();
>>>> bool try_lock();
>>>> };
>>>>
>>>> class PlatformMonitor : public PlatformMutex {
>>>> private:
>>>> pthread_cond_t _cond;
>>>>
>>>> public:
>>>> PlatformMonitor();
>>>> ~PlatformMonitor();
>>>>
>>>> int wait(jlong millis);
>>>> void notify();
>>>> void notify_all();
>>>> };
>>>>
>>>> It might be that we want to do that as a separate step later instead
>>>> of including it in this patch. But I think we should try to get there.
>>>>
>>>>
>>>> src/hotspot/os/posix/os_*.[ch]pp
>>>> ---------------------------------
>>>> * I'd suggest that we place the PlatformMonitor class in a separate
>>>> file (like src/hotspot/os/posix/monitor_posix.cpp), just like we
>>>> have done with Semaphore (in src/hotspot/os/posix/semaphore_posix.cpp).
>>>>
>>>>
>>>> src/hotspot/os/posix/os_posix.hpp
>>>> src/hotspot/os/solaris/os_solaris.hpp
>>>> src/hotspot/os/windows/os_windows.hpp
>>>> -------------------------------------
>>>> * Please make _mutex/_cond plain variables, instead of arrays of 1.
>>>> That's just ugly ;)
>>>>
>>>>
>>>> src/hotspot/os/posix/os_posix.cpp
>>>> ---------------------------------
>>>> * Destructor missing, to call pthread_(mutex|cond)_destroy().
>>>>
>>>>
>>>> src/hotspot/os/solaris/os_solaris.hpp
>>>> -------------------------------------
>>>> * Not sure if there's a good reason to have the constructor be
>>>> inlined here. I'd suggest moving it to the cpp file.
>>>>
>>>> * Destructor missing.
>>>>
>>>>
>>>> src/hotspot/os/windows/os_windows.cpp
>>>> -------------------------------------
>>>> * Destructor missing (I'm not too familiar with the windows API but
>>>> I assume there's a destroy function we should call here).
>>>>
>>>>
>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>>> -----------------------------------------------------
>>>> * Move "private:" above monitor_adr;
>>>>
>>>> 289 class ThreadLockBlockInVM : public ThreadStateTransition {
>>>> 290 Monitor** monitor_adr;
>>>> 291 private:
>>>> 292 void do_preempted(Monitor** in_flight_monitor_adr) {
>>>>
>>>> * monitor_adr should be _monitor_adr, or maybe even
>>>> _in_flight_monitor_adr to better match the name of the argument.
>>>>
>>>>
>>>> cheers,
>>>> Per
>>>>
>>>>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/
>>>>>
>>>>> Running mach5 again.
>>>>>
>>>>> Thanks,
>>>>> Patricio
>>>>>
>>>>> On 1/28/19 8:31 AM, Robbin Ehn wrote:
>>>>>> Hi Patricio,
>>>>>>
>>>>>> Mostly looks good!
>>>>>>
>>>>>> block_at_safepoint is always called with block_in_safepoint_check
>>>>>> = true. (correct?)
>>>>>> Changing that to a local state instead of global simplifies the code.
>>>>>>
>>>>>> So I'm suggesting something like below.
>>>>>>
>>>>>> Thanks, Robbin
>>>>>>
>>>>>> diff -r e65cc445234c
>>>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>>>>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon
>>>>>> Jan 28 13:10:15 2019 +0100
>>>>>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon
>>>>>> Jan 28 14:10:59 2019 +0100
>>>>>> @@ -308,2 +308,1 @@
>>>>>> - thread->block_in_safepoint_check = false;
>>>>>> - SafepointMechanism::block_at_safepoint(thread);
>>>>>> + SafepointMechanism::callback_if_safepoint(thread);
>>>>>> @@ -323,2 +322,1 @@
>>>>>> - SafepointMechanism::block_at_safepoint(_thread);
>>>>>> - _thread->block_in_safepoint_check = true;
>>>>>> + SafepointMechanism::callback_if_safepoint(_thread);
>>>>>> @@ -335,2 +332,0 @@
>>>>>> - } else {
>>>>>> - _thread->block_in_safepoint_check = true;
>>>>>> @@ -337,0 +334,1 @@
>>>>>> +
>>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp
>>>>>> --- a/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28
>>>>>> 13:10:15 2019 +0100
>>>>>> +++ b/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28
>>>>>> 14:10:59 2019 +0100
>>>>>> @@ -795,1 +795,1 @@
>>>>>> -void SafepointSynchronize::block(JavaThread *thread) {
>>>>>> +void SafepointSynchronize::block(JavaThread *thread, bool
>>>>>> block_in_safepoint_check) {
>>>>>> @@ -850,1 +850,1 @@
>>>>>> - if (thread->block_in_safepoint_check) {
>>>>>> + if (block_in_safepoint_check) {
>>>>>> @@ -880,1 +880,1 @@
>>>>>> - thread->block_in_safepoint_check) {
>>>>>> + block_in_safepoint_check) {
>>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp
>>>>>> --- a/src/hotspot/share/runtime/safepoint.hpp Mon Jan 28
>>>>>> 13:10:15 2019 +0100
>>>>>> +++ b/src/hotspot/share/runtime/safepoint.hpp Mon Jan 28
>>>>>> 14:10:59 2019 +0100
>>>>>> @@ -146,1 +146,1 @@
>>>>>> - static void block(JavaThread *thread);
>>>>>> + static void block(JavaThread *thread, bool
>>>>>> block_in_safepoint_check = true);
>>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp
>>>>>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan
>>>>>> 28 13:10:15 2019 +0100
>>>>>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan
>>>>>> 28 14:10:59 2019 +0100
>>>>>> @@ -82,1 +82,1 @@
>>>>>> - static inline void block_at_safepoint(JavaThread* thread);
>>>>>> + static inline void callback_if_safepoint(JavaThread* thread);
>>>>>> diff -r e65cc445234c
>>>>>> src/hotspot/share/runtime/safepointMechanism.inline.hpp
>>>>>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon
>>>>>> Jan 28 13:10:15 2019 +0100
>>>>>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon
>>>>>> Jan 28 14:10:59 2019 +0100
>>>>>> @@ -82,1 +82,1 @@
>>>>>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) {
>>>>>> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) {
>>>>>> @@ -84,1 +84,1 @@
>>>>>> - SafepointSynchronize::block(thread);
>>>>>> + SafepointSynchronize::block(thread, false);
>>>>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp
>>>>>> --- a/src/hotspot/share/runtime/thread.cpp Mon Jan 28 13:10:15
>>>>>> 2019 +0100
>>>>>> +++ b/src/hotspot/share/runtime/thread.cpp Mon Jan 28 14:10:59
>>>>>> 2019 +0100
>>>>>> @@ -298,2 +297,0 @@
>>>>>> - block_in_safepoint_check = true;
>>>>>> -
>>>>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp
>>>>>> --- a/src/hotspot/share/runtime/thread.hpp Mon Jan 28 13:10:15
>>>>>> 2019 +0100
>>>>>> +++ b/src/hotspot/share/runtime/thread.hpp Mon Jan 28 14:10:59
>>>>>> 2019 +0100
>>>>>> @@ -788,2 +787,0 @@
>>>>>> - bool block_in_safepoint_check; // to decide
>>>>>> whether to block in SS::block or not
>>>>>> -
>>>>>>
>>>>>>
>>>>>> On 1/28/19 9:42 AM, Patricio Chilano wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Please review the following patch:
>>>>>>>
>>>>>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832
>>>>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/
>>>>>>>
>>>>>>> The current implementation of native monitors uses a technique
>>>>>>> that we name "sneaky locking" to prevent possible deadlocks of
>>>>>>> the JVM during safepoints. The implementation of this technique
>>>>>>> though introduces a race when a monitor is shared between the
>>>>>>> VMThread and non-JavaThreads. This patch aims to solve that
>>>>>>> problem and at the same time simplify the code.
>>>>>>>
>>>>>>> The proposal is based on the introduction of the new class
>>>>>>> PlatformMonitor, which serves as a wrapper for the actual
>>>>>>> synchronization primitives in each platform (mutexes and
>>>>>>> condition variables). Most of the API calls can thus be
>>>>>>> implemented as simple wrappers around PlatformMonitor, adding
>>>>>>> more assertions and very little extra metadata.
>>>>>>> To be able to remove the lock sneaking code and at the same time
>>>>>>> avoid deadlocking scenarios, we combine two techniques:
>>>>>>>
>>>>>>> -When a JavaThread that has just acquired the lock, detects there
>>>>>>> is a safepoint request in the ThreadLockBlockInVM destructor, it
>>>>>>> releases the lock before blocking at the safepoint. After
>>>>>>> resuming from it, the JavaThread will have to acquire the lock
>>>>>>> again.
>>>>>>>
>>>>>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait()
>>>>>>> method, in order to avoid blocking we allow for a possible
>>>>>>> safepoint request to make progress but without letting the
>>>>>>> JavaThread block for it (since we would be stopped by the
>>>>>>> destructor anyways). We also do that for the Monitor::lock() case
>>>>>>> although no deadlock is being prevented there.
>>>>>>>
>>>>>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition
>>>>>>> class used instead of the ThreadBlockInVM one. This allowed more
>>>>>>> flexibility to handle the two techniques mentioned above. Also,
>>>>>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested()
>>>>>>> which creates some problems when trying to allow safepoints to
>>>>>>> continue without stopping, since that method not only checks for
>>>>>>> safepoints but also processes handshakes.
>>>>>>>
>>>>>>> In terms of performance, benchmarks show very similar results to
>>>>>>> what we have now.
>>>>>>>
>>>>>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have
>>>>>>> been tested.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Patricio
>>>>>>>
>>>>>
More information about the hotspot-dev
mailing list