RFR:8047290:Ensure consistent safepoint checking in MutexLockerEx

Coleen Phillimore coleen.phillimore at oracle.com
Fri Dec 5 20:08:40 UTC 2014


Max,

I think these changes look great!  There were some comments about making 
this more of a static check with types or detecting deadlocks within the 
safepointing code, but the change here is a good first step and does so 
with minimal disruption to the code and minimal complexity.  So I 
consider this low-hanging fruit in order to make the code safer.

Thanks!
Coleen


On 12/4/14, 4:50 PM, Max Ockner wrote:
> Hello once again...
> I have a new (and suggestively titled) webrev: 
> http://cr.openjdk.java.net/~coleenp/8047290final/
>
> Ran aurora tests.
>
> Here is a list of "sometimes" locks:
>
> "WorkGroup monitor" share/vm/utilities/workgroup.cpp
> "SLTMonitor" share/vm/gc_implementation/shared/concurrentGCThread.cpp
> "CompactibleFreeListSpace._lock" 
> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp
> "freelist par lock" 
> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp
> "SR_lock"                            share/vm/runtime/thread.cpp
> "CMS_markBitMap_lock" 
> share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp
>
> The remaining "sometimes" locks can be found in 
> share/vm/runtime/mutexLocker.cpp:
>
> ParGCRareEvent_lock
> Safepoint_lock
> Threads_lock
> VMOperationQueue_lock
> VMOperationRequest_lock
> Terminator_lock
> Heap_lock
> Compile_lock
> PeriodicTask_lock
> JfrStacktrace_lock
>
>
> Thanks,
> Max Ockner
>
>
>
>
> On 11/25/2014 3:03 AM, David Holmes wrote:
>> Hi Max,
>>
>> On 21/11/2014 7:56 AM, Max Ockner wrote:
>>> Hello again,
>>>
>>> I have made changes based on all comments. There is now a pair of 
>>> assert
>>> statements in the Monitor/Mutex wait() methods. When I reran tests, I
>>
>> Is there an updated webrev?
>>
>>> caught another lock that I had to change to "sometimes", but the assert
>>> that caught this lock was not in wait. There are currently no locks 
>>> that
>>> use try to pass an incorrect safepoint check argument to wait().
>>> Instead, gcbasher did not catch this lock last time, when the only
>>> asserts were in lock and lock_without_safepoint. This lock is
>>> "CMS_markBitMap_lock" in
>>> share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp. 
>>>
>>> I'm guessing that it was not caught by the tests because this 
>>> section of
>>> code is not reached very often. My initial inspection failed to catch
>>> this lock because it is passed around between various structures and
>>> renamed 4 times. I have not yet found a good way to check for this
>>> situation, even with a debugger.
>>>
>>> Are there any tests which actually manage to hit every line of code?
>>
>> No. There is too much code that is dependent on low-level details of 
>> the GC used, the compilation strategy, plus the set of runtime flags 
>> used (and whether product or fastdebug). That's why we have lots of 
>> tests run in lots of different ways, to try to get coverage.
>>
>>> How should I handle this situation where I can't rely on the tests that
>>> I have run to tell me if I have missed something?
>>> At what point can I assume that everything is OK?
>>
>> Difficult to answer in general - there are a number of recommended 
>> test suites used by the runtime team, but your changes will also 
>> impact GC and compiler code and so may not get exercised by the 
>> runtime test suites (unless run with various compiler and GC 
>> options). Perhaps an ad-hoc test run similar to nightlies? Or you 
>> push after the weekly snapshot so as to maximise nightly testing, and 
>> if there are issues exposed then you have time to address them or 
>> revert the fix.
>>
>> Cheers,
>> David
>>
>>> Thanks,
>>> Max Ockner
>>>
>>> On 11/10/2014 11:57 PM, David Holmes wrote:
>>>> Hi Max,
>>>>
>>>> On 8/11/2014 2:15 AM, Max Ockner wrote:
>>>>> Hello all,
>>>>> I have made these additonal changes:
>>>>> -Moved the assert() statements into the lock and 
>>>>> lock_without_safepoint
>>>>> methods.
>>>>> -Changed Monitor::SafepointAllowed to Monitor::SafepointCheckRequired
>>>>> -Changed the Monitor::SafepointCheckRequired values for several locks
>>>>> which were locked outside of a MutexLockerEx (some were locked with
>>>>> MutexLocker, some were locked were locked without any MutexLocker* )
>>>>>
>>>>> New webrev location: http://cr.openjdk.java.net/~coleenp/8047290/
>>>>
>>>> Generally this is all okay - a few style and other nits below.
>>>>
>>>> However you missed adding an assert in Monitor::wait to check if the
>>>> no_safepoint_check flag was being used correctly for the current 
>>>> monitor.
>>>>
>>>> Specific comments:
>>>>
>>>> src/share/vm/runtime/mutex.hpp
>>>>
>>>> This comment is no longer accurate with the moved check location:
>>>>
>>>> + // MutexLockerEx checks these flags when acquiring a lock
>>>> + // to ensure consistent checking for each lock.
>>>>
>>>> The same goes for other references to MutexLockerEx in the enum
>>>> description.
>>>>
>>>> Also copyright year needs updating.
>>>>
>>>> ---
>>>>
>>>> src/share/vm/runtime/mutex.cpp
>>>>
>>>>  898   //Ensure
>>>>  961   //Ensure
>>>>
>>>> Space needed after //
>>>>
>>>> ---
>>>>
>>>> src/share/vm/runtime/mutexLocker.cpp
>>>>
>>>> +  var = new type(Mutex::pri, #var, 
>>>> vm_block,safepoint_check_allowed); \
>>>>
>>>> space needed after comma in k,s
>>>>
>>>> ---
>>>>
>>>> src/share/vm/runtime/mutexLocker.hpp
>>>>
>>>> Whitespace only changes - looks like leftovers from removed edits.
>>>>
>>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>
>>>>> Additional testing:
>>>>> jtreg ./jdk/test/java/lang/invoke
>>>>> jtreg jfr tests
>>>>>
>>>>> Here is a list of ALL of the "sometimes" locks:
>>>>>
>>>>> "WorkGroup monitor" share/vm/utilities/workgroup.cpp
>>>>> "SLTMonitor" share/vm/gc_implementation/shared/concurrentGCThread.cpp
>>>>> "CompactibleFreeListSpace._lock"
>>>>> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp 
>>>>>
>>>>>
>>>>> "freelist par lock"
>>>>> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp 
>>>>>
>>>>>
>>>>> "SR_lock" share/vm/runtime/thread.cpp
>>>>>
>>>>> The remaining "sometimes" locks can be found in
>>>>> share/vm/runtime/mutexLocker.cpp:
>>>>>
>>>>> ParGCRareEvent_lock
>>>>> Safepoint_lock
>>>>> Threads_lock
>>>>> VMOperationQueue_lock
>>>>> VMOperationRequest_lock
>>>>> Terminator_lock
>>>>> Heap_lock
>>>>> Compile_lock
>>>>> PeriodicTask_lock
>>>>> JfrStacktrace_lock
>>>>>
>>>>> I have not checked the validity of the "sometimes" locks, and I 
>>>>> believe
>>>>> that this should be a different project.
>>>>>
>>>>> Thanks for your help!
>>>>> Max Ockner
>>>>> On 10/15/2014 8:54 PM, David Holmes wrote:
>>>>>> Hi Max,
>>>>>>
>>>>>> This is looking good.
>>>>>>
>>>>>> A few high-level initial comments:
>>>>>>
>>>>>> I think SafepointAllowed should be SafepointCheckNeeded
>>>>>>
>>>>>> Why are the checks in MutexLocker when the state is maintained in 
>>>>>> the
>>>>>> mutex itself and the mutex/monitor has lock_without_safepoint, and
>>>>>> wait(<safepoint checking flag>) ? I would have expected to see the
>>>>>> check in the mutex/monitor methods.
>>>>>>
>>>>>> Checking consistent usage of the _no_safepoint_check_flag is 
>>>>>> good. But
>>>>>> another part of this is that a monitor/mutex that never checks for
>>>>>> safepoints should never be held when a thread blocks at a 
>>>>>> safepoint -
>>>>>> is there some way to easily check that? I was surprised how many 
>>>>>> locks
>>>>>> are actually not checking for safepoints.
>>>>>>
>>>>>> Did you find any cases where the mutex/monitor was being used
>>>>>> inconsistently and incorrectly?
>>>>>>
>>>>>> Did you analyse the "sometimes" cases to see if they were safe?
>>>>>> (Aside: just for fun check out what happens if you lock the
>>>>>> Threads_lock with a safepoint check and a safepoint has been 
>>>>>> requested
>>>>>> :) ).
>>>>>>
>>>>>> Cheers,
>>>>>> David
>>>>>>
>>>>>> On 16/10/2014 4:04 AM, Max Ockner wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I am a new member of the Hotspot runtime team in Burlington, MA.
>>>>>>> Please review my first fix related to safepoint checking.
>>>>>>>
>>>>>>> Summary: MutexLockerEx can either acquire a lock with or without a
>>>>>>> safepoint check.
>>>>>>> In some cases, a particular lock must either safepoint check 
>>>>>>> always or
>>>>>>> never to avoid deadlocking.
>>>>>>> Some other locks have semantics which allow them to avoid deadlocks
>>>>>>> despite having a safepoint check only some of the time.
>>>>>>> All locks that are OK having inconsistent safepoint checks have 
>>>>>>> been
>>>>>>> marked. All locks that should never safepoint check and all 
>>>>>>> locks that
>>>>>>> should always safepoint check have also been marked.
>>>>>>> When a MutexLockerEx acquires a lock with or without a safepoint
>>>>>>> check,
>>>>>>> the lock's safepointAllowed marker is checked to ensure consistent
>>>>>>> safepoint checking.
>>>>>>>
>>>>>>> Webrev: http://oklahoma.us.oracle.com/~mockner/webrev/8047290/
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8047290
>>>>>>>
>>>>>>> Tested with:
>>>>>>>      jprt "-testset hotspot"
>>>>>>>      jtreg hotspot
>>>>>>>      vm.quick.testlist
>>>>>>>
>>>>>>> Whitebox tests:
>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java: Test
>>>>>>> expects Assert ("This lock should always have a safepoint check")
>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java: Test
>>>>>>> expects Assert ("This lock should never have a safepoint check")
>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java: code
>>>>>>> should not assert. (Lock is properly acquired with no safepoint 
>>>>>>> check)
>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java: code
>>>>>>> should not assert. (Lock is properly acquired with safepoint check)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Max
>>>>>>>
>>>>>
>>>
>



More information about the hotspot-dev mailing list