RFR:8047290:Ensure consistent safepoint checking in MutexLockerEx

Coleen Phillimore coleen.phillimore at oracle.com
Wed Dec 10 21:00:30 UTC 2014


Hi Max,

You have addressed all of my issues with this change, so it's reviewed 
by me also.

Coleen

On 12/10/14, 11:56 AM, Max Ockner wrote:
> New webrev: http://cr.openjdk.java.net/~coleenp/8047290absolutely_final/
>
> I have addressed dan's suggestions. I also removed unnecessary 
> "this->" occurrences from the assert statements.
>
> Though I realize that I have unnecessarily duplicated code in my 
> tests, I do not want to combine the tests into one right now because 
> (1) They work as they are, (2) They can fail independently, (3) The 
> amount of code needed to run all four tests in one file without 
> crashing out after the first failure is not as small as you might 
> think, and (4) I want to commit this before someone else messes with 
> the locks to avoid more merge conflicts. To be clear, I am not opposed 
> to fixing this separately if people think it is important, but I 
> prefer to put it off until the bulk of the fix is committed.
>
> Does this look ready?
>
> Thanks for your help,
> Max O
>
>
>
> On 12/8/2014 1:39 AM, David Holmes wrote:
>> Hi Max,
>>
>> Just a nit with the tests - it seems to me we should be able to write 
>> a single Java class that can process all four combinations rather 
>> than duplicating the bulk of the code.
>>
>> Thanks,
>> David
>>
>> On 5/12/2014 7:50 AM, Max Ockner wrote:
>>> Hello once again...
>>> I have a new (and suggestively titled) webrev:
>>> http://cr.openjdk.java.net/~coleenp/8047290final/
>>>
>>> Ran aurora tests.
>>>
>>> Here is a list of "sometimes" locks:
>>>
>>> "WorkGroup monitor" share/vm/utilities/workgroup.cpp
>>> "SLTMonitor" share/vm/gc_implementation/shared/concurrentGCThread.cpp
>>> "CompactibleFreeListSpace._lock"
>>> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp 
>>>
>>> "freelist par lock"
>>> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp 
>>>
>>> "SR_lock" share/vm/runtime/thread.cpp
>>> "CMS_markBitMap_lock"
>>> share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp 
>>>
>>>
>>>
>>> The remaining "sometimes" locks can be found in
>>> share/vm/runtime/mutexLocker.cpp:
>>>
>>> ParGCRareEvent_lock
>>> Safepoint_lock
>>> Threads_lock
>>> VMOperationQueue_lock
>>> VMOperationRequest_lock
>>> Terminator_lock
>>> Heap_lock
>>> Compile_lock
>>> PeriodicTask_lock
>>> JfrStacktrace_lock
>>>
>>>
>>> Thanks,
>>> Max Ockner
>>>
>>>
>>>
>>>
>>> On 11/25/2014 3:03 AM, David Holmes wrote:
>>>> Hi Max,
>>>>
>>>> On 21/11/2014 7:56 AM, Max Ockner wrote:
>>>>> Hello again,
>>>>>
>>>>> I have made changes based on all comments. There is now a pair of 
>>>>> assert
>>>>> statements in the Monitor/Mutex wait() methods. When I reran tests, I
>>>>
>>>> Is there an updated webrev?
>>>>
>>>>> caught another lock that I had to change to "sometimes", but the 
>>>>> assert
>>>>> that caught this lock was not in wait. There are currently no 
>>>>> locks that
>>>>> use try to pass an incorrect safepoint check argument to wait().
>>>>> Instead, gcbasher did not catch this lock last time, when the only
>>>>> asserts were in lock and lock_without_safepoint. This lock is
>>>>> "CMS_markBitMap_lock" in
>>>>> share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp. 
>>>>>
>>>>>
>>>>> I'm guessing that it was not caught by the tests because this 
>>>>> section of
>>>>> code is not reached very often. My initial inspection failed to catch
>>>>> this lock because it is passed around between various structures and
>>>>> renamed 4 times. I have not yet found a good way to check for this
>>>>> situation, even with a debugger.
>>>>>
>>>>> Are there any tests which actually manage to hit every line of code?
>>>>
>>>> No. There is too much code that is dependent on low-level details of
>>>> the GC used, the compilation strategy, plus the set of runtime flags
>>>> used (and whether product or fastdebug). That's why we have lots of
>>>> tests run in lots of different ways, to try to get coverage.
>>>>
>>>>> How should I handle this situation where I can't rely on the tests 
>>>>> that
>>>>> I have run to tell me if I have missed something?
>>>>> At what point can I assume that everything is OK?
>>>>
>>>> Difficult to answer in general - there are a number of recommended
>>>> test suites used by the runtime team, but your changes will also
>>>> impact GC and compiler code and so may not get exercised by the
>>>> runtime test suites (unless run with various compiler and GC options).
>>>> Perhaps an ad-hoc test run similar to nightlies? Or you push after the
>>>> weekly snapshot so as to maximise nightly testing, and if there are
>>>> issues exposed then you have time to address them or revert the fix.
>>>>
>>>> Cheers,
>>>> David
>>>>
>>>>> Thanks,
>>>>> Max Ockner
>>>>>
>>>>> On 11/10/2014 11:57 PM, David Holmes wrote:
>>>>>> Hi Max,
>>>>>>
>>>>>> On 8/11/2014 2:15 AM, Max Ockner wrote:
>>>>>>> Hello all,
>>>>>>> I have made these additonal changes:
>>>>>>> -Moved the assert() statements into the lock and
>>>>>>> lock_without_safepoint
>>>>>>> methods.
>>>>>>> -Changed Monitor::SafepointAllowed to 
>>>>>>> Monitor::SafepointCheckRequired
>>>>>>> -Changed the Monitor::SafepointCheckRequired values for several 
>>>>>>> locks
>>>>>>> which were locked outside of a MutexLockerEx (some were locked with
>>>>>>> MutexLocker, some were locked were locked without any 
>>>>>>> MutexLocker* )
>>>>>>>
>>>>>>> New webrev location: http://cr.openjdk.java.net/~coleenp/8047290/
>>>>>>
>>>>>> Generally this is all okay - a few style and other nits below.
>>>>>>
>>>>>> However you missed adding an assert in Monitor::wait to check if the
>>>>>> no_safepoint_check flag was being used correctly for the current
>>>>>> monitor.
>>>>>>
>>>>>> Specific comments:
>>>>>>
>>>>>> src/share/vm/runtime/mutex.hpp
>>>>>>
>>>>>> This comment is no longer accurate with the moved check location:
>>>>>>
>>>>>> + // MutexLockerEx checks these flags when acquiring a lock
>>>>>> + // to ensure consistent checking for each lock.
>>>>>>
>>>>>> The same goes for other references to MutexLockerEx in the enum
>>>>>> description.
>>>>>>
>>>>>> Also copyright year needs updating.
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> src/share/vm/runtime/mutex.cpp
>>>>>>
>>>>>>  898   //Ensure
>>>>>>  961   //Ensure
>>>>>>
>>>>>> Space needed after //
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> src/share/vm/runtime/mutexLocker.cpp
>>>>>>
>>>>>> +  var = new type(Mutex::pri, #var,
>>>>>> vm_block,safepoint_check_allowed); \
>>>>>>
>>>>>> space needed after comma in k,s
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> src/share/vm/runtime/mutexLocker.hpp
>>>>>>
>>>>>> Whitespace only changes - looks like leftovers from removed edits.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>
>>>>>>> Additional testing:
>>>>>>> jtreg ./jdk/test/java/lang/invoke
>>>>>>> jtreg jfr tests
>>>>>>>
>>>>>>> Here is a list of ALL of the "sometimes" locks:
>>>>>>>
>>>>>>> "WorkGroup monitor" share/vm/utilities/workgroup.cpp
>>>>>>> "SLTMonitor" 
>>>>>>> share/vm/gc_implementation/shared/concurrentGCThread.cpp
>>>>>>> "CompactibleFreeListSpace._lock"
>>>>>>> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp 
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> "freelist par lock"
>>>>>>> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp 
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> "SR_lock" share/vm/runtime/thread.cpp
>>>>>>>
>>>>>>> The remaining "sometimes" locks can be found in
>>>>>>> share/vm/runtime/mutexLocker.cpp:
>>>>>>>
>>>>>>> ParGCRareEvent_lock
>>>>>>> Safepoint_lock
>>>>>>> Threads_lock
>>>>>>> VMOperationQueue_lock
>>>>>>> VMOperationRequest_lock
>>>>>>> Terminator_lock
>>>>>>> Heap_lock
>>>>>>> Compile_lock
>>>>>>> PeriodicTask_lock
>>>>>>> JfrStacktrace_lock
>>>>>>>
>>>>>>> I have not checked the validity of the "sometimes" locks, and I
>>>>>>> believe
>>>>>>> that this should be a different project.
>>>>>>>
>>>>>>> Thanks for your help!
>>>>>>> Max Ockner
>>>>>>> On 10/15/2014 8:54 PM, David Holmes wrote:
>>>>>>>> Hi Max,
>>>>>>>>
>>>>>>>> This is looking good.
>>>>>>>>
>>>>>>>> A few high-level initial comments:
>>>>>>>>
>>>>>>>> I think SafepointAllowed should be SafepointCheckNeeded
>>>>>>>>
>>>>>>>> Why are the checks in MutexLocker when the state is maintained 
>>>>>>>> in the
>>>>>>>> mutex itself and the mutex/monitor has lock_without_safepoint, and
>>>>>>>> wait(<safepoint checking flag>) ? I would have expected to see the
>>>>>>>> check in the mutex/monitor methods.
>>>>>>>>
>>>>>>>> Checking consistent usage of the _no_safepoint_check_flag is good.
>>>>>>>> But
>>>>>>>> another part of this is that a monitor/mutex that never checks for
>>>>>>>> safepoints should never be held when a thread blocks at a 
>>>>>>>> safepoint -
>>>>>>>> is there some way to easily check that? I was surprised how many
>>>>>>>> locks
>>>>>>>> are actually not checking for safepoints.
>>>>>>>>
>>>>>>>> Did you find any cases where the mutex/monitor was being used
>>>>>>>> inconsistently and incorrectly?
>>>>>>>>
>>>>>>>> Did you analyse the "sometimes" cases to see if they were safe?
>>>>>>>> (Aside: just for fun check out what happens if you lock the
>>>>>>>> Threads_lock with a safepoint check and a safepoint has been
>>>>>>>> requested
>>>>>>>> :) ).
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> David
>>>>>>>>
>>>>>>>> On 16/10/2014 4:04 AM, Max Ockner wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I am a new member of the Hotspot runtime team in Burlington, MA.
>>>>>>>>> Please review my first fix related to safepoint checking.
>>>>>>>>>
>>>>>>>>> Summary: MutexLockerEx can either acquire a lock with or 
>>>>>>>>> without a
>>>>>>>>> safepoint check.
>>>>>>>>> In some cases, a particular lock must either safepoint check
>>>>>>>>> always or
>>>>>>>>> never to avoid deadlocking.
>>>>>>>>> Some other locks have semantics which allow them to avoid 
>>>>>>>>> deadlocks
>>>>>>>>> despite having a safepoint check only some of the time.
>>>>>>>>> All locks that are OK having inconsistent safepoint checks 
>>>>>>>>> have been
>>>>>>>>> marked. All locks that should never safepoint check and all locks
>>>>>>>>> that
>>>>>>>>> should always safepoint check have also been marked.
>>>>>>>>> When a MutexLockerEx acquires a lock with or without a safepoint
>>>>>>>>> check,
>>>>>>>>> the lock's safepointAllowed marker is checked to ensure 
>>>>>>>>> consistent
>>>>>>>>> safepoint checking.
>>>>>>>>>
>>>>>>>>> Webrev: http://oklahoma.us.oracle.com/~mockner/webrev/8047290/
>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8047290
>>>>>>>>>
>>>>>>>>> Tested with:
>>>>>>>>>      jprt "-testset hotspot"
>>>>>>>>>      jtreg hotspot
>>>>>>>>>      vm.quick.testlist
>>>>>>>>>
>>>>>>>>> Whitebox tests:
>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java: 
>>>>>>>>> Test
>>>>>>>>> expects Assert ("This lock should always have a safepoint check")
>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java: 
>>>>>>>>> Test
>>>>>>>>> expects Assert ("This lock should never have a safepoint check")
>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java: 
>>>>>>>>> code
>>>>>>>>> should not assert. (Lock is properly acquired with no safepoint
>>>>>>>>> check)
>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java: 
>>>>>>>>> code
>>>>>>>>> should not assert. (Lock is properly acquired with safepoint 
>>>>>>>>> check)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Max
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>



More information about the hotspot-dev mailing list