RFR (XS) 8047212: fix race between ObjectMonitor alloc and verification code

serguei.spitsyn at oracle.com serguei.spitsyn at oracle.com
Wed Oct 21 05:14:37 UTC 2015


On 10/20/15 20:33, David Holmes wrote:
> On 21/10/2015 1:30 PM, Daniel D. Daugherty wrote:
>> On 10/20/15, 9:24 PM, David Holmes wrote:
>>> On 21/10/2015 12:51 PM, Daniel D. Daugherty wrote:
>>>> Thanks for the quick review!
>>>>
>>>>
>>>> On 10/20/15, 8:34 PM, David Holmes wrote:
>>>>> On 21/10/2015 10:43 AM, serguei.spitsyn at oracle.com wrote:
>>>>>> Dan,
>>>>>>
>>>>>> Nice finding!
>>>>>>
>>>>>> The fixes look good to me.
>>>>>> The changes in the oops_do() and deflate_idle_monitors() are not
>>>>>> necessary
>>>>>> as these functions are called in VM_Operation's.
>>>>>> But some unification with OrderAccess::load_ptr_acquire() does not
>>>>>> harm
>>>>>> either.
>>>>>
>>>>> I was thinking on the same lines: that reading the variable at a
>>>>> safepoint should already be safe, but that consistently using
>>>>> load_acquire_ptr is harmless and stylistically a 'Good Thing".
>>>>
>>>> Good. We're on the same page here...
>>>>
>>>>> The other thought I had was perhaps making the verification routines
>>>>> (which are non-product) use the lock as well. :)
>>>>
>>>> If the verification routine had used the lock, then we wouldn't
>>>> have spotted this unsafe use of the lock-free algorithm in this
>>>> context. The lock-free calls to ObjectSynchronizer::monitors_iterate()
>>>> would simply miss everything other than the new ObjectMonitor block
>>>> on occasion; not an easy thing to diagnose unless you are looking
>>>> for something specific on the list that you know should be there...
>>>
>>> AFAICS monitors_iterate() is either called at a safepoint or else with
>>> the lock held. Exactly which call to monitors_iterate exposed this
>>> problem?
>>
>> I could swear I ran across a monitors_iterate() call from
>> JVM/TI that didn't go to a safepoint because the target
>> thread was suspended... I have a vague memory that JVM/TI
>> uses a safepoint-op when the target thread is not suspended
>> and goes direct when the target thread is suspended...
>
> My bad - you are right, it skips the safepoint when the target is 
> suspended.

It is even more interesting. :)

The monitors_iterate() is called in the JvmtiEnvBase::get_owned_monitors().
The JvmtiEnvBase::get_owned_monitors() can be called directly from the
JvmtiEnv::GetOwnedMonitorInfo() or JvmtiEnv::GetOwnedMonitorStackDepthInfo()
if the calling thread is current thread.
Otherwise, it is called at a safepoint in the VM_GetOwnedMonitorInfo 
operation.
The calling thread being current thread is considered as a fully 
suspended thread.

Thanks,
Serguei

>
> David
>
>> Dan
>>
>>
>>>
>>>>
>>>>> Aside: did you check that all the other global monitor variables are
>>>>> accessed correctly - ie either always with the lock held or using
>>>>> OrderAccess operations if lock-free?
>>>>
>>>> Nothing more than a cursory look. I have a deflate_idle_monitors bug
>>>> on my plate also so I'll be looking at this code even more...
>>>
>>> Ok.
>>>
>>> Thanks,
>>> David
>>>
>>>>
>>>>> I have still have the query about the vmStructs changes but otherwise
>>>>> thumbs up.
>>>>
>>>> My reply to that query crossed in the ether with this review...
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 10/20/15 14:15, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I've updated the fix based on feedback from Carsten V and David H.
>>>>>>>
>>>>>>> Webrev URL:
>>>>>>> http://cr.openjdk.java.net/~dcubed/8047212-webrev/1-jdk9-hs-rt/
>>>>>>>
>>>>>>> Changes relative to round 0:
>>>>>>>
>>>>>>> - only src/share/vm/runtime/synchronizer.cpp has changed
>>>>>>> - reads of gBlockList now use OrderAccess::load_ptr_acquire()
>>>>>>>
>>>>>>> code style cleanups:
>>>>>>>
>>>>>>> - only cleaned up the functions that I touched to make the
>>>>>>>   OrderAccess::load_ptr_acquire() changes
>>>>>>> - changed implied booleans into real boolean expressions
>>>>>>> - moved some locals to narrower context
>>>>>>> - added/removed some blank lines
>>>>>>> - made casts consistent with the majority style in this file
>>>>>>>
>>>>>>> I'm repeating all of the same testing that I did for round 0. The
>>>>>>> round 1 bits have not yet made it through JPRT-west, but the jobs
>>>>>>> are mostly done.
>>>>>>>
>>>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 10/19/15, 9:02 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> I have a fix for a long standing race between the lock-free
>>>>>>>> ObjectMonitor
>>>>>>>> verification code and the normal (locked) ObjectMonitor block
>>>>>>>> allocation
>>>>>>>> code path. For this fix, I would like at least a Runtime team
>>>>>>>> reviewer
>>>>>>>> and a Serviceability team reviewer. Thanks!
>>>>>>>>
>>>>>>>> JDK-8047212
>>>>>>>> runtime/ParallelClassLoading/bootstrap/random/inner-complex
>>>>>>>> assert(ObjectSynchronizer::verify_objmon_isinpool(inf)) failed:
>>>>>>>>             monitor is invalid
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8047212
>>>>>>>>
>>>>>>>> Webrev URL:
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8047212-webrev/0-jdk9-hs-rt/
>>>>>>>>
>>>>>>>> Testing: Aurora Adhoc RT-SVC nightly batch
>>>>>>>>          4 inner-complex fastdebug parallel runs for 4+ days and
>>>>>>>>            600K iterations without seeing this failure; the
>>>>>>>> experiment
>>>>>>>>            is still running; final results to be reported at 
>>>>>>>> the end
>>>>>>>>            of the review cycle
>>>>>>>>          JPRT -testset hotspot
>>>>>>>>
>>>>>>>> This fix:
>>>>>>>>
>>>>>>>> - makes ObjectMonitor::gBlockList volatile
>>>>>>>> - uses "OrderAccess::release_store_ptr(&gBlockList, temp)" to
>>>>>>>>   make sure the new block updates _happen before_ gBlockList is
>>>>>>>>   changed to refer to the new block
>>>>>>>> - add SA support for a "static pointer volatile" field like:
>>>>>>>>
>>>>>>>>     static ObjectMonitor * volatile gBlockList;
>>>>>>>>
>>>>>>>> See the following link for a nice description of what "volatile"
>>>>>>>> means in the different positions on a variable/parameter decl 
>>>>>>>> line:
>>>>>>>>
>>>>>>>> http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword 
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20151020/b3c47b78/attachment.html>


More information about the serviceability-dev mailing list