RFR (XS) 8047212: fix race between ObjectMonitor alloc and verification code
David Holmes
david.holmes at oracle.com
Wed Oct 21 03:24:01 UTC 2015
On 21/10/2015 12:51 PM, Daniel D. Daugherty wrote:
> Thanks for the quick review!
>
>
> On 10/20/15, 8:34 PM, David Holmes wrote:
>> On 21/10/2015 10:43 AM, serguei.spitsyn at oracle.com wrote:
>>> Dan,
>>>
>>> Nice finding!
>>>
>>> The fixes look good to me.
>>> The changes in the oops_do() and deflate_idle_monitors() are not
>>> necessary
>>> as these functions are called in VM_Operation's.
>>> But some unification with OrderAccess::load_ptr_acquire() does not harm
>>> either.
>>
>> I was thinking on the same lines: that reading the variable at a
>> safepoint should already be safe, but that consistently using
>> load_acquire_ptr is harmless and stylistically a 'Good Thing".
>
> Good. We're on the same page here...
>
>> The other thought I had was perhaps making the verification routines
>> (which are non-product) use the lock as well. :)
>
> If the verification routine had used the lock, then we wouldn't
> have spotted this unsafe use of the lock-free algorithm in this
> context. The lock-free calls to ObjectSynchronizer::monitors_iterate()
> would simply miss everything other than the new ObjectMonitor block
> on occasion; not an easy thing to diagnose unless you are looking
> for something specific on the list that you know should be there...
AFAICS monitors_iterate() is either called at a safepoint or else with
the lock held. Exactly which call to monitors_iterate exposed this problem?
>
>> Aside: did you check that all the other global monitor variables are
>> accessed correctly - ie either always with the lock held or using
>> OrderAccess operations if lock-free?
>
> Nothing more than a cursory look. I have a deflate_idle_monitors bug
> on my plate also so I'll be looking at this code even more...
Ok.
Thanks,
David
>
>> I have still have the query about the vmStructs changes but otherwise
>> thumbs up.
>
> My reply to that query crossed in the ether with this review...
>
> Dan
>
>
>>
>> Thanks,
>> David
>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 10/20/15 14:15, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I've updated the fix based on feedback from Carsten V and David H.
>>>>
>>>> Webrev URL:
>>>> http://cr.openjdk.java.net/~dcubed/8047212-webrev/1-jdk9-hs-rt/
>>>>
>>>> Changes relative to round 0:
>>>>
>>>> - only src/share/vm/runtime/synchronizer.cpp has changed
>>>> - reads of gBlockList now use OrderAccess::load_ptr_acquire()
>>>>
>>>> code style cleanups:
>>>>
>>>> - only cleaned up the functions that I touched to make the
>>>> OrderAccess::load_ptr_acquire() changes
>>>> - changed implied booleans into real boolean expressions
>>>> - moved some locals to narrower context
>>>> - added/removed some blank lines
>>>> - made casts consistent with the majority style in this file
>>>>
>>>> I'm repeating all of the same testing that I did for round 0. The
>>>> round 1 bits have not yet made it through JPRT-west, but the jobs
>>>> are mostly done.
>>>>
>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>
>>>> Dan
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 10/19/15, 9:02 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I have a fix for a long standing race between the lock-free
>>>>> ObjectMonitor
>>>>> verification code and the normal (locked) ObjectMonitor block
>>>>> allocation
>>>>> code path. For this fix, I would like at least a Runtime team reviewer
>>>>> and a Serviceability team reviewer. Thanks!
>>>>>
>>>>> JDK-8047212
>>>>> runtime/ParallelClassLoading/bootstrap/random/inner-complex
>>>>> assert(ObjectSynchronizer::verify_objmon_isinpool(inf)) failed:
>>>>> monitor is invalid
>>>>> https://bugs.openjdk.java.net/browse/JDK-8047212
>>>>>
>>>>> Webrev URL:
>>>>> http://cr.openjdk.java.net/~dcubed/8047212-webrev/0-jdk9-hs-rt/
>>>>>
>>>>> Testing: Aurora Adhoc RT-SVC nightly batch
>>>>> 4 inner-complex fastdebug parallel runs for 4+ days and
>>>>> 600K iterations without seeing this failure; the experiment
>>>>> is still running; final results to be reported at the end
>>>>> of the review cycle
>>>>> JPRT -testset hotspot
>>>>>
>>>>> This fix:
>>>>>
>>>>> - makes ObjectMonitor::gBlockList volatile
>>>>> - uses "OrderAccess::release_store_ptr(&gBlockList, temp)" to
>>>>> make sure the new block updates _happen before_ gBlockList is
>>>>> changed to refer to the new block
>>>>> - add SA support for a "static pointer volatile" field like:
>>>>>
>>>>> static ObjectMonitor * volatile gBlockList;
>>>>>
>>>>> See the following link for a nice description of what "volatile"
>>>>> means in the different positions on a variable/parameter decl line:
>>>>>
>>>>> http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword
>>>>>
>>>>>
>>>>>
>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>
More information about the hotspot-runtime-dev
mailing list