[Fwd: Re: Request for Review (XS): 6704010: Internal Error(src/share/vm/interpreter/interpreterRuntime.cpp:1106)]
Dmitry Samersoff
Dmitry.Samersoff at oracle.com
Wed Dec 1 22:44:52 PST 2010
Coleen,
May be:
#ifdef ASSERT
...
{
MutexLocker mu(SignatureHandlerLibrary_lock);
a = _handlers->find(method->signature_handler());
b = _fingerprints->find(Fingerprinter(method).fingerprint());
}
assert(
method->signature_handler() == Interpreter::slow_signature_handler() ||
a == b, "sanity check");
#endif
-Dmitry
On 2010-12-02 06:47, Coleen Phillimore wrote:
> On 12/1/2010 9:46 PM, David Holmes wrote:
>> Moving the assertion into a locked region may potentially cause a
>> deadlock if the assertion fails and we go to fatal error handling.
>> We'd need to check that the lock in question is not held in a nested
>> fashion with another lock that might be needed in the fatal error path.
>>
>> That said there are already potential deadlocks in fatal error
>> handling with some debug/tracing options, so maybe that's a better
>> trade-off than getting the false failure.
> Good point. You could move it so you lock while getting the boolean
> condition and then asserting out of the lock scope.
>
> Coleen (still happy this bug was diagnosed)
>>
>> Just a thought.
>>
>> David Holmes
>>
>> Coleen Phillimore said the following on 12/02/10 09:08:
>>> Volker,
>>> This is great. This was on my bug list. It looks good to me and I'll
>>> be happy to push it for you.
>>> Thanks!
>>> Coleen
>>>
>>> Vladimir Kozlov wrote:
>>>> Sending to runtime alias since it is RT bug.
>>>>
>>>> -------- Original Message --------
>>>> Subject: Re: Request for Review (XS): 6704010: Internal Error
>>>> (src/share/vm/interpreter/interpreterRuntime.cpp:1106)
>>>> Date: Wed, 1 Dec 2010 19:44:40 +0100
>>>> From: Volker Simonis <volker.simonis at gmail.com>
>>>> To: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
>>>>
>>>> http://www.progdoc.de/webrev/6704010/
>>>>
>>>> This problem occurs very rarely and it is therefore hardly possible to
>>>> reliably reproduce it. However after looking at it for a while I think
>>>> I found the cause of the problem:
>>>>
>>>> The SignatureHandlerLibrary class uses two static GrowableArrays,
>>>> '_handlers' and '_fingerprints', to store method fingerprints along
>>>> with their corresponding signature handlers. But GrowableArrays are
>>>> are NOT synchronized in any way if accessed from multiple threads. To
>>>> avoid races it is therefore necessary to synchronize different threads
>>>> which access the same GrowableArray. This is done for most of the code
>>>> in 'SignatureHandlerLibrary::add()' which is protected by 'MutexLocker
>>>> mu(SignatureHandlerLibrary_lock)'.
>>>>
>>>> However the assertion at the end of the method is outside the scope of
>>>> this MutexLocker which can lead to an illegal view on the
>>>> corresponding GrowableArray data structures. Here'S what may happen in
>>>> detail:
>>>>
>>>> -thread one enters the section protected by the MutexLocker and adds
>>>> another fingerprint/handler to the growable arrays by calling
>>>> 'GrowableArray::append()'. This can lead to an implicit call to
>>>> 'GrowableArray::grow()' if the arrays size exhausted. In the 'grow()'
>>>> method a new data array will be allocated, the contents of the old
>>>> array will be copied into the new one and then (1) the storage
>>>> occupied by the old array will be freed and (2) the address of the new
>>>> storage will be assigned to the internal data pointer. Between (1) and
>>>> (2) there's a small (like once every six month or so:) chance that the
>>>> thread will be unscheduled and another thread is allocating the memory
>>>> which has just been freed.
>>>>
>>>> - thread two is just in the assertion code and trying to query the
>>>> GrowableArrays '_handlers' and '_fingerprints' by calling
>>>> 'GrowableArray::find()'. If this happens right in the interval
>>>> between the points (1) and (2) described above, the 'find()' method
>>>> may return incorrect results which can spuriously trigger the
>>>> assertion.
>>>>
>>>> Fixing this can be easily done by enclosing the assertion into a
>>>> region which is protected by the same Mutex like the main body of
>>>> 'SignatureHandlerLibrary::add()'.
>>>>
>>>> As I wrote, I couldn't reproduce this in the HostSpot, so no
>>>> regression test. I was however able to reproduce the described
>>>> scenario with a standalone GrowableArray class and a small test
>>>> program consisting of one writer thread which appends a constant value
>>>> to a GrowableArray and ten reader threads which are constantly trying
>>>> to find that special value in the array on a dual core x86_64 Linux
>>>> box.
>>>>
>>>> As a last side note, the problem could also be made even more
>>>> unlikely, by changing the implementation of 'GrowableArray::grow()'
>>>> from:
>>>>
>>>> template<class E> void GrowableArray<E>::grow(int j) {
>>>> int old_max = _max;
>>>> ...
>>>> if (on_C_heap() && _data != NULL) {
>>>> FreeHeap(_data);
>>>> }
>>>> _data = newData;
>>>> }
>>>>
>>>> to:
>>>>
>>>> template<class E> void GrowableArray<E>::grow(int j) {
>>>> int old_max = _max;
>>>> ...
>>>> _old_data = _data;
>>>> _data = newData;
>>>> if (on_C_heap() && _old_data != NULL) {
>>>> FreeHeap(_old_data);
>>>> }
>>>> }
>>>>
>>>> On the other hand that's still not 100% safe ('_data' is not volatile,
>>>> other memory models, ..) and I think GrowableArray is definitely not
>>>> intended for parallel access and therefore such a change may
>>>> unnecessarily hide other problems where this precondition is violated.
>>>> So maybe nulling out the '_data' array before freeing it may be a good
>>>> idea in the debug build to catch such errors!
>>>>
>>>> Regards,
>>>> Volker
>
--
Dmitry Samersoff
J2SE Sustaining team, SPB04
* Give Rabbit time and he'll always get the answer ...
More information about the hotspot-runtime-dev
mailing list