RFR: 8168914: Crash in ClassLoaderData/JNIHandleBlock::oops_do during concurrent marking
Erik Helin
erik.helin at oracle.com
Mon Feb 20 15:48:13 UTC 2017
On 02/16/2017 07:14 PM, coleen.phillimore at oracle.com wrote:
> On 2/16/17 9:13 AM, Erik Helin wrote:
>> On 02/15/2017 10:08 PM, coleen.phillimore at oracle.com wrote:
>>> Hi Erik,
>>
>> Hey Coleen, thanks for reviewing! Please see new webrevs at:
>> - incremental: http://cr.openjdk.java.net/~ehelin/8168914/01-02/
>> - full: http://cr.openjdk.java.net/~ehelin/8168914/02/
>>
>>> I thought you were going to hide the declaration of class
>>> ChunkedHandleList in classLoaderData.cpp since it is only used
>>> internally?
>>>
>>> And have a field in ClassLoaderData: ChunkedHandleList*
>>> _chunked_handle_list;
>>
>> I didn't want to have a pointer to ChunkedHandleList just because I
>> want to "hide" the implementation, I prefer to embed an instance (even
>> thought that means I have to declare the struct/class in the .hpp file).
>
> My preference is different but this is fine.
>>
>>> The class name "Chunk" is so overused!
>>
>> Haha, I agree that it is quite commonly used :) Fortunately there is
>> some context here: Chunk is an inner class/struct to
>> ChunkedHandleList. That hopefully gives the reader enough insight to
>> distinguish this Chunk from other Chunks in the VM.
>>
>>> Here ChunkedHandleList should not be a subtype of CHeapObj<mtClass>.
>>> It's a ValueObj.
>>
>> Agree, this is a left-over from an earlier state of the patch. Thanks
>> for noticing!
>>
>>> I don't think it should be a struct either because even though it's
>>> internal to ClassLoaderData, the fields could still be accessed outside
>>> the class.
>>
>> Agree, I cleaned that up a bit. Thanks.
>>
>>> Why is
>>>
>>> 629 void ClassLoaderData::remove_handle_unsafe(jobject h) {
>>>
>>>
>>> it called unsafe? It is safe to zero the element in the block, isn't
>>> it?
>>
>> Because code using remove_handle_unsafe should be cautious. It is only
>> safe to write NULL to the *((oop*) op) if:
>> - the jobject originates from a call to ClassLoaderData::add_handle
>> (the assert verifies this).
>> - *op is the sole remaining pointer to **op (the object).
>> - The native memory backing the chunked linked list is not freed.
>>
>> It is unfortunate that we even need to have the remove_handle_unsafe
>> at all. An alternative is to let the only code using
>> remove_handle_unsafe (ModuleEntry) do *((oop*) op) = NULL on its own.
>>
>> Would you prefer that?
>>
>
> I answered this in my last mail.
>
>>> One other reason to make it internal is that the CAPACITY of 64 is way
>>> too big for the ClassLoaderData of anonymous classes which I think will
>>> only have presently 1 (or 2 maybe) oops in it. And there are zillions
>>> of them.
>>
>> Oops, thanks for catching this. I intended to use the same capacity as
>> JNIHandleBlock, which uses 32 (don't recall where that 64 came from).
>> As for using a smaller size, that seems to be a separate patch? For
>> now I would prefer to keep the characteristics the same as for
>> JNIHandleBlock.
>
> If the whole ChunkHandleList class was internal, then you could pick 10
> or something. I guess we'll see if anyone notices footprint and fix it
> then. JNIHandleBlock was 32 so this won't increase footprint.
I don't follow, ChunkedHandleList is private inner class to
ClassLoaderData. Why is this less internal than having the declaration
in the .cpp file? That ChunkedHandleList is declared in the
classLoaderData.hpp shouldn't make it harder to change the field
CAPACITY to 10. Or am I missing something?
> Thanks - the new webrev looks good.
Thanks. I uploaded a new version (see new email).
Thanks,
Erik
> Coleen
>>
>>
>> Thanks,
>> Erik
>>
>>> Thanks,
>>> Coleen
>>>
>>>
>>> On 2/14/17 11:40 AM, Erik Helin wrote:
>>>> Hi all,
>>>>
>>>> this patch aims to solve the bug [0] by making it safe for a GC to
>>>> concurrently traverse the oops in a ClassLoaderData.
>>>>
>>>> The problem right now is that ClassLoaderData::_handles is a
>>>> JNIHandleBlock. It is not safe for one thread to iterate over the oops
>>>> in a JNIHandleBlock while another thread concurrently adds a new oop
>>>> to the JNIHandleBlock.
>>>>
>>>> My proposed solution is to replace JNIHandleBlock with another data
>>>> structure for ClassLoaderData. ClassLoaderData only needs a place to
>>>> store oops that a GC can iterate over, it has no use for the rest of
>>>> the methods in the JNIHandleBlock class. I have implemented a minimal
>>>> chunked linked list data structure (internal to ClassLoaderData) with
>>>> only two public methods:
>>>> - add (only executed by one thread at a time)
>>>> - oops_do (can be executed by any number of threads, possibly
>>>> concurrently with a call to `add`)
>>>>
>>>> ChunkedHandleList uses load_acquire and release_store to synchronize
>>>> access to the underlying chunks (arrays). Since ChunkedHandleList
>>>> utilizes the (rather) minimal requirements of its user
>>>> (ClassLoaderData), I decided to keep the class internal to
>>>> ClassLoaderData for now. If other find it useful elsewhere, the we can
>>>> try to create a more generalized version (this is trickier than it
>>>> appears at first sight, I tried ;)).
>>>>
>>>> I also changed ClassLoaderData::remove_handle to write NULL to the
>>>> oop* (that is represented by a jobject), instead of using a sentinel
>>>> oop as done by JNIHandleBlock. The GC closures should be prepared to
>>>> handle a field in a Java object becoming NULL, so this should be fine.
>>>>
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8168914
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~ehelin/8168914/00/
>>>>
>>>> Testing:
>>>> - Tier 1 (aka JPRT), 2, 3, 4, 5.
>>>>
>>>> I would appreciate quite a few reviews on this patch, it contains a
>>>> nice mixture of:
>>>> - me venturing into the runtime code :)
>>>> - lock free code
>>>> - unreproducible bug (even though I'm sure of the root cause)
>>>>
>>>> Thanks for everyone participating in the discussions around this bug
>>>> and potential solutions: Volker, Coleen, Mikael G, StefanK, Erik Ö and
>>>> Jiangli.
>>>>
>>>> Thanks!
>>>> Erik
>>>>
>>>> [0]: https://bugs.openjdk.java.net/browse/JDK-8168914
>>>
>
More information about the hotspot-dev
mailing list