RFR: 8168914: Crash in ClassLoaderData/JNIHandleBlock::oops_do during concurrent marking

David Holmes david.holmes at oracle.com
Wed Feb 15 01:48:56 UTC 2017


Hi Erik,

On 15/02/2017 2:40 AM, Erik Helin wrote:
> Hi all,
>
> this patch aims to solve the bug [0] by making it safe for a GC to
> concurrently traverse the oops in a ClassLoaderData.
>
> The problem right now is that ClassLoaderData::_handles is a
> JNIHandleBlock. It is not safe for one thread to iterate over the oops
> in a JNIHandleBlock while another thread concurrently adds a new oop to
> the JNIHandleBlock.
>
> My proposed solution is to replace JNIHandleBlock with another data
> structure for ClassLoaderData. ClassLoaderData only needs a place to
> store oops that a GC can iterate over, it has no use for the rest of the
> methods in the JNIHandleBlock class. I have implemented a minimal
> chunked linked list data structure (internal to ClassLoaderData) with
> only two public methods:
> - add (only executed by one thread at a time)
> - oops_do (can be executed by any number of threads, possibly
>            concurrently with a call to `add`)
>
> ChunkedHandleList uses load_acquire and release_store to synchronize
> access to the underlying chunks (arrays). Since ChunkedHandleList
> utilizes the (rather) minimal requirements of its user
> (ClassLoaderData), I decided to keep the class internal to
> ClassLoaderData for now. If other find it useful elsewhere, the we can
> try to create a more generalized version (this is trickier than it
> appears at first sight, I tried ;)).
>
> I also changed ClassLoaderData::remove_handle to write NULL to the oop*
> (that is represented by a jobject), instead of using a sentinel oop as
> done by JNIHandleBlock. The GC closures should be prepared to handle a
> field in a Java object becoming NULL, so this should be fine.
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8168914
>
> Webrev:
> http://cr.openjdk.java.net/~ehelin/8168914/00/

oops_do can miss an oop that is in the process of being added - is that 
a problem?

  140       if (c->data[i] != NULL) {
  141         f->do_oop(&c->data[i]);
  142       }

Given a slot can be nulled out concurrently at any time, is it worth 
does this NULL check? The OopClosure will have to do its own NULL check 
anyway.

  144     c = (Chunk*) OrderAccess::load_ptr_acquire((volatile 
intptr_t*)&c->next);

This doesn't need to be a load-acquire. You already loaded 'c' via 
load-acquire of _head (or chained therefrom) and its next field is set 
prior to the setting of the _head that you read.

  624 jobject ClassLoaderData::add_handle(Handle h) {
  625   MutexLockerEx ml(metaspace_lock(), 
Mutex::_no_safepoint_check_flag);
  626   return (jobject) _handles.add(h());
  627 }
  628
  629 void ClassLoaderData::remove_handle_unsafe(jobject h) {
  630   assert(_handles.contains((oop*) h), "Got unexpected handle " 
PTR_FORMAT, p2i((oop*) h));
  631   *((oop*) h) = NULL;
  632 }

I'm a bit unclear on the typing here. Previously we use a JNI Handle 
which was a jobject. But now we've simply stored an oop into a slot in 
an array. We pass the address of that slot back as a jobject, but is it 
really? I would also expect contains to take an oop not an oop* - it 
seems to be asking "is this an address of a slot in our 
ChunkedHandleList" rather than asking "is this oop in our 
ChunkedHandleList". ??

A few minor comments:

Copyrights need updating.

src/share/vm/classfile/classLoaderData.hpp

177     // Only on thread ...

Typo: on -> one

---

src/share/vm/classfile/moduleEntry.cpp

90   // Create a JNI handle for the shared ProtectionDomain and save it 
atomically.
  91   // If someone beats us setting the _pd cache, the created JNI 
handle is destroyed.

These are no longer JNI Handles.

Thanks,
David
-----

> Testing:
> - Tier 1 (aka JPRT), 2, 3, 4, 5.
>
> I would appreciate quite a few reviews on this patch, it contains a nice
> mixture of:
> - me venturing into the runtime code :)
> - lock free code
> - unreproducible bug (even though I'm sure of the root cause)
>
> Thanks for everyone participating in the discussions around this bug and
> potential solutions: Volker, Coleen, Mikael G, StefanK, Erik Ö and Jiangli.
>
> Thanks!
> Erik
>
> [0]: https://bugs.openjdk.java.net/browse/JDK-8168914


More information about the hotspot-dev mailing list