RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

Coleen Phillimore coleen.phillimore at oracle.com
Mon Jun 27 13:01:39 UTC 2016


http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.00/src/share/vm/memory/universe.cpp.udiff.html

Do you need load_aquires and store_releases on 
set_reference_pending_list and has_reference_pending_list ?

Or rather, why do you need an atomic operation in swap_pending_list if 
the Heap_lock is held?

Aha, you answered the question in the header file.

http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.00/src/share/vm/oops/method.cpp.udiff.html

I think this particular special case hack could have been taken out by 
permgen elimination, but I was too nervous about it!

http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.00/src/share/vm/prims/jvm.cpp.udiff.html

Can you change the boolean "wait" to "should_wait" ?

http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.00/src/share/vm/runtime/thread.cpp.udiff.html

This change in initialization order makes sense to me.  We really need 
to preallocate OOM before initializing the system libraries. Since 
Throwable() doesn't check a property now, this is safe.  I don't think 
this wasn't safe a few months ago.  Throwable constructor should do as 
little as possible.

Thank you for removing the stale comment.

Is the test that you're going to add from the other bug fix? ie. 
OomDebugTest.java (can that have a @bug 8156500 entry in it?

So many special cases removed!  This is so nice.

http://cr.openjdk.java.net/~kbarrett/8156500/jdk.00/src/java.base/share/classes/java/lang/ref/Reference.java.udiff.html

Can you add to the popReferencePendingLIst comment that the pending list 
is managed by the GC, ie. gc enqueues and dequeues references to 
process?  Or something like that?

I'm glad you went with the simpler implementation of only popping one 
reference.  I think it can be enhanced in the future to get the whole 
list but it's late in the release to change the behavior and you might 
come up with something even better in the next release. I've learned so 
much about reference processing with this change. Thank you for making 
it make more sense.

This is a really good change.

Coleen


On 6/24/16 4:09 PM, Kim Barrett wrote:
> Please review this change which moves the Reference pending list and
> locking from the java.lang.ref.Reference class into the VM.  This
> includes elimination of the ReferencePendingListLocker mechanism in
> the VM. This fixes various fragility issues arising from having the
> management of this list split between Java and the VM (GC).
>
> This change addresses JDK-8156500 by eliminating the possibility of
> suspension while holding the lock. Because the locking is now done in
> the VM, we have full control over where suspension may occur.
>
> This change retains the existing interface between the reference
> processor and the nio.Bits package, e.g. Reference.tryHandlePending
> has the same signature and behavior; this change just pushes the
> pending list manipulation down into the VM.  There are open bugs
> reports, enhancement requests, and discussions related to that
> interface; see below. This change does not attempt to address them.
>
> This change additionally addresses or renders moot
> https://bugs.openjdk.java.net/browse/JDK-8055232
> (ref) Exceptions while processing Reference pending list
>
> and
> https://bugs.openjdk.java.net/browse/JDK-7103238
> Ensure pending list lock is held on behalf of GC before enqueuing references on to the pending list
>
> It is also relevant for followup discussion of
> https://bugs.openjdk.java.net/browse/JDK-8022321
> java/lang/ref/OOMEInReferenceHandler.java fails immediately
>
> and
> https://bugs.openjdk.java.net/browse/JDK-8149925
> We don't need jdk.internal.ref.Cleaner any more - part 1
>
> and has implications for:
> https://bugs.openjdk.java.net/browse/JDK-8066859
> java/lang/ref/OOMEInReferenceHandler.java failed with java.lang.Exception: Reference Handler thread died
>
> In addition to the obviously relevant changes, there are a couple of
> changes whose presence here might seem surprising and require further
> explanation.
>
> - Removal of a stale comment in create_vm, noticed because it was near
> some code being removed as part of this change.  The comment was
> rendered obsolete by JDK-8028275.
>
> - Moved initialization of exception classes earlier in
> initialize_java_lang_classes. The previous order only worked by
> accident, at least for OOME.  During the bulk of the library
> initialization, OOME may be thrown, perhaps due to poorly chosen
> command line options.  That attempts to use one of the pre-allocated
> OOME objects, and tries to fill in its stack trace.  If the Throwable
> class is not yet initialized, this leads to a failed assert in the VM
> because Throwable.UNASSIGNED_STACK still has a null value.  This
> initialization order issue was being masked by the old pending list
> implementation; the initialization of Reference ensured
> InterruptedException was initialized (thereby initializing its
> Throwable base class), and the initialization of Reference just
> happened to occur early enough that Throwable was initialized by the
> time it was needed when running certain tests.  The forced
> initialization of InterruptedException is no longer needed by
> Reference, but removal of it triggered test failures (and could
> trigger corresponding product failures) because of this ordering
> issue.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8156500
>
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8156500/jdk.00/
> http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.00/
>
> Testing:
> RBT ad hoc nightly runtime & gc tests.
>
> With the proposed patch for JDK-8153711 applied, locally ran nearly
> 35K iterations of the OomDebugTest that is part of that patch, with no
> failures / deadlocks.  Without this change it would typically only take
> on the order of 100 iterations to hit a deadlock failure.
>
> Locally ran direct byte buffer allocation test:
> jdk/test/java/nio/Buffer/DirectBufferAllocTest.java
> This change reported 3% faster allocation times, and 1/2 the standard
> deviation for allocation times.
>
> Locally ran failing test from JDK-8022321 and JDK-8066859 a few
> thousand times.
> jdk/test/java/lang/ref/OOMEInReferenceHandler.java
> No errors, but failures were noted as hard to reproduce.
>



More information about the hotspot-dev mailing list