RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

Per Liden per.liden at oracle.com
Tue Jun 28 09:33:50 UTC 2016


Hi Kim,

On 2016-06-24 22:09, Kim Barrett wrote:
> Please review this change which moves the Reference pending list and
> locking from the java.lang.ref.Reference class into the VM.  This
> includes elimination of the ReferencePendingListLocker mechanism in
> the VM. This fixes various fragility issues arising from having the
> management of this list split between Java and the VM (GC).
>
> This change addresses JDK-8156500 by eliminating the possibility of
> suspension while holding the lock. Because the locking is now done in
> the VM, we have full control over where suspension may occur.
>
> This change retains the existing interface between the reference
> processor and the nio.Bits package, e.g. Reference.tryHandlePending
> has the same signature and behavior; this change just pushes the
> pending list manipulation down into the VM.  There are open bugs
> reports, enhancement requests, and discussions related to that
> interface; see below. This change does not attempt to address them.
>
> This change additionally addresses or renders moot
> https://bugs.openjdk.java.net/browse/JDK-8055232
> (ref) Exceptions while processing Reference pending list
>
> and
> https://bugs.openjdk.java.net/browse/JDK-7103238
> Ensure pending list lock is held on behalf of GC before enqueuing references on to the pending list
>
> It is also relevant for followup discussion of
> https://bugs.openjdk.java.net/browse/JDK-8022321
> java/lang/ref/OOMEInReferenceHandler.java fails immediately
>
> and
> https://bugs.openjdk.java.net/browse/JDK-8149925
> We don't need jdk.internal.ref.Cleaner any more - part 1
>
> and has implications for:
> https://bugs.openjdk.java.net/browse/JDK-8066859
> java/lang/ref/OOMEInReferenceHandler.java failed with java.lang.Exception: Reference Handler thread died
>
> In addition to the obviously relevant changes, there are a couple of
> changes whose presence here might seem surprising and require further
> explanation.
>
> - Removal of a stale comment in create_vm, noticed because it was near
> some code being removed as part of this change.  The comment was
> rendered obsolete by JDK-8028275.
>
> - Moved initialization of exception classes earlier in
> initialize_java_lang_classes. The previous order only worked by
> accident, at least for OOME.  During the bulk of the library
> initialization, OOME may be thrown, perhaps due to poorly chosen
> command line options.  That attempts to use one of the pre-allocated
> OOME objects, and tries to fill in its stack trace.  If the Throwable
> class is not yet initialized, this leads to a failed assert in the VM
> because Throwable.UNASSIGNED_STACK still has a null value.  This
> initialization order issue was being masked by the old pending list
> implementation; the initialization of Reference ensured
> InterruptedException was initialized (thereby initializing its
> Throwable base class), and the initialization of Reference just
> happened to occur early enough that Throwable was initialized by the
> time it was needed when running certain tests.  The forced
> initialization of InterruptedException is no longer needed by
> Reference, but removal of it triggered test failures (and could
> trigger corresponding product failures) because of this ordering
> issue.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8156500

Patch looks good. The only thing I don't feel qualified to review is the 
initialization order change in thread.cpp, so I'll let others comments 
on that.

I like the pop-one-reference-at-a-time semantics, which simplifies 
things a lot and keeps the interface nice and clean. I was previously 
afraid that it might cause a noticeable performance degradation compared 
to lifting the whole list into Java in one go, but your testing seem to 
prove that's not the case.

cheers,
Per

>
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8156500/jdk.00/
> http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.00/
>
> Testing:
> RBT ad hoc nightly runtime & gc tests.
>
> With the proposed patch for JDK-8153711 applied, locally ran nearly
> 35K iterations of the OomDebugTest that is part of that patch, with no
> failures / deadlocks.  Without this change it would typically only take
> on the order of 100 iterations to hit a deadlock failure.
>
> Locally ran direct byte buffer allocation test:
> jdk/test/java/nio/Buffer/DirectBufferAllocTest.java
> This change reported 3% faster allocation times, and 1/2 the standard
> deviation for allocation times.
>
> Locally ran failing test from JDK-8022321 and JDK-8066859 a few
> thousand times.
> jdk/test/java/lang/ref/OOMEInReferenceHandler.java
> No errors, but failures were noted as hard to reproduce.
>


More information about the hotspot-dev mailing list