Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

Mandy Chung mandy.chung at oracle.com
Fri Dec 20 04:33:18 UTC 2013


Hi Srikalyan,

Maybe you can add an uncaught exception handler to see if you can get
any information.  I ran it 1000 times but was not able to reproduce
the failure.  Did you run it with jtreg (I didn't)?

Below is a patch that installs an uncaught exception handler on the
Reference Handler thread; you can take it and try it.

diff --git a/test/java/lang/ref/OOMEInReferenceHandler.java b/test/java/lang/ref/OOMEInReferenceHandler.java
--- a/test/java/lang/ref/OOMEInReferenceHandler.java
+++ b/test/java/lang/ref/OOMEInReferenceHandler.java
@@ -51,6 +51,14 @@
           return first;
       }

+     static class UEH implements Thread.UncaughtExceptionHandler {
+         public void uncaughtException(Thread t, Throwable e) {
+             System.err.println("ERROR: " + t.getName() + " exception " +
+                 e.getMessage());
+             e.printStackTrace();
+         }
+     }
+
       public static void main(String[] args) throws Exception {
           // preinitialize the InterruptedException class so that the reference handler
           // does not die due to OOME when loading the class if it is the first use
@@ -77,6 +85,8 @@
               throw new IllegalStateException("Couldn't find Reference Handler thread.");
           }

+         referenceHandlerThread.setUncaughtExceptionHandler(new UEH());
+
           ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
           Object referent = new Object();
           WeakReference<Object> weakRef = new WeakReference<>(referent, refQueue);
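
For anyone who wants to try the handler outside the test, here is a minimal standalone sketch of the same idea. It is not part of the patch: the class name UncaughtHandlerSketch, the runAndCapture helper, the lastSeen field and the "sketch-worker" thread are all hypothetical names made up for illustration. Unlike the patch, it prints the Throwable itself rather than only getMessage(), on the assumption that the exception type is useful when the message is null.

```java
import java.util.concurrent.atomic.AtomicReference;

public class UncaughtHandlerSketch {
    // Hypothetical helper state: remembers the last throwable the handler saw.
    static final AtomicReference<Throwable> lastSeen = new AtomicReference<>();

    // Same shape as the UEH class in the patch, plus recording the throwable.
    static class UEH implements Thread.UncaughtExceptionHandler {
        @Override
        public void uncaughtException(Thread t, Throwable e) {
            lastSeen.set(e);
            System.err.println("ERROR: " + t.getName() + " exception " + e);
            e.printStackTrace();
        }
    }

    // Runs the body on a fresh thread with the handler installed and
    // returns whatever escaped from it (null if nothing did).
    static Throwable runAndCapture(Runnable body) throws InterruptedException {
        lastSeen.set(null);
        Thread worker = new Thread(body, "sketch-worker");
        worker.setUncaughtExceptionHandler(new UEH());
        worker.start();
        worker.join(); // the handler runs on the worker before it terminates
        return lastSeen.get();
    }

    public static void main(String[] args) throws Exception {
        Throwable t = runAndCapture(() -> {
            throw new IllegalStateException("boom");
        });
        System.out.println("captured: " + t);
    }
}
```

In the test itself the handler is installed on the existing Reference Handler thread rather than on a new thread, as the second hunk of the patch shows.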

On 12/19/2013 6:57 PM, srikalyan chandrashekar wrote:
> Hi David, thanks for your comments. The unguarded part (clean and
> enqueue) in the Reference Handler thread does not seem to create any
> new objects, so it is the application (the test in this case) that is
> adding objects to the heap and causing the Reference Handler to die with
> OOME. I am still unsure about the side effects of the code change and
> agree with your thoughts (on the reliability of memory exhaustion tests).
>
> PS: hotspot dev alias removed from CC.
>
> -- 
> Thanks
> kalyan
>
> On 12/19/13 5:08 PM, David Holmes wrote:
>> Hi Kalyan,
>>
>> This is not a hotspot issue so I'm moving this to core-libs, please 
>> drop hotspot from any replies.
>>
>> On 20/12/2013 6:26 AM, srikalyan wrote:
>>> Hi all, I have been working on the bug JDK-8022321
>>> <https://bugs.openjdk.java.net/browse/JDK-8022321>. This is a sporadic
>>> failure, and the webrev is available here:
>>> http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/ 
>>>
>>
>> I'm really not sure what to make of this. We have a test that 
>> triggers an out-of-memory condition but the OOME can actually turn up 
>> in the ReferenceHandler thread causing it to terminate and the test 
>> to fail. We previously accounted for the non-obvious occurrences of 
>> OOME due to the Object.wait and the possible need to load the 
>> InterruptedException class - but still the OOME can appear where we 
>> don't want it. So finally you have just placed the whole for(;;) loop 
>> in a try/catch(OOME) that ignores the OOME. I'm certain that makes 
>> the test happy, but I'm not sure it is really what we want for the 
>> ReferenceHandler thread. If the OOME occurs while cleaning, or 
>> enqueuing then we will fail to clean and/or enqueue but there would 
>> be no indication that has occurred and I think that is a bigger 
>> problem than this test failing.
>>
>> There may be no way to make this test 100% reliable. In fact I'd 
>> suggest that no memory exhaustion test can be 100% reliable.
>>
>> David
>>
>>> *Root Cause: Still not known*
>>> There are 2 places where an OOME is possible:
>>> 1) Cleaner.clean()
>>> 2) ReferenceQueue.enqueue()
>>>
>>> 1) The cleanup code in turn has 2 places with the potential to
>>> throw OOME:
>>>      a) The thunk Runnable, which is run from the clean() method. This
>>> Runnable is passed to Cleaner and appears in the following classes:
>>>          java/nio/DirectByteBuffer.java
>>>          sun/misc/Perf.java
>>>          sun/nio/fs/NativeBuffer.java
>>>          sun/nio/ch/IOVecWrapper.java
>>>          sun/misc/Cleaner/ExitOnThrow.java
>>> However, none of the above implementations ever creates an
>>> object in its cleanup code.
>>>      b) The new PrivilegedAction created in the try/catch block of the
>>> clean() method. For this object to be created and held
>>> responsible for the OOME, an Exception (other than OOME) has to be
>>> thrown first.
>>>
>>> 2) No new heap objects are created in the enqueue method or anywhere
>>> in the deep call stack (VM.addFinalRefCount() etc.), so this cannot be
>>> a potential cause.
>>>
>>> *Experimental change to java.lang.ref.Reference.java*:
>>> - Put one more guard (a try/catch block for OOME) in the Reference
>>> Handler thread, which may give the Reference Handler a chance to
>>> clean up. This fixes the test failure (several thousand runs with
>>> 0 failures).
>>> - Without the above change, the test fails at least 3-5 times in
>>> every 1000 runs.
>>>
>>> *PS*: The code change is to a very critical part of the JDK and I am
>>> not fully aware of the consequences of the change, hence I am seeking
>>> expert help here. I appreciate your time and input on this.
>>>
>
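
To make David's concern about the catch-and-ignore guard concrete, here is a small sketch. It is not the Reference Handler code and not the webrev change. It assumes the guard wraps the body of each loop iteration, and it simulates the failure by throwing an OutOfMemoryError from the per-item action. The names GuardedLoopSketch, drain and swallowed are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class GuardedLoopSketch {
    // Counts items whose processing was abandoned because of an OOME.
    // The real guard described in the thread keeps no such count.
    static int swallowed = 0;

    // Processes every pending item; an OOME thrown by the action is
    // caught and ignored, so that item is skipped and the loop continues.
    static List<String> drain(List<String> pending, Consumer<String> action) {
        List<String> done = new ArrayList<>();
        swallowed = 0;
        for (String item : pending) {
            try {
                action.accept(item);   // stands in for clean/enqueue
                done.add(item);
            } catch (OutOfMemoryError oome) {
                swallowed++;           // the loop survives, the item does not
            }
        }
        return done;
    }

    public static void main(String[] args) {
        List<String> done = drain(Arrays.asList("a", "b", "c"), item -> {
            if (item.equals("b")) {
                throw new OutOfMemoryError("simulated"); // assumption: simulated failure
            }
        });
        System.out.println("processed=" + done + " swallowed=" + swallowed);
    }
}
```

The loop keeps running, which is what makes the test pass, but the item that hit the OOME is never processed. Without something like the swallowed counter, nothing indicates that this happened.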



