Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

srikalyan srikalyan.chandrashekar at oracle.com
Fri Dec 20 21:00:54 UTC 2013


Hi Mandy, yes, I ran it with jtreg to reproduce the failure. I will try the 
UEH patch to see if it sheds some light and get back to you. Thanks for 
the direction :)
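
For anyone who wants to see the uncaught-handler idea in isolation, here is a
minimal standalone sketch. It is not part of the test or the patch: the class
name UehDemo, the helper runAndCapture and the "worker" thread are
hypothetical, and the OOME is simulated by throwing it directly rather than by
exhausting the heap.

```java
import java.util.concurrent.atomic.AtomicReference;

public class UehDemo {
    // Holds the last throwable the handler saw, so the caller can inspect it.
    static final AtomicReference<Throwable> lastError = new AtomicReference<>();

    // Same shape as the handler in the patch, plus recording the throwable.
    static class UEH implements Thread.UncaughtExceptionHandler {
        @Override
        public void uncaughtException(Thread t, Throwable e) {
            lastError.set(e);
            System.err.println("ERROR: " + t.getName() + " exception " +
                e.getMessage());
            e.printStackTrace();
        }
    }

    // Runs body on a fresh thread with the handler installed and returns
    // whatever escaped from it (null if nothing did). Hypothetical helper.
    static Throwable runAndCapture(Runnable body) {
        lastError.set(null);
        Thread worker = new Thread(body, "worker");
        worker.setUncaughtExceptionHandler(new UEH());
        worker.start();
        try {
            worker.join();  // the handler has finished once join() returns
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return lastError.get();
    }

    public static void main(String[] args) {
        // Simulated failure: throw the error directly.
        Throwable seen = runAndCapture(() -> {
            throw new OutOfMemoryError("simulated");
        });
        System.out.println("captured: " + seen);
    }
}
```

In the real test the handler is installed on the Reference Handler thread
instead of a worker thread, as in the patch quoted below.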

--
Thanks
kalyan
Ph: (408)-585-8040


On 12/19/13, 8:33 PM, Mandy Chung wrote:
> Hi Srikalyan,
>
> Maybe you can add an uncaught exception handler to see if you can get
> any information.  I ran it 1000 times but was not able to reproduce
> the failure.  Did you run it with jtreg (I didn't)?
>
> Below is the patch to install a thread's uncaught handler that
> you can take and try.
>
> diff --git a/test/java/lang/ref/OOMEInReferenceHandler.java b/test/java/lang/ref/OOMEInReferenceHandler.java
> --- a/test/java/lang/ref/OOMEInReferenceHandler.java
> +++ b/test/java/lang/ref/OOMEInReferenceHandler.java
> @@ -51,6 +51,14 @@
>           return first;
>       }
>
> +     static class UEH implements Thread.UncaughtExceptionHandler {
> +         public void uncaughtException(Thread t, Throwable e) {
> +             System.err.println("ERROR: " + t.getName() + " exception " +
> +                 e.getMessage());
> +             e.printStackTrace();
> +         }
> +     }
> +
>       public static void main(String[] args) throws Exception {
>           // preinitialize the InterruptedException class so that the reference handler
>           // does not die due to OOME when loading the class if it is the first use
> @@ -77,6 +85,8 @@
>               throw new IllegalStateException("Couldn't find Reference Handler thread.");
>           }
>
> +         referenceHandlerThread.setUncaughtExceptionHandler(new UEH());
> +
>           ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
>           Object referent = new Object();
>           WeakReference<Object> weakRef = new WeakReference<>(referent, refQueue);
>
> On 12/19/2013 6:57 PM, srikalyan chandrashekar wrote:
>> Hi David, thanks for your comments. The unguarded part (clean and 
>> enqueue) in the Reference Handler thread does not seem to create any 
>> new objects, so it is the application (the test in this case) that is 
>> adding objects to the heap and causing the Reference Handler to die with 
>> OOME. I am still unsure about the side effects of the code change and 
>> agree with your thoughts (on the reliability of memory exhaustion tests).
>>
>> PS: hotspot dev alias removed from CC.
>>
>> -- 
>> Thanks
>> kalyan
>>
>> On 12/19/13 5:08 PM, David Holmes wrote:
>>> Hi Kalyan,
>>>
>>> This is not a hotspot issue so I'm moving this to core-libs, please 
>>> drop hotspot from any replies.
>>>
>>> On 20/12/2013 6:26 AM, srikalyan wrote:
>>>> Hi all, I have been working on the bug JDK-8022321
>>>> <https://bugs.openjdk.java.net/browse/JDK-8022321>. This is a sporadic
>>>> failure and the webrev is available here:
>>>> http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/
>>>>
>>>
>>> I'm really not sure what to make of this. We have a test that 
>>> triggers an out-of-memory condition but the OOME can actually turn 
>>> up in the ReferenceHandler thread causing it to terminate and the 
>>> test to fail. We previously accounted for the non-obvious 
>>> occurrences of OOME due to the Object.wait and the possible need to 
>>> load the InterruptedException class - but still the OOME can appear 
>>> where we don't want it. So finally you have just placed the whole 
>>> for(;;) loop in a try/catch(OOME) that ignores the OOME. I'm certain 
>>> that makes the test happy, but I'm not sure it is really what we 
>>> want for the ReferenceHandler thread. If the OOME occurs while 
>>> cleaning, or enqueuing then we will fail to clean and/or enqueue but 
>>> there would be no indication that has occurred and I think that is a 
>>> bigger problem than this test failing.
>>>
>>> There may be no way to make this test 100% reliable. In fact I'd 
>>> suggest that no memory exhaustion test can be 100% reliable.
>>>
>>> David
>>>
>>>> *Root Cause: Still not known*
>>>> There are 2 places where an OOME is possible:
>>>> 1) Cleaner.clean()
>>>> 2) ReferenceQueue.enqueue()
>>>>
>>>> 1)  The cleanup code in turn has 2 places where an OOME could be
>>>> thrown:
>>>>      a) The thunk Runnable, which is run from the clean() method. This
>>>> Runnable is passed to Cleaner and appears in the following classes:
>>>>          java/nio/DirectByteBuffer.java
>>>>          sun/misc/Perf.java
>>>>          sun/nio/fs/NativeBuffer.java
>>>>          sun/nio/ch/IOVecWrapper.java
>>>>          sun/misc/Cleaner/ExitOnThrow.java
>>>> However, none of the above implementations ever creates an
>>>> object in the clean() code.
>>>>      b) A new PrivilegedAction is created in the try/catch block of the
>>>> clean() method, but for this object to be created and to be held
>>>> responsible for the OOME, an Exception (other than OOME) has to be
>>>> thrown first.
>>>>
>>>> 2) No new heap objects are created in the enqueue method or anywhere in
>>>> the deep call stack (VM.addFinalRefCount() etc.), so this cannot be a
>>>> potential cause.
>>>>
>>>> *Experimental change to java.lang.ref.Reference*:
>>>> - Put one more guard (a try/catch for OOME) in the Reference
>>>> Handler thread, which may give the Reference Handler a chance to
>>>> clean up. This fixes the test failure (several 1000 runs with 0
>>>> failures).
>>>> - Without the above change the test fails at least 3-5 times in every
>>>> 1000 runs.
>>>>
>>>> *PS*: The code change is to a very critical part of the JDK and I am not
>>>> fully aware of the consequences of the change, hence seeking expert help
>>>> here. Appreciate your time and inputs towards this.
>>>>
>>
>
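
To make the shape of the guard discussed in the thread above concrete, here is
a rough sketch. It is not the code in java.lang.ref.Reference and not the
webrev: the class GuardedLoopSketch, the drain method and the Runnable queue
are hypothetical stand-ins for the handler loop and for clean()/enqueue(). It
assumes the guard is applied per iteration, and it counts the swallowed OOMEs,
since the concern raised is that ignoring them leaves no indication that a
clean or enqueue was skipped.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class GuardedLoopSketch {
    // Hypothetical stand-in for the Reference Handler work loop.
    // Returns {tasks completed, OOMEs swallowed}.
    static int[] drain(Queue<Runnable> pending) {
        int processed = 0;
        int oomeCount = 0;
        while (!pending.isEmpty()) {
            Runnable task = pending.poll();
            try {
                task.run();   // stands in for clean() or enqueue()
                processed++;
            } catch (OutOfMemoryError oome) {
                // The guard: the loop survives, but this task's work is lost.
                // Counting it keeps the loss visible instead of silent.
                oomeCount++;
            }
        }
        return new int[] { processed, oomeCount };
    }

    public static void main(String[] args) {
        Queue<Runnable> pending = new ArrayDeque<>();
        pending.add(() -> {});
        pending.add(() -> { throw new OutOfMemoryError("simulated"); });
        pending.add(() -> {});
        int[] result = drain(pending);
        System.out.println("processed=" + result[0] + " oome=" + result[1]);
    }
}
```

Whether a count, a log line or a retry is the right reaction in the real
handler is exactly the open question; the sketch only shows that the loop
keeps running after the error.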


