Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

Peter Levart peter.levart at gmail.com
Wed Jan 8 13:53:21 UTC 2014


Hi Kalyan,

What hardware/OS/JVM and what JVM options are you using to reproduce 
this failure? I would really like to reproduce this myself, but all 
attempts on my PC have so far been unsuccessful. I might be able to get 
access to a machine that is similar to yours...

Regards, Peter

On 01/07/2014 09:55 PM, srikalyan chandrashekar wrote:
> Peter, getting state info out (to console or otherwise) from within 
> the Reference Handler's exception handlers has been unsuccessful. 
> However, David's suggestion produced a useful trace with a fast debug 
> build; see the log here 
> <http://cr.openjdk.java.net/%7Esrikchan/OOME_exception_trace.log> .
> ---
> Thanks
> kalyan
> On 01/07/2014 12:42 AM, Peter Levart wrote:
>> On 01/07/2014 03:15 AM, srikalyan chandrashekar wrote:
>>> Sure David, will give that a try. We have so far attempted to:
>>> 1. Print state data (as per the test creator peter.levart's inputs),
>>
>> Hi Kalyan,
>>
>> Have you been able to reproduce the OOME in that set-up? What was the 
>> result?
>>
>> Regards, Peter
>>
>>> 2. Use a UEH (uncaught exception handler, per Mandy's inputs).
>>>
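To make the UEH attempt above concrete, here is a minimal sketch of capturing the Throwable that kills a thread. It assumes nothing about the real Reference Handler: the class name UehSketch, the thread name and the simulated OOME are all hypothetical, and the handler only stores a reference so that it does as little allocation as possible.

```java
import java.util.concurrent.atomic.AtomicReference;

public class UehSketch {
    // Starts a thread that dies with a simulated OOME and returns
    // whatever the uncaught exception handler observed.
    static Throwable captureThreadDeath() {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread t = new Thread(() -> {
            // Stand-in for a real allocation failure (assumption).
            throw new OutOfMemoryError("simulated");
        }, "Hypothetical Handler");
        // The handler runs on the dying thread, before it terminates.
        t.setUncaughtExceptionHandler((thread, error) -> seen.set(error));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("thread died with: " + captureThreadDeath());
    }
}
```

For a thread one cannot get a handle on, Thread.setDefaultUncaughtExceptionHandler would be the analogous call; whether that yields anything useful under real heap exhaustion is a separate question.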
>>> -- 
>>> Thanks
>>> kalyan
>>>
>>> On 1/6/14 4:40 PM, David Holmes wrote:
>>>> Back from vacation ...
>>>>
>>>> On 20/12/2013 4:49 PM, David Holmes wrote:
>>>>> On 20/12/2013 12:57 PM, srikalyan chandrashekar wrote:
>>>>>> Hi David, thanks for your comments. The unguarded part (clean and
>>>>>> enqueue) in the Reference Handler thread does not seem to create any
>>>>>> new objects, so it is the application (the test in this case) which
>>>>>> is adding objects to the heap and causing the Reference Handler to
>>>>>> die with OOME.
>>>>>
>>>>> The ReferenceHandler thread can only get OOME if it allocates
>>>>> (directly or indirectly) - so there has to be something in the
>>>>> unguarded part that causes this. Again it may be an implicit action
>>>>> in the VM - similar to the class load issue for InterruptedException.
>>>>
>>>> Run a debug VM with -XX:+TraceExceptions to see where the OOME is 
>>>> triggered.
>>>>
>>>> David
>>>> -----
>>>>
>>>>> David
>>>>>
>>>>>> I am still unsure about the side effects of the code change and
>>>>>> agree with your thoughts (on memory exhaustion test's reliability).
>>>>>>
>>>>>> PS: hotspot dev alias removed from CC.
>>>>>>
>>>>>> -- 
>>>>>> Thanks
>>>>>> kalyan
>>>>>>
>>>>>> On 12/19/13 5:08 PM, David Holmes wrote:
>>>>>>> Hi Kalyan,
>>>>>>>
>>>>>>> This is not a hotspot issue so I'm moving this to core-libs, please
>>>>>>> drop hotspot from any replies.
>>>>>>>
>>>>>>> On 20/12/2013 6:26 AM, srikalyan wrote:
>>>>>>>> Hi all, I have been working on the bug JDK-8022321
>>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8022321>. This is a
>>>>>>>> sporadic failure and the webrev is available here:
>>>>>>>> http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/ 
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> I'm really not sure what to make of this. We have a test that
>>>>>>> triggers an out-of-memory condition but the OOME can actually turn
>>>>>>> up in the ReferenceHandler thread causing it to terminate and the
>>>>>>> test to fail. We previously accounted for the non-obvious
>>>>>>> occurrences of OOME due to the Object.wait and the possible need to
>>>>>>> load the InterruptedException class - but still the OOME can appear
>>>>>>> where we don't want it. So finally you have just placed the whole
>>>>>>> for(;;) loop in a try/catch(OOME) that ignores the OOME. I'm certain
>>>>>>> that makes the test happy, but I'm not sure it is really what we
>>>>>>> want for the ReferenceHandler thread. If the OOME occurs while
>>>>>>> cleaning, or enqueuing then we will fail to clean and/or enqueue but
>>>>>>> there would be no indication that has occurred and I think that is a
>>>>>>> bigger problem than this test failing.
>>>>>>>
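The concern about a swallowed OOME can be illustrated with a minimal sketch. This is not the webrev's change to java.lang.Reference; it assumes the guard wraps each iteration, uses a hypothetical IntConsumer as a stand-in for one clean/enqueue step, and simulates the OOME by throwing it directly. The point it shows: the loop survives, but the work of the failed iterations is lost, and only an explicit counter leaves any trace.

```java
import java.util.function.IntConsumer;

public class GuardedLoopSketch {
    // Runs 'iterations' steps under an OOME guard.
    // Returns {completed, swallowed}.
    static int[] runGuarded(int iterations, IntConsumer step) {
        int completed = 0;
        int swallowed = 0;
        for (int i = 0; i < iterations; i++) {
            try {
                step.accept(i);   // stand-in for clean and/or enqueue
                completed++;
            } catch (OutOfMemoryError oome) {
                // The loop keeps running, but this iteration's work
                // never happened; without the counter nobody would know.
                swallowed++;
            }
        }
        return new int[] { completed, swallowed };
    }

    public static void main(String[] args) {
        int[] r = runGuarded(10, i -> {
            if (i % 3 == 0) {
                throw new OutOfMemoryError("simulated");
            }
        });
        System.out.println("completed=" + r[0] + " swallowed=" + r[1]);
    }
}
```

In the sketch the counter is the only "indication"; the guard described above has no equivalent, which is exactly the problem being raised.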
>>>>>>> There may be no way to make this test 100% reliable. In fact I'd
>>>>>>> suggest that no memory exhaustion test can be 100% reliable.
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>> *"Root Cause: Still not known"*
>>>>>>>> 2 places where there is a possibility for OOME
>>>>>>>> 1) Cleaner.clean()
>>>>>>>> 2) ReferenceQueue.enqueue()
>>>>>>>>
>>>>>>>> 1) The cleanup code in turn has 2 places where there is potential
>>>>>>>> for throwing OOME:
>>>>>>>>      a) The thunk Runnable, which is run from the clean() method.
>>>>>>>> This Runnable is passed to Cleaner and appears in the following
>>>>>>>> classes:
>>>>>>>>          java/nio/DirectByteBuffer.java
>>>>>>>>          sun/misc/Perf.java
>>>>>>>>          sun/nio/fs/NativeBuffer.java
>>>>>>>>          sun/nio/ch/IOVecWrapper.java
>>>>>>>>          sun/misc/Cleaner/ExitOnThrow.java
>>>>>>>> However, none of the above implementations ever creates an object
>>>>>>>> in the clean() code.
>>>>>>>>      b) A new PrivilegedAction created in the try/catch Exception
>>>>>>>> block of the clean() method; but for this object to be created and
>>>>>>>> to be held responsible for OOME, an Exception (other than OOME)
>>>>>>>> has to be thrown.
>>>>>>>>
>>>>>>>> 2) No new heap objects are created in the enqueue method nor
>>>>>>>> anywhere in the deep call stack (VM.addFinalRefCount() etc.), so
>>>>>>>> this cannot be a potential cause.
>>>>>>>>
>>>>>>>> *Experimental change to java.lang.Reference.java* :
>>>>>>>> - Put one more guard (try/catch with OOME block) in the Reference
>>>>>>>> Handler thread, which may give the Reference Handler a chance to
>>>>>>>> clean up. This fixes the test failure (several thousand runs with
>>>>>>>> 0 failures).
>>>>>>>> - Without the above change the test fails at least 3-5 times in
>>>>>>>> every 1000 runs.
>>>>>>>>
>>>>>>>> *PS*: The code change is to a very critical part of the JDK and I
>>>>>>>> am not fully aware of the consequences of the change, hence
>>>>>>>> seeking expert help here. I appreciate your time and inputs on
>>>>>>>> this.
>>>>>>>>
>>>>>>
>>>
>>
>



