Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

Peter Levart peter.levart at gmail.com
Fri Jan 17 13:13:23 UTC 2014


On 01/17/2014 02:00 PM, Peter Levart wrote:
> On 01/17/2014 05:38 AM, David Holmes wrote:
>> On 17/01/2014 1:31 PM, srikalyan chandrashekar wrote:
>>> Hi David, the disassembled code is also attached to the bug. Per my
>>
>> Sorry missed that.
>>
>>> analysis the exception was thrown when Reference Handler was on line 143
>>> as put in the earlier email.
>>
>> But if the numbers in the disassembly match the BCI then 65 shows:
>>
>>       65: instanceof    #11                 // class sun/misc/Cleaner
>>
>> which makes more sense, the runtime instanceof check might encounter 
>> an OOME condition. I wish there was some easy way to trace into the 
>> full call chain as TraceExceptions doesn't show you any runtime 
>> frames :(
>>
>> Still, it is easy enough to check:
>>
>> // Fast path for cleaners
>> boolean isCleaner = false;
>> try {
>>   isCleaner = r instanceof Cleaner;
>> } catch (OutOfMemoryError oome) {
>>   continue;
>> }
>>
>> if (isCleaner) {
>>   ((Cleaner)r).clean();
>>   continue;
>> }
>>
>
> Hi David, Kalyan,
>
> I've caught-up now. Just thinking: is "instanceof Cleaner" throwing 
> OOME as a result of loading the Cleaner class? Wouldn't the above code 
> then throw some error also in ((Cleaner)r) - the checkcast, since 
> Cleaner class would not be successfully initialized? 

Well, no. The above code would just skip Cleaner processing in this 
situation, and would never process the skipped Cleaner again, even after 
the heap is freed... So it might be good to load and initialize the Cleaner 
class as part of ReferenceHandler initialization to ensure correct 
operation...
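
A minimal sketch of what such pre-loading could look like, under stated 
assumptions: PreloadSketch, CleanerLike and ensureClassInitialized are 
hypothetical names used only for illustration, and CleanerLike stands in 
for sun.misc.Cleaner. The only JDK call relied upon is the three-argument 
Class.forName with initialize set to true, which forces the class to be 
loaded and initialized at a point of our choosing rather than at the 
first instanceof under memory pressure:

```java
// Sketch only: PreloadSketch and CleanerLike are hypothetical stand-ins,
// not the real java.lang.ref.Reference or sun.misc.Cleaner.
public class PreloadSketch {
    // Set by CleanerLike's static initializer so initialization is observable.
    static volatile boolean cleanerInitialized = false;

    // Stand-in for the class whose first use must not happen when the heap is full.
    public static class CleanerLike {
        static { cleanerInitialized = true; }
        public void clean() { }
    }

    // Force loading and initialization of the given class right now.
    static void ensureClassInitialized(Class<?> clazz) {
        try {
            Class.forName(clazz.getName(), true, clazz.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw (Error) new NoClassDefFoundError(e.getMessage()).initCause(e);
        }
    }

    public static void main(String[] args) {
        // A class literal alone does not trigger initialization.
        System.out.println("initialized before: " + cleanerInitialized);
        ensureClassInitialized(CleanerLike.class);
        System.out.println("initialized after: " + cleanerInitialized);
    }
}
```

In the real code the equivalent call would presumably sit in a static 
initializer that runs before the handler thread is started, so that the 
later instanceof and checkcast no longer need to load anything.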

Peter

> Perhaps we should pre-load and initialize the Cleaner class as part of 
> ReferenceHandler initialization...
>
> Regards, Peter
>
>> Thanks,
>> David
>>
>>> -- 
>>> Thanks
>>> kalyan
>>>
>>> On 1/16/14 6:16 PM, David Holmes wrote:
>>>> On 17/01/2014 4:48 AM, srikalyan wrote:
>>>>> Hi David
>>>>>
>>>>> On 1/15/14, 9:04 PM, David Holmes wrote:
>>>>>> On 16/01/2014 10:19 AM, srikalyan chandrashekar wrote:
>>>>>>> Hi Peter/David, we could finally get a trace of exception with fastdebug
>>>>>>> build and ReferenceHandler modified (with runImpl() added and called
>>>>>>> from run()). The logs and disassembled code are available in JIRA
>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8022321> as attachments.
>>>>>>
>>>>>> All I can see is the log for the OOMECatchingTest program, not one for
>>>>>> the actual ReferenceHandler ??
>>>>>>
>>>>> Please search for ReferenceHandler in the log.
>>>>>>> Observations from the log:
>>>>>>>
>>>>>>> Root Cause:
>>>>>>> 1) UncaughtException is being dispatched from Reference.java:143
>>>>>>> 141                   Reference<Object> r;
>>>>>>> 142                   synchronized (lock) {
>>>>>>> 143                        if (pending != null) {
>>>>>>> 144                            r = pending;
>>>>>>> 145                            pending = r.discovered;
>>>>>>> 146                            r.discovered = null;
>>>>>>>
>>>>>>> The pending field in Reference is touched and updated by the collector, so
>>>>>>> at line 143, when the execution context is in the Reference handler, there
>>>>>>> might have been an Exception pending due to an allocation done by the
>>>>>>> collector, which causes the ReferenceHandler thread to die.
>>>>>>
>>>>>> Sorry but the GC does not trigger asynchronous exceptions so this
>>>>>> explanation does not make any sense to me. What part of the log led
>>>>>> you to this conclusion?
>>>>> ------------------ Log Excerpt begins ------------------
>>>>> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff7808e8) thrown
>>>>> [/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 168]
>>>>> for thread 0x00007feed80cf800
>>>>> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff7808e8)
>>>>>   thrown in interpreter method <{method} {0x00007feeddd3c600} 'runImpl' '()V' in 'java/lang/ref/Reference$ReferenceHandler'>
>>>>>   at bci 65 for thread 0x00007feed80cf800
>>>>> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff7808e8)
>>>>>   thrown in interpreter method <{method} {0x00007feeddd3c478} 'run' '()V' in 'java/lang/ref/Reference$ReferenceHandler'>
>>>>>   at bci 1 for thread 0x00007feed80cf800
>>>>> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff780868) thrown
>>>>> [/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157]
>>>>> for thread 0x00007feed80cf800
>>>>> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff780868)
>>>>>   thrown in interpreter method <{method} {0x00007feeddcaaf90} 'uncaughtException' '(Ljava/lang/Thread;Ljava/lang/Throwable;)V' in '>
>>>>>   at bci 48 for thread 0x00007feed80cf800
>>>>> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff780868)
>>>>>   thrown in interpreter method <{method} {0x00007feeddca7298} 'dispatchUncaughtException' '(Ljava/lang/Throwable;)V' in 'java/lang/>
>>>>>   at bci 6 for thread 0x00007feed80cf800
>>>>> ------------------ Log Excerpt ends ------------------
>>>>> Sorry if it is a wrong understanding.
>>>>
>>>> What you are seeing there is an OOME escaping the run() method which
>>>> will cause the uncaughtExceptionHandler to be run which then triggers
>>>> a second OOME (likely as it tries to report information about the
>>>> first OOME). The first exception occurred in runImpl at BCI 65. Can
>>>> you disassemble (javap -c) the class you used so we can see what is at
>>>> BCI 65.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>>>
>>>>>>> Suggested fix:
>>>>>>> - As proposed earlier, putting an outer guard (try-catch on OOME) in the
>>>>>>> ReferenceHandler will fix the issue. If ReferenceHandler is considered
>>>>>>> part of the GC subsystem then it should stay alive even in the midst
>>>>>>> of an OOME, so I feel that the additional guard should be allowed;
>>>>>>> however, I might still be ignorant of vital implications.
>>>>>>> - Apart from the above changes, Peter's suggestion to create and call a
>>>>>>> private runImpl() from run() in ReferenceHandler makes sense to me.
>>>>>>
>>>>>> Why would we need this?
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>>
>>>>>>> ---
>>>>>>> Thanks
>>>>>>> kalyan
>>>>>>>
>>>>>>> On 01/13/2014 03:57 PM, srikalyan wrote:
>>>>>>>>
>>>>>>>> On 1/11/14, 6:15 AM, Peter Levart wrote:
>>>>>>>>>
>>>>>>>>> On 01/10/2014 10:51 PM, srikalyan chandrashekar wrote:
>>>>>>>>>> Hi Peter, the version you provided ran indefinitely (I put a 10 minute
>>>>>>>>>> timeout) and the program got interrupted (no error),
>>>>>>>>>
>>>>>>>>> Did you run it with or without fastdebug & -XX:+TraceExceptions? If
>>>>>>>>> with, it might be that fastdebug and/or -XX:+TraceExceptions changes
>>>>>>>>> the execution a bit so that we can no longer reproduce the wrong
>>>>>>>>> behaviour.
>>>>>>>> With fastdebug & -XX:+TraceExceptions. I will try combinations of
>>>>>>>> possible options (i.e. without -XX:+TraceExceptions on debug build etc.)
>>>>>>>> soon.
>>>>>>>>>
>>>>>>>>>> even if there were to be an error you cannot print the "string" of the
>>>>>>>>>> thread to console (this has been attempted earlier).
>>>>>>>>>
>>>>>>>>> ...it has been attempted to print toString in uncaught exception
>>>>>>>>> handler. At that time, the heap is still full. I'm printing it after
>>>>>>>>> the GC has cleared the heap. You can try that it works by commenting
>>>>>>>>> out the "try {" and corresponding "} catch (OOME x) {}" exception
>>>>>>>>> handler...
>>>>>>>> Since there is a GC call prior to printing string i will give 
>>>>>>>> that a
>>>>>>>> shot with non-debug build.
>>>>>>>>>
>>>>>>>>>> - The test is running in interpreter mode; what I am watching for is
>>>>>>>>>> one error with trace. Without fastdebug build and
>>>>>>>>>> -XX:+TraceExceptions I am able to reproduce at least 5
>>>>>>>>>> failures out of 1000 runs, but with fastdebug+Trace no luck
>>>>>>>>>> yet (already past a few 1000 runs).
>>>>>>>>>
>>>>>>>>> It might be interesting to try with fastdebug build but without the
>>>>>>>>> -XX:+TraceExceptions option to see what has an effect on it. It might
>>>>>>>>> also be interesting to try the modified ReferenceHandler (the one
>>>>>>>>> with private runImpl() method called from run()) and with normal
>>>>>>>>> non-fastdebug JDK. This info might be useful when one starts to
>>>>>>>>> inspect the exception handling code in interpreter...
>>>>>>>>>
>>>>>>>>> Regards, Peter
>>>>>>>>>
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> Thanks
>>>>>>>> kalyan
>>>>>>>> Ph: (408)-585-8040
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>> Thanks
>>>>>>>>>> kalyan
>>>>>>>>>>
>>>>>>>>>> On 01/10/2014 02:57 AM, Peter Levart wrote:
>>>>>>>>>>> On 01/10/2014 09:31 AM, Peter Levart wrote:
>>>>>>>>>>>> Since we suspect there's something wrong with exception handling
>>>>>>>>>>>> in interpreter, I devised a hypothetical reproducer that tries to
>>>>>>>>>>>> simulate ReferenceHandler in many aspects, but doesn't need to
>>>>>>>>>>>> be a ReferenceHandler:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~plevart/misc/OOME/OOMECatchingTest.java 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This is designed to run indefinitely and only terminate 
>>>>>>>>>>>> if/when
>>>>>>>>>>>> thread dies. Could you run this program in the environment 
>>>>>>>>>>>> that
>>>>>>>>>>>> causes the OOMEInReferenceHandler test to fail and see if it
>>>>>>>>>>>> terminates?
>>>>>>>>>>>
>>>>>>>>>>> I forgot to mention that in order for this long-running 
>>>>>>>>>>> program to
>>>>>>>>>>> exhibit interpreter behaviour, it should be run with -Xint 
>>>>>>>>>>> option.
>>>>>>>>>>> So I suggest:
>>>>>>>>>>>
>>>>>>>>>>> -Xmx24M -XX:-UseTLAB -Xint
>>>>>>>>>>>
>>>>>>>>>>> Regards, Peter
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>
>



