Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently
David Holmes
david.holmes at oracle.com
Fri Jan 17 02:16:59 UTC 2014
On 17/01/2014 4:48 AM, srikalyan wrote:
> Hi David
>
> On 1/15/14, 9:04 PM, David Holmes wrote:
>> On 16/01/2014 10:19 AM, srikalyan chandrashekar wrote:
>>> Hi Peter/David, we could finally get a trace of the exception with a fastdebug
>>> build and a modified ReferenceHandler (with runImpl() added and called
>>> from run()). The logs and disassembled code are available in JIRA
>>> <https://bugs.openjdk.java.net/browse/JDK-8022321> as attachments.
>>
>> All I can see is the log for the OOMECatchingTest program, not one for
>> the actual ReferenceHandler?
>>
> Please search for ReferenceHandler in the log.
>>> Observations from the log:
>>>
>>> Root Cause:
>>> 1) UncaughtException is being dispatched from Reference.java:143
>>> 141 Reference<Object> r;
>>> 142 synchronized (lock) {
>>> 143 if (pending != null) {
>>> 144 r = pending;
>>> 145 pending = r.discovered;
>>> 146 r.discovered = null;
>>>
>>> The pending field in Reference is read and updated by the collector, so
>>> at line 143, while execution is in the ReferenceHandler thread, there
>>> might already be an exception pending from an allocation done by the
>>> collector, which causes the ReferenceHandler thread to die.
>>
>> Sorry but the GC does not trigger asynchronous exceptions so this
>> explanation does not make any sense to me. What part of the log led
>> you to this conclusion?
> ------------------ Log Excerpt begins ------------------
> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff7808e8)
> thrown
> [/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp,
> line 168]
> for thread 0x00007feed80cf800
> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff7808e8)
> thrown in interpreter method <{method} {0x00007feeddd3c600} 'runImpl'
> '()V' in 'java/lang/ref/Reference$ReferenceHandler'>
> at bci 65 for thread 0x00007feed80cf800
> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff7808e8)
> thrown in interpreter method <{method} {0x00007feeddd3c478} 'run'
> '()V' in 'java/lang/ref/Reference$ReferenceHandler'>
> at bci 1 for thread 0x00007feed80cf800
> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff780868)
> thrown
> [/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp,
> line 157]
> for thread 0x00007feed80cf800
> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff780868)
> thrown in interpreter method <{method} {0x00007feeddcaaf90}
> 'uncaughtException' '(Ljava/lang/Thread;Ljava/lang/Throwable;)V' in '>
> at bci 48 for thread 0x00007feed80cf800
> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000ff780868)
> thrown in interpreter method <{method} {0x00007feeddca7298}
> 'dispatchUncaughtException' '(Ljava/lang/Throwable;)V' in 'java/lang/>
> at bci 6 for thread 0x00007feed80cf800
> ------------------ Log Excerpt ends ------------------
> Sorry if it is a wrong understanding.
What you are seeing there is an OOME escaping the run() method. That
causes the uncaughtExceptionHandler to run, which then triggers a
second OOME (likely as it tries to report information about the first
OOME). The first exception occurred in runImpl at BCI 65. Can you
disassemble (javap -c) the class you used so we can see what is at BCI 65?
Thanks,
David
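
As a rough illustration of the sequence described above, the following sketch lets a simulated OOME escape a thread's run() and records what the uncaught exception handler receives. The class name EscapingOomeSketch and the explicit throw are assumptions made for illustration; a real OOME would come from a failed allocation, and this is not the actual ReferenceHandler code.

```java
import java.util.concurrent.atomic.AtomicReference;

public class EscapingOomeSketch {
    // Runs a thread whose run() lets a simulated OOME escape, and returns
    // the throwable that the thread's uncaught exception handler observed.
    static Throwable runAndCapture() throws InterruptedException {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread t = new Thread(() -> {
            // Hypothetical stand-in for the failure inside runImpl().
            throw new OutOfMemoryError("simulated");
        }, "handler-like-thread");
        // The handler runs on the dying thread. If it allocated while the
        // heap was still full, it could itself fail with a second OOME.
        t.setUncaughtExceptionHandler((thread, ex) -> seen.set(ex));
        t.start();
        t.join(); // the thread terminates once the handler returns
        return seen.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("handler saw: " + runAndCapture());
    }
}
```

The handler here only stores a reference, so it does not allocate; a handler that builds a message or stack trace under memory pressure is the kind that might produce the second exception seen in the log.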
>>
>>> Suggested fix:
>>> - As proposed earlier, putting an outer guard (try-catch on OOME) in the
>>> ReferenceHandler will fix the issue. If ReferenceHandler is considered
>>> part of the GC subsystem, then it should stay alive even in the midst
>>> of an OOME, so I feel that the additional guard should be allowed;
>>> however, I might still be unaware of vital implications.
>>> - Apart from the above changes, Peter's suggestion to create and call a
>>> private runImpl() from run() in ReferenceHandler makes sense to me.
>>
>> Why would we need this?
>>
>> David
>> -----
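
The outer guard proposed in the quoted suggestion can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the ReferenceHandler source: GuardedLoopSketch, runGuarded and the bounded loop are hypothetical, and runImpl here only simulates a transient OOME on its first two calls.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class GuardedLoopSketch {
    // Hypothetical stand-in for one iteration of the handler's work.
    // It fails on the first two calls to simulate a transient OOME.
    private static void runImpl(int iteration) {
        if (iteration < 2) {
            throw new OutOfMemoryError("simulated");
        }
    }

    // Runs the guarded loop on its own thread and returns
    // {number of OOMEs caught, number of iterations completed}.
    static int[] runGuarded(int iterations) throws InterruptedException {
        AtomicInteger caught = new AtomicInteger();
        AtomicInteger completed = new AtomicInteger();
        Thread t = new Thread(() -> {
            // A real handler would loop forever; the bound is only so
            // that this sketch terminates.
            for (int i = 0; i < iterations; i++) {
                try {
                    runImpl(i);
                    completed.incrementAndGet();
                } catch (OutOfMemoryError e) {
                    // Outer guard: swallow the error and keep looping.
                    // Nothing is allocated here.
                    caught.incrementAndGet();
                }
            }
        });
        t.start();
        t.join();
        return new int[] { caught.get(), completed.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = runGuarded(5);
        System.out.println("caught=" + r[0] + " completed=" + r[1]);
    }
}
```

The point of the guard is that the thread survives the error and continues with later iterations; whether that is appropriate for the real ReferenceHandler is the question under discussion in this thread.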
>>
>>>
>>> ---
>>> Thanks
>>> kalyan
>>>
>>> On 01/13/2014 03:57 PM, srikalyan wrote:
>>>>
>>>> On 1/11/14, 6:15 AM, Peter Levart wrote:
>>>>>
>>>>> On 01/10/2014 10:51 PM, srikalyan chandrashekar wrote:
>>>>>> Hi Peter, the version you provided ran indefinitely (I set a 10-minute
>>>>>> timeout) and the program was interrupted (no error),
>>>>>
>>>>> Did you run it with or without fastdebug & -XX:+TraceExceptions? If
>>>>> with, it might be that fastdebug and/or -XX:+TraceExceptions changes
>>>>> the execution a bit so that we can no longer reproduce the wrong
>>>>> behaviour.
>>>> With fastdebug & -XX:+TraceExceptions. I will try other combinations of
>>>> options (i.e. without -XX:+TraceExceptions on a debug build, etc.)
>>>> soon.
>>>>>
>>>>>> even if there were an error, you cannot print the "string" of the
>>>>>> thread to the console (this has been attempted earlier).
>>>>>
>>>>> ...printing toString was attempted in the uncaught exception
>>>>> handler. At that time, the heap is still full. I'm printing it after
>>>>> the GC has cleared the heap. You can check that it works by commenting
>>>>> out the "try {" and corresponding "} catch (OOME x) {}" exception
>>>>> handler...
>>>> Since there is a GC call prior to printing the string, I will give that a
>>>> shot with a non-debug build.
>>>>>
>>>>>> - The test is running in interpreter mode; what I am watching for is
>>>>>> one error with a trace. Without the fastdebug build and
>>>>>> -XX:+TraceExceptions I am able to reproduce at least 5
>>>>>> failures out of 1000 runs, but with fastdebug+Trace no luck
>>>>>> yet (already past a few thousand runs).
>>>>>
>>>>> It might be interesting to try with a fastdebug build but without the
>>>>> -XX:+TraceExceptions option to see which of them has an effect. It might
>>>>> also be interesting to try the modified ReferenceHandler (the one
>>>>> with the private runImpl() method called from run()) with a normal
>>>>> non-fastdebug JDK. This info might be useful when one starts to
>>>>> inspect the exception handling code in the interpreter...
>>>>>
>>>>> Regards, Peter
>>>>>
>>>>
>>>> --
>>>> Thanks
>>>> kalyan
>>>> Ph: (408)-585-8040
>>>>
>>>>>>
>>>>>> ---
>>>>>> Thanks
>>>>>> kalyan
>>>>>>
>>>>>> On 01/10/2014 02:57 AM, Peter Levart wrote:
>>>>>>> On 01/10/2014 09:31 AM, Peter Levart wrote:
>>>>>>>> Since we suspect there's something wrong with exception handling
>>>>>>>> in the interpreter, I devised a hypothetical reproducer that tries to
>>>>>>>> simulate ReferenceHandler in many aspects, but doesn't need to
>>>>>>>> be a ReferenceHandler:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~plevart/misc/OOME/OOMECatchingTest.java
>>>>>>>>
>>>>>>>> This is designed to run indefinitely and only terminate if/when
>>>>>>>> thread dies. Could you run this program in the environment that
>>>>>>>> causes the OOMEInReferenceHandler test to fail and see if it
>>>>>>>> terminates?
>>>>>>>
>>>>>>> I forgot to mention that in order for this long-running program to
>>>>>>> exhibit interpreter behaviour, it should be run with -Xint option.
>>>>>>> So I suggest:
>>>>>>>
>>>>>>> -Xmx24M -XX:-UseTLAB -Xint
>>>>>>>
>>>>>>> Regards, Peter
>>>>>>>
>>>>>>
>>>>>
>>>