Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

David Holmes david.holmes at oracle.com
Thu Jan 9 07:22:21 UTC 2014


Thanks Peter.

Kalyan: Can you confirm, as Peter asked, that the TraceExceptions output 
came from a failed run?

AFAICS the Trace info is printed after each bytecode where there is a 
pending exception - though I'm not 100% sure on the printing within the 
VM runtime. Based on that I think we see the Trace output in run() at 
the point where wait() returns, so it may well be caught after that - in 
which case this was not a failing run.

I also can't reproduce the problem :(

David

On 8/01/2014 10:34 PM, Peter Levart wrote:
> On 01/08/2014 07:30 AM, David Holmes wrote:
>> On 8/01/2014 4:19 PM, David Holmes wrote:
>>> On 8/01/2014 7:33 AM, srikalyan chandrashekar wrote:
>>>> Hi David, TraceExceptions with fastdebug build produced some nice trace
>>>> <http://cr.openjdk.java.net/%7Esrikchan/OOME_exception_trace.log> . The
>>>> native method wait(long) is where the OOME if being thrown, the deepest
>>>> call is in
>>>>
>>>> src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157
>>>
>>> Yes but it is the caller that is of interest:
>>>
>>> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000d6a01840)
>>> thrown
>>> [/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/runtime/objectMonitor.cpp,
>>>
>>> line 1649]
>>> for thread 0x00007f78c40d2800
>>> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000d6a01840)
>>>   thrown in interpreter method <{method} {0x00007f78b4800ae0} 'wait'
>>> '(J)V' in 'java/lang/Object'>
>>>   at bci 0 for thread 0x00007f78c40d2800
>>>
>>> The ReferenceHandler thread gets the OOME trying to allocate the
>>> InterruptedException.
>>
>> However we already have a catch block around the wait() so how is this
>> OOME getting through? A bug in exception handling in the interpreter ??
>>
>
> Might be. And it may have something to do with the fact that the
> Thread.run() method is the 1st call frame on the thread's stack (seems
> like corner case). The last few meaningful TraceExceptions records are:
>
>
> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000d6a01840)
> thrown
> [/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp,
> line 157]
> for thread 0x00007f78c40d2800
> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000d6a01840)
> thrown
> [/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/runtime/objectMonitor.cpp,
> line 1649]
> for thread 0x00007f78c40d2800
> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000d6a01840)
>   thrown in interpreter method <{method} {0x00007f78b4800ae0} 'wait'
> '(J)V' in 'java/lang/Object'>
>   at bci 0 for thread 0x00007f78c40d2800
> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000d6a01840)
>   thrown in interpreter method <{method} {0x00007f78b4800ca8} 'wait'
> '()V' in 'java/lang/Object'>
>   at *bci 2* for thread 0x00007f78c40d2800
> Exception <a 'java/lang/OutOfMemoryError'> (0x00000000d6a01840)
>   thrown in interpreter method <{method} {0x00007f78b48d2250} 'run'
> '()V' in 'java/lang/ref/Reference$ReferenceHandler'>
>   at *bci 36* for thread 0x00007f78c40d2800
>
>
> Here's the relevant bytecodes:
>
>
> public class java.lang.Object
>
>    public final void wait() throws java.lang.InterruptedException;
>      descriptor: ()V
>      flags: ACC_PUBLIC, ACC_FINAL
>      Code:
>        stack=3, locals=1, args_size=1
>           0: aload_0
>           1: lconst_0
> *         2: invokevirtual #73                 // Method wait:(J)V*
>           5: return
>        LineNumberTable:
>          line 502: 0
>          line 503: 5
>      Exceptions:
>        throws java.lang.InterruptedException
>
>
> class java.lang.ref.Reference$ReferenceHandler extends java.lang.Thread
>
>    public void run();
>      descriptor: ()V
>      flags: ACC_PUBLIC
>      Code:
>        stack=2, locals=5, args_size=1
>           0: invokestatic  #62                 // Method
> java/lang/ref/Reference.access$100:()Ljava/lang/ref/Reference$Lock;
>           3: dup
>           4: astore_2
>           5: monitorenter
>           6: invokestatic  #61                 // Method
> java/lang/ref/Reference.access$200:()Ljava/lang/ref/Reference;
>           9: ifnull        33
>          12: invokestatic  #61                 // Method
> java/lang/ref/Reference.access$200:()Ljava/lang/ref/Reference;
>          15: astore_1
>          16: aload_1
>          17: invokestatic  #64                 // Method
> java/lang/ref/Reference.access$300:(Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;
>          20: invokestatic  #63                 // Method
> java/lang/ref/Reference.access$202:(Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;
>          23: pop
>          24: aload_1
>          25: aconst_null
>          26: invokestatic  #65                 // Method
> java/lang/ref/Reference.access$302:(Ljava/lang/ref/Reference;Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;
>          29: pop
>          30: goto          52
>          33: invokestatic  #62                 // Method
> java/lang/ref/Reference.access$100:()Ljava/lang/ref/Reference$Lock;
> *        36: invokevirtual #59                 // Method
> java/lang/Object.wait:()V*
>          39: goto          43
>          42: astore_3
>          43: goto          47
>          46: astore_3
>          47: aload_2
>          48: monitorexit
>          49: goto          0
>          52: aload_2
>          53: monitorexit
>          54: goto          64
>          57: astore        4
>          59: aload_2
>          60: monitorexit
>          61: aload         4
>          63: athrow
>          64: aload_1
>          65: instanceof    #38                 // class sun/misc/Cleaner
>          68: ifeq          81
>          71: aload_1
>          72: checkcast     #38                 // class sun/misc/Cleaner
>          75: invokevirtual #67                 // Method
> sun/misc/Cleaner.clean:()V
>          78: goto          0
>          81: aload_1
>          82: getfield      #57                 // Field
> java/lang/ref/Reference.queue:Ljava/lang/ref/ReferenceQueue;
>          85: astore_2
>          86: aload_2
>          87: getstatic     #58                 // Field
> java/lang/ref/ReferenceQueue.NULL:Ljava/lang/ref/ReferenceQueue;
>          90: if_acmpeq     99
>          93: aload_2
>          94: aload_1
>          95: invokevirtual #66                 // Method
> java/lang/ref/ReferenceQueue.enqueue:(Ljava/lang/ref/Reference;)Z
>          98: pop
>          99: goto          0
>        Exception table:
>           from    to  target type
> *            33    39    42   Class java/lang/OutOfMemoryError*
>              33    43    46   Class java/lang/InterruptedException
>               6    49    57   any
>              52    54    57   any
>              57    61    57   any
>        StackMapTable: number_of_entries = 11
>             frame_type = 0 /* same */
>             frame_type = 253 /* append */
>            offset_delta = 32
>            locals = [ top, class java/lang/Object ]
>               frame_type = 72 /* same_locals_1_stack_item */
>              stack = [ class java/lang/OutOfMemoryError ]
>               frame_type = 0 /* same */
>               frame_type = 66 /* same_locals_1_stack_item */
>              stack = [ class java/lang/InterruptedException ]
>               frame_type = 0 /* same */
>               frame_type = 255 /* full_frame */
>              offset_delta = 4
>              locals = [ class java/lang/ref/Reference$ReferenceHandler,
> class java/lang/ref/Reference, class java/lang/Object ]
>              stack = []
>               frame_type = 255 /* full_frame */
>              offset_delta = 4
>              locals = [ class java/lang/ref/Reference$ReferenceHandler,
> top, class java/lang/Object ]
>              stack = [ class java/lang/Throwable ]
>               frame_type = 255 /* full_frame */
>              offset_delta = 6
>              locals = [ class java/lang/ref/Reference$ReferenceHandler,
> class java/lang/ref/Reference ]
>              stack = []
>               frame_type = 16 /* same */
>               frame_type = 250 /* chop */
>              offset_delta = 17
>
>          LineNumberTable:
>            line 136: 0
>            line 137: 6
>            line 138: 12
>            line 139: 16
>            line 140: 24
>            line 157: 33
>            line 158: 39
>            line 159: 43
>            line 160: 47
>            line 162: 52
>            line 165: 64
>            line 166: 71
>            line 167: 78
>            line 170: 81
>            line 171: 86
>            line 172: 99
>    }
>
>
> ... if I understand correctly the TraceExceptions' output, one record is
> printed for each abruptly completed method. The last record printed
> shows that Object.wait() method completed abruptly by throwing OOME.
> Even if the OOME was caught and ignored, we would still get this
> TraceExceptions record. Am I right? There's no record of abruptly
> terminated ReferenceHandler.run() method in the output. This might be
> because there's no method "calling" the ReferenceHandler.run() method,
> since it is the 1st method on the call-stack. Are you sure the output is
> from a failed test?
>
> Can we make the following adjustment to the ReferenceHandler and see
> what gets printed and/or whether we still get uncaught OOME:
>
>
>      private static class ReferenceHandler extends Thread {
>      ...
>          public void run() {
>              runImpl();
>          }
>
>          private void runImpl() {
>              // ... the content of the original run() method
>          }
>      ...
>
>
>
>
> Regards, Peter
>
>> David
>>
>>> David
>>> -----
>>>
>>>> ------------------- Excerpt Begins ---------------------
>>>>
>>>> 147  if (!gc_overhead_limit_was_exceeded) {
>>>> 148    // -XX:+HeapDumpOnOutOfMemoryError and -XX:OnOutOfMemoryError
>>>> support
>>>> 149    report_java_out_of_memory("Java heap space");
>>>> 150
>>>> 151    if (JvmtiExport::should_post_resource_exhausted()) {
>>>> 152      JvmtiExport::post_resource_exhausted(
>>>> 153        JVMTI_RESOURCE_EXHAUSTED_OOM_ERROR |
>>>> JVMTI_RESOURCE_EXHAUSTED_JAVA_HEAP,
>>>> 154        "Java heap space");
>>>> 155    }
>>>> 156
>>>> 157    THROW_OOP_0(Universe::out_of_memory_error_java_heap());
>>>> 158  } else {
>>>>
>>>> ------------------- Excerpt Ends ---------------------
>>>>
>>>>
>>>> Would be helpful if David/some one else in the team could explain the
>>>> latent aspects/probable cause.
>>>>
>>>> ---
>>>> Thanks
>>>> kalyan
>>>>
>>>> On 01/06/2014 04:40 PM, David Holmes wrote:
>>>>> Back from vacation ...
>>>>>
>>>>> On 20/12/2013 4:49 PM, David Holmes wrote:
>>>>>> On 20/12/2013 12:57 PM, srikalyan chandrashekar wrote:
>>>>>>> Hi David Thanks for your comments, the unguarded part(clean and
>>>>>>> enqueue)
>>>>>>> in the Reference Handler thread does not seem to create any new
>>>>>>> objects,
>>>>>>> so it is the application(the test in this case) which is adding
>>>>>>> objects
>>>>>>> to heap and causing the Reference Handler to die with OOME.
>>>>>>
>>>>>> The ReferenceHandler thread can only get OOME if it allocates
>>>>>> (directly
>>>>>> or indirectly) - so there has to be something in the unguarded part
>>>>>> that
>>>>>> causes this. Again it may be an implicit action in the VM -
>>>>>> similar to
>>>>>> the class load issue for InterruptedException.
>>>>>
>>>>> Run a debug VM with -XX:+TraceExceptions to see where the OOME is
>>>>> triggered.
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> David
>>>>>>
>>>>>> I am still
>>>>>>> unsure about the side effects of the code change and agree with your
>>>>>>> thoughts(on memory exhaustion test's reliability).
>>>>>>>
>>>>>>> PS: hotspot dev alias removed from CC.
>>>>>>>
>>>>>>> --
>>>>>>> Thanks
>>>>>>> kalyan
>>>>>>>
>>>>>>> On 12/19/13 5:08 PM, David Holmes wrote:
>>>>>>>> Hi Kalyan,
>>>>>>>>
>>>>>>>> This is not a hotspot issue so I'm moving this to core-libs, please
>>>>>>>> drop hotspot from any replies.
>>>>>>>>
>>>>>>>> On 20/12/2013 6:26 AM, srikalyan wrote:
>>>>>>>>> Hi all,  I have been working on the bug JDK-8022321
>>>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8022321> , this is a
>>>>>>>>> sporadic
>>>>>>>>> failure and the webrev is available here
>>>>>>>>> http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> I'm really not sure what to make of this. We have a test that
>>>>>>>> triggers
>>>>>>>> an out-of-memory condition but the OOME can actually turn up in the
>>>>>>>> ReferenceHandler thread causing it to terminate and the test to
>>>>>>>> fail.
>>>>>>>> We previously accounted for the non-obvious occurrences of OOME
>>>>>>>> due to
>>>>>>>> the Object.wait and the possible need to load the
>>>>>>>> InterruptedException
>>>>>>>> class - but still the OOME can appear where we don't want it. So
>>>>>>>> finally you have just placed the whole for(;;) loop in a
>>>>>>>> try/catch(OOME) that ignores the OOME. I'm certain that makes the
>>>>>>>> test
>>>>>>>> happy, but I'm not sure it is really what we want for the
>>>>>>>> ReferenceHandler thread. If the OOME occurs while cleaning, or
>>>>>>>> enqueuing then we will fail to clean and/or enqueue but there
>>>>>>>> would be
>>>>>>>> no indication that has occurred and I think that is a bigger
>>>>>>>> problem
>>>>>>>> than this test failing.
>>>>>>>>
>>>>>>>> There may be no way to make this test 100% reliable. In fact I'd
>>>>>>>> suggest that no memory exhaustion test can be 100% reliable.
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>>> *
>>>>>>>>> **"Root Cause:Still not known"*
>>>>>>>>> 2 places where there is a possibility for OOME
>>>>>>>>> 1) Cleaner.clean()
>>>>>>>>> 2) ReferenceQueue.enqueue()
>>>>>>>>>
>>>>>>>>> 1)  The cleanup code in turn has 2 places where there is potential
>>>>>>>>> for
>>>>>>>>> throwing OOME,
>>>>>>>>>      a) thunk Thread which is run from clean() method. This
>>>>>>>>> Runnable is
>>>>>>>>> passed to Cleaner and appears in the following classes
>>>>>>>>>          java/nio/DirectByteBuffer.java
>>>>>>>>>          sun/misc/Perf.java
>>>>>>>>>          sun/nio/fs/NativeBuffer.java
>>>>>>>>>          sun/nio/ch/IOVecWrapper.java
>>>>>>>>>          sun/misc/Cleaner/ExitOnThrow.java
>>>>>>>>> However none of the above overridden implementations ever
>>>>>>>>> create an
>>>>>>>>> object in the clean() code.
>>>>>>>>>      b) new PrivilegedAction created in try catch Exception
>>>>>>>>> block of
>>>>>>>>> clean() method but for this object to be created and to be held
>>>>>>>>> responsible for OOME an Exception(other than OOME) has to be
>>>>>>>>> thrown.
>>>>>>>>>
>>>>>>>>> 2) No new heap objects are created in the enqueue method nor
>>>>>>>>> anywhere in
>>>>>>>>> the deep call stack (VM.addFinalRefCount() etc) so this cannot
>>>>>>>>> be a
>>>>>>>>> potential cause.
>>>>>>>>>
>>>>>>>>> *Experimental change to java.lang.Reference.java* :
>>>>>>>>> - Put one more guard (try catch with OOME block) in the Reference
>>>>>>>>> Handler Thread which may give the Reference Handler a chance to
>>>>>>>>> cleanup.
>>>>>>>>> This is fixing the test failure (several 1000 runs with 0
>>>>>>>>> failures)
>>>>>>>>> - Without the above change the test fails atleast 3-5 times for
>>>>>>>>> every
>>>>>>>>> 1000 run.
>>>>>>>>>
>>>>>>>>> *PS*: The code change is to a very critical part of JDK and i am
>>>>>>>>> fully
>>>>>>>>> not aware of the consequences of the change, hence seeking expert
>>>>>>>>> help
>>>>>>>>> here. Appreciate your time and inputs towards this.
>>>>>>>>>
>>>>>>>
>>>>
>



More information about the core-libs-dev mailing list