Need help on debugging JVM crash

Yumin Qi yumin.qi at oracle.com
Tue Mar 3 18:49:06 UTC 2020


Hi, Sundara

>>     As suggested by Stefan in another thread, I tried to add
>>     VerifyAfterGC/VerifyBeforeGC, but it seems to increase latency, and the
>>     applications do not survive our production traffic (they time out and
>>     requests fail).
>>
>>     Questions
>>     1. When I looked at the source code for printing the stack trace, I saw the following:
>>     https://github.com/openjdk/jdk11u/blob/master/src/hotspot/share/utilities/vmError.cpp#L696
>>     (prints the native stack trace)
>>     https://github.com/openjdk/jdk11u/blob/master/src/hotspot/share/utilities/vmError.cpp#L718
>>     (prints the Java thread stack trace if it is involved in a GC crash)
>>        a. How do you know this Java thread was involved in the JVM crash?
>     When GC processes a thread's stack as a root, it first records that
>     Java thread. That is why the Java thread was printed out at the crash.
>>        b. Can I assume the Java thread printed after the native stack trace was the
>>     culprit?
>
>     Please check this thread's stack frames. I think GC encountered a bad
>     oop while it was doing marking work. Check whether the frame is a
>     compiled frame; if so, the problem may be related to compiled code.
>
>>        c. Since I am seeing the same frames (~RuntimeStub::_new_array_Java, J
>>     54174 c2 ch.qos.logback.classic.spi.ThrowableProxy.<init>..) but different
>>     stack traces in both crashes, can this be the root cause?
>
>     It is a C2 compiled frame. The bad oop could have been produced by the compiler.
>
> Actually, the top two frames are always the same in different crashes:
> v ~RuntimeStub::_new_array_Java
> J 54174 c2 
> ch.qos.logback.classic.spi.ThrowableProxy.<init>(Ljava/lang/Throwable;)V 
> (207 bytes) @ 0x00007f6687d92678 [0x00007f6687d8c700+0x0000000000005f78]
> In this case, do you think the JVM code (frame 1) or the C2 compiler code
> (frame 2) might be the issue?
> Is there any way to identify that, and what kind of debug
> flags/settings might give us this information?
>
>     It also needs detailed debug information to reach a conclusion.
>
> Do you think any of the information dumped in the hs_err* file might give
> us more info (like register contents, instructions, or the core file)?
>
> Can you please let me know what additional details might help to reach
> a conclusion? Also, how can we get that information?
>
If the crash is caused by this compiled Java method, excluding the method
from compilation is a workaround.
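
As a minimal sketch of that workaround, the Java snippet below builds a
HotSpot -XX:CompileCommand=exclude option for the constructor that shows up
in the crash frames. The class ExcludeOption and the helper excludeOption are
hypothetical names used only for illustration; the class and method names
are taken from the frames quoted above.

```java
class ExcludeOption {
    // Builds a -XX:CompileCommand=exclude option for one method.
    // className is given with dots; the option uses the slash-separated
    // form "pkg/Class.method" that HotSpot accepts for method patterns.
    static String excludeOption(String className, String methodName) {
        return "-XX:CompileCommand=exclude,"
                + className.replace('.', '/') + "." + methodName;
    }

    public static void main(String[] args) {
        // "<init>" is the JVM-level name of a constructor.
        System.out.println(
                excludeOption("ch.qos.logback.classic.spi.ThrowableProxy", "<init>"));
    }
}
```

The printed option goes on the java command line. Quote it in a shell,
because '<' and '>' are redirection characters. A wildcard pattern such as
ch/qos/logback/classic/spi/ThrowableProxy.* would exclude every method of
the class instead of only the constructor.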

You can switch to the Java thread (the one printed out at the crash) and
compare the failing frame in the GC thread with the frames in the Java
thread, so you will know which frame contained the bad oop. You will also
know what kind of frame it is: compiled, interpreted, or native.
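
To tell the frame kinds apart in the hs_err file, the one-letter prefix on
each frame line is enough. The sketch below is a hypothetical helper
(HsErrFrame is not part of any JDK tool) that maps the prefix to the legend
printed in the hs_err file and, for a compiled frame, checks that the pc
equals the code start plus the offset shown in brackets. It assumes the
frame is on a single line, as it is in the hs_err file itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HsErrFrame {
    // Matches "@ 0x<pc> [0x<start>+0x<offset>]" at the end of a compiled frame line.
    private static final Pattern PC = Pattern.compile(
            "@ 0x([0-9a-fA-F]+) \\[0x([0-9a-fA-F]+)\\+0x([0-9a-fA-F]+)\\]");

    // Classifies a frame line by its leading letter, following the hs_err legend:
    // J=compiled Java code, j=interpreted, V=VM code, v=VM generated stub, C=native code.
    static String kind(String frameLine) {
        String s = frameLine.trim();
        if (s.isEmpty()) {
            return "unknown";
        }
        switch (s.charAt(0)) {
            case 'J': return "compiled Java code";
            case 'j': return "interpreted Java code";
            case 'V': return "VM code";
            case 'v': return "VM generated stub";
            case 'C': return "native code";
            default:  return "unknown";
        }
    }

    // Returns true when the line carries a pc that equals start + offset.
    static boolean pcMatchesOffset(String frameLine) {
        Matcher m = PC.matcher(frameLine);
        if (!m.find()) {
            return false;
        }
        long pc = Long.parseUnsignedLong(m.group(1), 16);
        long start = Long.parseUnsignedLong(m.group(2), 16);
        long offset = Long.parseUnsignedLong(m.group(3), 16);
        return pc == start + offset;
    }

    public static void main(String[] args) {
        String stub = "v  ~RuntimeStub::_new_array_Java";
        String compiled = "J 54174 c2 ch.qos.logback.classic.spi.ThrowableProxy."
                + "<init>(Ljava/lang/Throwable;)V (207 bytes) @ 0x00007f6687d92678 "
                + "[0x00007f6687d8c700+0x0000000000005f78]";
        System.out.println(kind(stub));
        System.out.println(kind(compiled));
        System.out.println(pcMatchesOffset(compiled));
    }
}
```

Running the same classification over the frames of the GC thread and of the
printed Java thread gives a quick view of which side of the comparison is
compiled code and which is VM code.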


Yumin


>     Thanks
>
>     Yumin
>
>>     2. I am thinking of excluding the
>>     ch.qos.logback.classic.spi.ThrowableProxy class from compilation and running in
>>     production to see if compilation of this method is the cause. Does that make
>>     sense?
>>
>>     3. Any other suggestions on debugging this further?
>>
>>     TIA
>>     Sundar
>
>
> Thanks
> Sundar



More information about the hotspot-gc-dev mailing list