Need help on debugging JVM crash
Sundara Mohan M
m.sundar85 at gmail.com
Tue Mar 3 16:02:24 UTC 2020
Hi,
I am seeing JVM crashes on our system in GC Thread with parallel gc on
x86 linux. Observed the same crash happening on JVM-11.0.6/13.0.2/13.0.1 GA
builds.
Adding some logs lines to give some context.
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f669c964311, pid=66684, tid=71106
#
# JRE version: OpenJDK Runtime Environment (13.0.1+9) (build 13.0.1+9)
# Java VM: OpenJDK 64-Bit Server VM (13.0.1+9, mixed mode, tiered, parallel
gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0xcd3311] PCMarkAndPushClosure::do_oop(oopDesc**)+0x51
#
# No core dump will be written. Core dumps have been disabled. To enable
core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
# https://github.com/AdoptOpenJDK/openjdk-build/issues
#
Host: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz, 48 cores, 125G, Red Hat
Enterprise Linux Server release 6.10 (Santiago)
Time: Thu Feb 6 11:43:48 2020 UTC elapsed time: 198626 seconds (2d 7h 10m
26s)
Following is the stack trace
ex1:
Stack: [0x00007fd01cbdb000,0x00007fd01ccdb000], sp=0x00007fd01ccd8890,
free space=1014k
Native frames: (J=compiled Java code, A=aot compiled Java code,
j=interpreted, Vv=VM code, C=native code)
*V [libjvm.so+0xcc0121] PCMarkAndPushClosure::do_oop(oopDesc**)+0x51*
V [libjvm.so+0xc58c8b] OopMapSet::oops_do(frame const*, RegisterMap
const*, OopClosure*)+0x2eb
V [libjvm.so+0x7521e9] frame::oops_do_internal(OopClosure*,
CodeBlobClosure*, RegisterMap*, bool)+0x99
V [libjvm.so+0xf55757] JavaThread::oops_do(OopClosure*,
CodeBlobClosure*)+0x187
V [libjvm.so+0xcbb100] ThreadRootsMarkingTask::do_it(GCTaskManager*,
unsigned int)+0xb0
V [libjvm.so+0x7e0f8b] GCTaskThread::run()+0x1eb
V [libjvm.so+0xf5d43d] Thread::call_run()+0x10d
V [libjvm.so+0xc74337] thread_native_entry(Thread*)+0xe7
JavaThread 0x00007fbeb9209800 (nid = 82380) was being processed
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v ~RuntimeStub::_new_array_Java
J 62465 c2
ch.qos.logback.classic.spi.ThrowableProxy.<init>(Ljava/lang/Throwable;)V
(207 bytes) @ 0x00007fd00ad43704 [0x00007fd00ad41420+0x00000000000022e4]
J 474206 c2
org.eclipse.jetty.util.log.JettyAwareLogger.log(Lorg/slf4j/Marker;ILjava/lang/String;[Ljava/lang/Object;Ljava/lang/Throwable;)V
(134 bytes) @ 0x00007fd00c4e81ec [0x00007fd00c4e7ee0+0x000000000000030c]
j
org.eclipse.jetty.util.log.JettyAwareLogger.warn(Ljava/lang/String;Ljava/lang/Throwable;)V+7
j
org.eclipse.jetty.util.log.Slf4jLog.warn(Ljava/lang/String;Ljava/lang/Throwable;)V+6
j
org.eclipse.jetty.server.HttpChannel.handleException(Ljava/lang/Throwable;)V+181
j
org.eclipse.jetty.server.HttpChannelOverHttp.handleException(Ljava/lang/Throwable;)V+13
J 64106 c2 org.eclipse.jetty.server.HttpChannel.handle()Z (997 bytes) @
0x00007fd00c6d2cd4 [0x00007fd00c6cdec0+0x0000000000004e14]
J 280430 c2 org.eclipse.jetty.server.HttpConnection.onFillable()V (334
bytes) @ 0x00007fd00da925f0 [0x00007fd00da91e40+0x00000000000007b0]
J 41979 c2 org.eclipse.jetty.io.ChannelEndPoint$2.run()V (12 bytes) @
0x00007fd00a14f604 [0x00007fd00a14f4e0+0x0000000000000124]
J 86362 c2 org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run()V
(565 bytes) @ 0x00007fd0087d7e34 [0x00007fd0087d7cc0+0x0000000000000174]
J 75998 c2 java.lang.Thread.run()V java.base at 13.0.2 (17 bytes) @
0x00007fd00c93b8d8 [0x00007fd00c93b8a0+0x0000000000000038]
v ~StubRoutines::call_stub
ex2:
Stack: [0x00007f669869f000,0x00007f669879f000], sp=0x00007f669879c890,
free space=1014k
Native frames: (J=compiled Java code, A=aot compiled Java code,
j=interpreted, Vv=VM code, C=native code)
*V [libjvm.so+0xcd3311] PCMarkAndPushClosure::do_oop(oopDesc**)+0x51*V
[libjvm.so+0xc6bf0b] OopMapSet::oops_do(frame const*, RegisterMap const*,
OopClosure*)+0x2eb
V [libjvm.so+0x765489] frame::oops_do_internal(OopClosure*,
CodeBlobClosure*, RegisterMap*, bool)+0x99
V [libjvm.so+0xf68b17] JavaThread::oops_do(OopClosure*,
CodeBlobClosure*)+0x187
V [libjvm.so+0xcce2f0] ThreadRootsMarkingTask::do_it(GCTaskManager*,
unsigned int)+0xb0
V [libjvm.so+0x7f422b] GCTaskThread::run()+0x1eb
V [libjvm.so+0xf707fd] Thread::call_run()+0x10d
V [libjvm.so+0xc875b7] thread_native_entry(Thread*)+0xe7
JavaThread 0x00007f5518004000 (nid = 75659) was being processed
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v ~RuntimeStub::_new_array_Java
J 54174 c2
ch.qos.logback.classic.spi.ThrowableProxy.<init>(Ljava/lang/Throwable;)V
(207 bytes) @ 0x00007f6687d92678 [0x00007f6687d8c700+0x0000000000005f78]
J 334031 c2
com.xmas.webservice.exception.ExceptionLoggingWrapper.execute()V (1004
bytes) @ 0x00007f6686ede430 [0x00007f6686edd580+0x0000000000000eb0]
J 53431 c2
com.xmas.webservice.exception.mapper.AbstractExceptionMapper.toResponse(Lcom/xmas/beans/exceptions/mapper/V3ErrorCode;Ljava/lang/Exception;)Ljavax/ws/rs/core/Response;
(105 bytes) @ 0x00007f6687db88b0 [0x00007f6687db8660+0x0000000000000250]
J 63819 c2
com.xmas.webservice.exception.mapper.RequestBlockedExceptionMapper.toResponse(Ljava/lang/Exception;)Ljavax/ws/rs/core/Response;
(9 bytes) @ 0x00007f6686a6ed9c [0x00007f6686a6ecc0+0x00000000000000dc]
J 334032 c2
com.xmas.webservice.filters.ResponseSerializationWorker.processException()Ljava/io/InputStream;
(332 bytes) @ 0x00007f668992ad34 [0x00007f668992a840+0x00000000000004f4]
J 403918 c2
com.xmas.webservice.filters.ResponseSerializationWorker.execute()Z (272
bytes) @ 0x00007f66869d67fc [0x00007f66869d5e80+0x000000000000097c]
J 17530 c2 com.lafaspot.common.concurrent.internal.WorkerWrapper.execute()Z
(208 bytes) @ 0x00007f66848b3708 [0x00007f66848b36a0+0x0000000000000068]
J 31970% c2
com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Lcom/lafaspot/common/concurrent/internal/WorkerManagerState;
(486 bytes) @ 0x00007f668608dcb0 [0x00007f668608d5e0+0x00000000000006d0]
j
com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Ljava/lang/Object;+1
J 4889 c1 java.util.concurrent.FutureTask.run()V java.base at 13.0.1 (123
bytes) @ 0x00007f667d0be604 [0x00007f667d0bdf80+0x0000000000000684]
J 7487 c1
java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V
java.base at 13.0.1 (187 bytes) @ 0x00007f667dd45854
[0x00007f667dd44a60+0x0000000000000df4]
J 7486 c1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V
java.base at 13.0.1 (9 bytes) @ 0x00007f667d1f643c
[0x00007f667d1f63c0+0x000000000000007c]
J 7078 c1 java.lang.Thread.run()V java.base at 13.0.1 (17 bytes) @
0x00007f667d1f2d74 [0x00007f667d1f2c40+0x0000000000000134]
v ~StubRoutines::call_stub
Not very frequent but ~90 days ~120 crashes with following signal
siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr:
0x0000000000000000
This signal is generated when we try to access non canonical address in
linux.
As suggested by Stefan in another thread i tried to
add VerifyAfterGC/VerifyBeforeGC but it seems to increase the latency and
applications not surviving our production traffic(timing out and requests
are failing).
Questions
1. When i looked at source code for printing stack trace i see following
https://github.com/openjdk/jdk11u/blob/master/src/hotspot/share/utilities/vmError.cpp#L696
(Prints native stack trace)
https://github.com/openjdk/jdk11u/blob/master/src/hotspot/share/utilities/vmError.cpp#L718
(printing Java thread stack trace if it is involved in GC crash)
a. How do you know this java thread was involved in jvm crash?
b. Can i assume the java thread printed after native stack trace was the
culprit?
c. Since i am seeing the same frame (~RuntimeStub::_new_array_Java, J
54174 c2 ch.qos.logback.classic.spi.ThrowableProxy.<init>..) but different
stack trace in both crashes can this be the root cause?
2. Thinking of excluding compilation
of ch.qos.logback.classic.spi.ThrowableProxy class and running in
production to see if compilation of this method is the cause. Does it make
sense?
3. Any other suggestion on debugging this further?
TIA
Sundar
More information about the hotspot-gc-dev
mailing list