Parallel GC Thread crash

Sundara Mohan M m.sundar85 at gmail.com
Fri Feb 28 18:57:12 UTC 2020


Hi Stefan,
    I tried running with -XX:+VerifyBeforeGC and -XX:+VerifyAfterGC, but it
seems some of the operations are timing out (e.g. SSL connect; I am not sure
whether my timeout is just very low or whether these flags increase the
latency).
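
For reference, a minimal sketch of how these checks get enabled (the gc.log
and app.jar names below are just placeholders for illustration; the heap and
GC flags are from our existing command line):

    java -XX:+UnlockDiagnosticVMOptions \
         -XX:+VerifyBeforeGC -XX:+VerifyAfterGC \
         -Xlog:gc*:file=gc.log \
         -Xms64000m -Xmx64000m -XX:+UseParallelGC \
         -jar app.jar

The verification passes walk the heap before and after every collection, so
each GC pause gets noticeably longer; that alone could explain the SSL
connect timeouts.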

Also, from the crash error log I see the following:
...
  0x00007f0588066000 GCTaskThread "ParGC Thread#20" [stack: 0x00007ef5c4d13000,0x00007ef5c4e13000] [id=87534]
=>0x00007f0588068000 GCTaskThread "ParGC Thread#21" [stack: 0x00007ef5c4c12000,0x00007ef5c4d12000] [id=87535]
  0x00007f0588069800 GCTaskThread "ParGC Thread#22" [stack: 0x00007ef5c4b11000,0x00007ef5c4c11000] [id=87536]
...
Threads with active compile tasks:


VM state:at safepoint (normal execution)

VM Mutex/Monitor currently owned by a thread:  ([mutex/lock_event])
[0x00007f0588015750] Threads_lock - owner thread: 0x00007f0588112800
[0x00007f0588016350] Heap_lock - owner thread: 0x00007ef4100f9000

Does this mean the VM was at a safepoint, executing a GC operation or some
other JVM activity that requires a safepoint (i.e. not executing user code),
when it crashed?
I am trying to determine whether library or application code is causing this
behaviour.
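
As a sketch of one way to narrow this down (assuming JDK 9+ unified logging
is available on our build), safepoint logging should show which VM operation
the threads were stopped for around the crash time:

    java -Xlog:safepoint*=info:file=safepoint.log:time,uptime \
         ... (rest of the existing command line)

At info level each safepoint entry names the VM operation that triggered it,
which would confirm whether the crash happened inside a GC-initiated
safepoint rather than while running user code.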

Thanks
Sundar

On Mon, Feb 10, 2020 at 3:13 PM Stefan Karlsson <stefan.karlsson at oracle.com>
wrote:

> On 2020-02-10 20:53, Sundara Mohan M wrote:
> > Hi Stefan,
> >     Yes, we are trying to move to 13.0.2. Wanted to verify if anyone
> > else has seen this or whether upgrading will really solve this problem.
> >
> > Can you share how to file a bug report for this? I don't have access
> > to https://bugs.openjdk.java.net/
>
> There are directions in the hs_err crash file that point you to the web
> page to file a bug.
>
> You seem to be running AdoptOpenJDK builds, so your bug reports would end
> up in their system:
>  >     > # If you would like to submit a bug report, please visit:
>  >     > # https://github.com/AdoptOpenJDK/openjdk-build/issues
>
>
> If you were running with Oracle binaries you would get lines like this:
> # If you would like to submit a bug report, please visit:
> #   https://bugreport.java.com/bugreport/crash.jsp
>
> >
> > I will try to run with -XX:+VerifyBeforeGC and -XX:+VerifyAfterGC to
> > get more information.
>
> OK. Hopefully this gives us more information.
>
> StefanK
> >
> >
> > Thanks
> > Sundar
> >
> > On Mon, Feb 10, 2020 at 2:42 PM Stefan Karlsson
> > <stefan.karlsson at oracle.com <mailto:stefan.karlsson at oracle.com>> wrote:
> >
> >     Hi Sundar,
> >
> >     On 2020-02-10 19:32, Sundara Mohan M wrote:
> >     > Hi Stefan,
> >     >     We started seeing more crashes on JDK13.0.1+9
> >     >
> >     > Since seeing it on GC Task Thread assumed it is related to GC.
> >
> >     As I said in my previous mail, I don't think this is caused by GC
> >     code.
> >     More below.
> >
> >     >
> >     > # Problematic frame:
> >     > # V  [libjvm.so+0xd183c0]
> >      PSRootsClosure<false>::do_oop(oopDesc**)+0x30
> >     >
> >     > Command Line: -XX:+AlwaysPreTouch -Xms64000m -Xmx64000m
> >     > -XX:NewSize=40000m -XX:+DisableExplicitGC -Xnoclassgc
> >     > -XX:+UseParallelGC -XX:ParallelGCThreads=40 -XX:ConcGCThreads=5 ...
> >     >
> >     > Host: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz, 48 cores, 125G,
> >     Red
> >     > Hat Enterprise Linux Server release 6.10 (Santiago)
> >     > Time: Fri Feb  7 11:15:04 2020 UTC elapsed time: 286290 seconds
> >     (3d 7h
> >     > 31m 30s)
> >     >
> >     > ---------------  T H R E A D  ---------------
> >     >
> >     > Current thread (0x00007fca6c074000):  GCTaskThread "ParGC
> >     Thread#28"
> >     > [stack: 0x00007fba72ff1000,0x00007fba730f1000] [id=56530]
> >     >
> >     > Stack: [0x00007fba72ff1000,0x00007fba730f1000],
> >     >  sp=0x00007fba730ee850,  free space=1014k
> >     > Native frames: (J=compiled Java code, A=aot compiled Java code,
> >     > j=interpreted, Vv=VM code, C=native code)
> >     > V  [libjvm.so+0xd183c0]
> >      PSRootsClosure<false>::do_oop(oopDesc**)+0x30
> >     > V  [libjvm.so+0xc6bf0b]  OopMapSet::oops_do(frame const*,
> >     RegisterMap
> >     > const*, OopClosure*)+0x2eb
> >     > V  [libjvm.so+0x765489]  frame::oops_do_internal(OopClosure*,
> >     > CodeBlobClosure*, RegisterMap*, bool)+0x99
> >     > V  [libjvm.so+0xf68b17]  JavaThread::oops_do(OopClosure*,
> >     > CodeBlobClosure*)+0x187
> >     > V  [libjvm.so+0xd190be]  ThreadRootsTask::do_it(GCTaskManager*,
> >     > unsigned int)+0x6e
> >     > V  [libjvm.so+0x7f422b]  GCTaskThread::run()+0x1eb
> >     > V  [libjvm.so+0xf707fd]  Thread::call_run()+0x10d
> >     > V  [libjvm.so+0xc875b7]  thread_native_entry(Thread*)+0xe7
> >     >
> >     > JavaThread 0x00007fb8f4036800 (nid = 60927) was being processed
> >     > Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> >     > v  ~RuntimeStub::_new_array_Java
> >     > J 58520 c2
> >     >
> >
>  ch.qos.logback.classic.spi.ThrowableProxy.<init>(Ljava/lang/Throwable;)V
> >
> >     > (207 bytes) @ 0x00007fca5fd23dec
> >     [0x00007fca5fd1dbc0+0x000000000000622c]
> >     > J 66864 c2 webservice.exception.ExceptionLoggingWrapper.execute()V
> >     > (1004 bytes) @ 0x00007fca60c02588
> >     [0x00007fca60bffce0+0x00000000000028a8]
> >     > J 58224 c2
> >     >
> >
>  webservice.exception.mapper.AbstractExceptionMapper.toResponse(Lbeans/exceptions/mapper/V3ErrorCode;Ljava/lang/Exception;)Ljavax/ws/rs/core/Response;
> >
> >     > (105 bytes) @ 0x00007fca5f59bad8
> >     [0x00007fca5f59b880+0x0000000000000258]
> >     > J 69992 c2
> >     >
> >
>  webservice.exception.mapper.JediRequestBlockedExceptionMapper.toResponse(Ljava/lang/Exception;)Ljavax/ws/rs/core/Response;
> >
> >     > (9 bytes) @ 0x00007fca5e1019f4
> >     [0x00007fca5e101940+0x00000000000000b4]
> >     > J 55265 c2
> >     >
> >
>  webservice.filters.ResponseSerializationWorker.processException()Ljava/io/InputStream;
> >
> >     > (332 bytes) @ 0x00007fca5f6f58e0
> >     [0x00007fca5f6f5700+0x00000000000001e0]
> >     > J 483122 c2
> >     webservice.filters.ResponseSerializationWorker.execute()Z
> >     > (272 bytes) @ 0x00007fca622fc2b4
> >     [0x00007fca622fbc80+0x0000000000000634]
> >     > J 15811% c2
> >     >
> >
>  com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Lcom/lafaspot/common/concurrent/internal/WorkerManagerState;
> >
> >     > (486 bytes) @ 0x00007fca5c108794
> >     [0x00007fca5c1082a0+0x00000000000004f4]
> >     > j
> >     >
> >
>   com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Ljava/lang/Object;+1
> >     > J 4586 c1 java.util.concurrent.FutureTask.run()V
> >     java.base at 13.0.1 (123
> >     > bytes) @ 0x00007fca54d27184 [0x00007fca54d26b00+0x0000000000000684]
> >     > J 7550 c1
> >     >
> >
>  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V
> >
> >     > java.base at 13.0.1 (187 bytes) @ 0x00007fca54fbb6d4
> >     > [0x00007fca54fba8e0+0x0000000000000df4]
> >     > J 7549 c1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V
> >     > java.base at 13.0.1 (9 bytes) @ 0x00007fca5454b93c
> >     > [0x00007fca5454b8c0+0x000000000000007c]
> >     > J 4585 c1 java.lang.Thread.run()V java.base at 13.0.1 (17 bytes) @
> >     > 0x00007fca54d250f4 [0x00007fca54d24fc0+0x0000000000000134]
> >     > v  ~StubRoutines::call_stub
> >     >
> >     > siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr:
> >     > 0x0000000000000000
> >     >
> >     > Do JDK 11 and 13 have different code for GC? Do you think
> >     > downgrading (JDK 11 stable) / upgrading (JDK 13.0.2) might help here?
> >
> >     You should at least move to 13.0.2, to get the latest bug
> >     fixes/patches.
> >
> >     There have been a lot of changes in all areas of the JVM between 11
> >     and
> >     13. We don't yet know the root cause of this crash, and I can't
> >     say if
> >     this is caused by new changes or not. Have you or anyone filed a bug
> >     report for this?
> >
> >     > Any insight to debug this will be helpful.
> >
> >     Did you try my previous suggestion to run with -XX:+VerifyBeforeGC
> >     and
> >     -XX:+VerifyAfterGC? If you can tolerate the longer GC times it
> >     introduces, then you could try to run with
> >     -XX:+UnlockDiagnosticVMOptions -XX:+VerifyBeforeGC
> >     -XX:+VerifyAfterGC .
> >
> >     Cheers,
> >     StefanK
> >
> >     >
> >     > TIA
> >     > Sundar
> >     >
> >     > On Tue, Feb 4, 2020 at 5:47 AM Stefan Karlsson
> >     > <stefan.karlsson at oracle.com <mailto:stefan.karlsson at oracle.com>
> >     <mailto:stefan.karlsson at oracle.com
> >     <mailto:stefan.karlsson at oracle.com>>> wrote:
> >     >
> >     >     Hi Sundar,
> >     >
> >     >     The GC crashes when it encounters something bad on the stack:
> >     >      > V  [libjvm.so+0xc6bf0b] OopMapSet::oops_do(frame const*,
> >     >     RegisterMap
> >     >      > const*, OopClosure*)+0x2eb
> >     >      > V  [libjvm.so+0x765489] frame::oops_do_internal(OopClosure*,
> >     >
> >     >     This is probably not a GC bug. It's more likely that this is
> >     >     caused by
> >     >     the JIT compiler. I see in your hotspot-runtime-dev thread,
> >     that you
> >     >     also get crashes in other compiler related areas.
> >     >
> >     >     If you want to rule out the GC, you can run with
> >     >     -XX:+VerifyBeforeGC and
> >     >     -XX:+VerifyAfterGC, and see if this asserts before the GC
> >     has started
> >     >     running.
> >     >
> >     >     StefanK
> >     >
> >     >     On 2020-02-04 04:38, Sundara Mohan M wrote:
> >     >     > Hi,
> >     >     >     I am seeing following crashes frequently on our servers
> >     >     > #
> >     >     > # A fatal error has been detected by the Java Runtime
> >     Environment:
> >     >     > #
> >     >     > #  SIGSEGV (0xb) at pc=0x00007fca3281d311, pid=103575,
> >     tid=108299
> >     >     > #
> >     >     > # JRE version: OpenJDK Runtime Environment (13.0.1+9) (build
> >     >     13.0.1+9)
> >     >     > # Java VM: OpenJDK 64-Bit Server VM (13.0.1+9, mixed mode,
> >     >     tiered, parallel
> >     >     > gc, linux-amd64)
> >     >     > # Problematic frame:
> >     >     > # V  [libjvm.so+0xcd3311]
> >     >     PCMarkAndPushClosure::do_oop(oopDesc**)+0x51
> >     >     > #
> >     >     > # No core dump will be written. Core dumps have been
> disabled.
> >     >     To enable
> >     >     > core dumping, try "ulimit -c unlimited" before starting
> >     Java again
> >     >     > #
> >     >     > # If you would like to submit a bug report, please visit:
> >     >     > # https://github.com/AdoptOpenJDK/openjdk-build/issues
> >     >     > #
> >     >     >
> >     >     >
> >     >     > ---------------  T H R E A D  ---------------
> >     >     >
> >     >     > Current thread (0x00007fca2c051000): GCTaskThread "ParGC
> >     >     Thread#8" [stack:
> >     >     > 0x00007fca30277000,0x00007fca30377000] [id=108299]
> >     >     >
> >     >     > Stack: [0x00007fca30277000,0x00007fca30377000],
> >     >     sp=0x00007fca30374890,
> >     >     >   free space=1014k
> >     >     > Native frames: (J=compiled Java code, A=aot compiled Java
> >     code,
> >     >     > j=interpreted, Vv=VM code, C=native code)
> >     >     > V  [libjvm.so+0xcd3311]
> >     PCMarkAndPushClosure::do_oop(oopDesc**)+0x51
> >     >     > V  [libjvm.so+0xc6bf0b] OopMapSet::oops_do(frame const*,
> >     >     RegisterMap
> >     >     > const*, OopClosure*)+0x2eb
> >     >     > V  [libjvm.so+0x765489] frame::oops_do_internal(OopClosure*,
> >     >     > CodeBlobClosure*, RegisterMap*, bool)+0x99
> >     >     > V  [libjvm.so+0xf68b17] JavaThread::oops_do(OopClosure*,
> >     >     > CodeBlobClosure*)+0x187
> >     >     > V  [libjvm.so+0xcce2f0]
> >     >     ThreadRootsMarkingTask::do_it(GCTaskManager*,
> >     >     > unsigned int)+0xb0
> >     >     > V  [libjvm.so+0x7f422b] GCTaskThread::run()+0x1eb
> >     >     > V  [libjvm.so+0xf707fd] Thread::call_run()+0x10d
> >     >     > V  [libjvm.so+0xc875b7] thread_native_entry(Thread*)+0xe7
> >     >     >
> >     >     > JavaThread 0x00007fb85c004800 (nid = 111387) was being
> >     processed
> >     >     > Java frames: (J=compiled Java code, j=interpreted, Vv=VM
> code)
> >     >     > v  ~RuntimeStub::_new_array_Java
> >     >     > J 225122 c2
> >     >     >
> >     >
> >
>   ch.qos.logback.classic.spi.ThrowableProxy.<init>(Ljava/lang/Throwable;)V
> >     >     > (207 bytes) @ 0x00007fca21f1a5d8
> >     >     [0x00007fca21f17f20+0x00000000000026b8]
> >     >     > J 62342 c2
> >     >  webservice.exception.ExceptionLoggingWrapper.execute()V (1004
> >     >     > bytes) @ 0x00007fca20f0aec8
> >     [0x00007fca20f07f40+0x0000000000002f88]
> >     >     > J 225129 c2
> >     >     >
> >     >
> >
>   webservice.exception.mapper.AbstractExceptionMapper.toResponse(Lbeans/exceptions/mapper/V3ErrorCode;Ljava/lang/Exception;)Ljavax/ws/rs/core/Response;
> >     >     > (105 bytes) @ 0x00007fca1da512ac
> >     >     [0x00007fca1da51100+0x00000000000001ac]
> >     >     > J 131643 c2
> >     >     >
> >     >
> >
>   webservice.exception.mapper.RequestBlockedExceptionMapper.toResponse(Ljava/lang/Exception;)Ljavax/ws/rs/core/Response;
> >     >     > (9 bytes) @ 0x00007fca20ce6190
> >     >     [0x00007fca20ce60c0+0x00000000000000d0]
> >     >     > J 55114 c2
> >     >     >
> >     >
> >
>   webservice.filters.ResponseSerializationWorker.processException()Ljava/io/InputStream;
> >     >     > (332 bytes) @ 0x00007fca2051fe64
> >     >     [0x00007fca2051f820+0x0000000000000644]
> >     >     > J 57859 c2
> >     >  webservice.filters.ResponseSerializationWorker.execute()Z (272
> >     >     > bytes) @ 0x00007fca1ef2ed18
> >     [0x00007fca1ef2e140+0x0000000000000bd8]
> >     >     > J 16114% c2
> >     >     >
> >     >
> >
>   com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Lcom/lafaspot/common/concurrent/internal/WorkerManagerState;
> >     >     > (486 bytes) @ 0x00007fca1ced465c
> >     >     [0x00007fca1ced4200+0x000000000000045c]
> >     >     > j
> >     >     >
> >     >
> >
>    com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Ljava/lang/Object;+1
> >     >     > J 11639 c2 java.util.concurrent.FutureTask.run()V
> >     >     java.base at 13.0.1 (123
> >     >     > bytes) @ 0x00007fca1cd00858
> >     [0x00007fca1cd007c0+0x0000000000000098]
> >     >     > J 7560 c1
> >     >     >
> >     >
> >
>   java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V
> >     >     > java.base at 13.0.1 (187 bytes) @ 0x00007fca15b23f54
> >     >     > [0x00007fca15b23160+0x0000000000000df4]
> >     >     > J 5143 c1
> >     java.util.concurrent.ThreadPoolExecutor$Worker.run()V
> >     >     > java.base at 13.0.1 (9 bytes) @ 0x00007fca15b39abc
> >     >     > [0x00007fca15b39a40+0x000000000000007c]
> >     >     > J 4488 c1 java.lang.Thread.run()V java.base at 13.0.1 (17
> >     bytes) @
> >     >     > 0x00007fca159fc174 [0x00007fca159fc040+0x0000000000000134]
> >     >     > v  ~StubRoutines::call_stub
> >     >     >
> >     >     > siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL),
> >     si_addr:
> >     >     > 0x0000000000000000
> >     >     >
> >     >     > Register to memory mapping:
> >     >     > ...
> >     >     >
> >     >     > Can someone shed more info on when this can happen? I am
> >     seeing
> >     >     this on
> >     >     > multiple servers with Java 13.0.1+9 on RHEL6 servers.
> >     >     >
> >     >     > There was another thread in hotspot runtime where David
> Holmes
> >     >     pointed this
> >     >     >> siginfo: si_signo: 11 (SIGSEGV), si_code: 128
> >     (SI_KERNEL), si_addr:
> >     >     > 0x0000000000000000
> >     >     >
> >     >     >> This seems it may be related to:
> >     >     >> https://bugs.openjdk.java.net/browse/JDK-8004124
> >     >     >
> >     >     > Just wondering if this is same or something to do with GC
> >     specific.
> >     >     >
> >     >     >
> >     >     >
> >     >     > TIA
> >     >     > Sundar
> >     >     >
> >     >
> >
>
>


