Parallel GC Thread crash

Stefan Karlsson stefan.karlsson at oracle.com
Fri Feb 28 19:18:15 UTC 2020


Hi Sundar,

On 2020-02-28 19:57, Sundara Mohan M wrote:
> Hi Stefan,
>     I tried running with -XX:+VerifyBeforeGC and -XX:+VerifyAfterGC 
> but it seems some of the operations are timing out (ex. ssl connect 
> not sure if i have very low timeout or this flag increases the latency)

The flag increase the latencies, because it runs extra verification 
checks in the pauses.

>
> Also from the crash error log i see following
> ...
>   0x00007f0588066000 GCTaskThread "ParGC Thread#20" [stack: 
> 0x00007ef5c4d13000,0x00007ef5c4e13000] [id=87534]
> *=>0x00007f0588068000 GCTaskThread "ParGC Thread#21" [stack: 
> 0x00007ef5c4c12000,0x00007ef5c4d12000] [id=87535]
> *  0x00007f0588069800 GCTaskThread "ParGC Thread#22" [stack: 
> 0x00007ef5c4b11000,0x00007ef5c4c11000] [id=87536]
> ...
> Threads with active compile tasks:
>
> *VM state:at safepoint (normal execution)
> *
>
> VM Mutex/Monitor currently owned by a thread:  ([mutex/lock_event])
> [0x00007f0588015750] Threads_lock - owner thread: 0x00007f0588112800
> [0x00007f0588016350] Heap_lock - owner thread: 0x00007ef4100f9000
>
> Does this mean VM is in safepoint and executing GC operation or JVM 
> related activity which requires to be in safepoint (ie. not executing 
> user code) when it crashed?

Yes, exactly. The Parallel GC does all work in a stop-the-world pause.

StefanK

> I am trying to see if library or application code is causing this 
> behaviour.
>
> Thanks
> Sundar
>
> On Mon, Feb 10, 2020 at 3:13 PM Stefan Karlsson 
> <stefan.karlsson at oracle.com <mailto:stefan.karlsson at oracle.com>> wrote:
>
>     On 2020-02-10 20:53, Sundara Mohan M wrote:
>     > Hi Stefan,
>     >     Yes we are trying to move to 13.0.2. Wanted to verify if anyone
>     > else seen this or upgrading will really solve this problem.
>     >
>     > Can you share how to file a bug report for this? I don't have
>     access
>     > to https://bugs.openjdk.java.net/
>
>     There are directions in the hs_err crash file that points you to
>     the web
>     page to file a bug.
>
>     You seem to be running AdoptJDK builds so your bug reports would
>     end up
>     at their system:
>      >     > # If you would like to submit a bug report, please visit:
>      >     > # https://github.com/AdoptOpenJDK/openjdk-build/issues
>     <https://urldefense.com/v3/__https://github.com/AdoptOpenJDK/openjdk-build/issues__;!!GqivPVa7Brio!PT70czeXdrqB3tGHjnmJA2Ds3bxId1GpNsjq5FfqX84mHS8aVWbikUqEuOqvwXQ0vVOn$>
>
>
>     If you were running with Oracle binaries you would get lines like
>     this:
>     # If you would like to submit a bug report, please visit:
>     # https://bugreport.java.com/bugreport/crash.jsp
>     <https://urldefense.com/v3/__https://bugreport.java.com/bugreport/crash.jsp__;!!GqivPVa7Brio!PT70czeXdrqB3tGHjnmJA2Ds3bxId1GpNsjq5FfqX84mHS8aVWbikUqEuOqvwcxkrNHY$>
>
>     >
>     > I will try to run with -XX:+VerifyBeforeGC and
>     -XX:+VerifyAfterGC to
>     > get more information.
>
>     OK. Hopefully this gives us more information.
>
>     StefanK
>     >
>     >
>     > Thanks
>     > Sundar
>     >
>     > On Mon, Feb 10, 2020 at 2:42 PM Stefan Karlsson
>     > <stefan.karlsson at oracle.com <mailto:stefan.karlsson at oracle.com>
>     <mailto:stefan.karlsson at oracle.com
>     <mailto:stefan.karlsson at oracle.com>>> wrote:
>     >
>     >     Hi Sundar,
>     >
>     >     On 2020-02-10 19:32, Sundara Mohan M wrote:
>     >     > Hi Stefan,
>     >     >     We started seeing more crashes on JDK13.0.1+9
>     >     >
>     >     > Since seeing it on GC Task Thread assumed it is related to GC.
>     >
>     >     As I said in my previous mail, I don't think this is caused
>     by GC
>     >     code.
>     >     More below.
>     >
>     >     >
>     >     > # Problematic frame:
>     >     > # V  [libjvm.so+0xd183c0]
>     >      PSRootsClosure<false>::do_oop(oopDesc**)+0x30
>     >     >
>     >     > Command Line: -XX:+AlwaysPreTouch -Xms64000m -Xmx64000m
>     >     > -XX:NewSize=40000m -XX:+DisableExplicitGC -Xnoclassgc
>     >     > -XX:+UseParallelGC -XX:ParallelGCThreads=40 -XX:ConcGCTh
>     >     > reads=5 ...
>     >     >
>     >     > Host: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz, 48 cores,
>     125G,
>     >     Red
>     >     > Hat Enterprise Linux Server release 6.10 (Santiago)
>     >     > Time: Fri Feb  7 11:15:04 2020 UTC elapsed time: 286290
>     seconds
>     >     (3d 7h
>     >     > 31m 30s)
>     >     >
>     >     > ---------------  T H R E A D  ---------------
>     >     >
>     >     > Current thread (0x00007fca6c074000):  GCTaskThread "ParGC
>     >     Thread#28"
>     >     > [stack: 0x00007fba72ff1000,0x00007fba730f1000] [id=56530]
>     >     >
>     >     > Stack: [0x00007fba72ff1000,0x00007fba730f1000],
>     >     >  sp=0x00007fba730ee850,  free space=1014k
>     >     > Native frames: (J=compiled Java code, A=aot compiled Java
>     code,
>     >     > j=interpreted, Vv=VM code, C=native code)
>     >     > V  [libjvm.so+0xd183c0]
>     >      PSRootsClosure<false>::do_oop(oopDesc**)+0x30
>     >     > V  [libjvm.so+0xc6bf0b]  OopMapSet::oops_do(frame const*,
>     >     RegisterMap
>     >     > const*, OopClosure*)+0x2eb
>     >     > V  [libjvm.so+0x765489]  frame::oops_do_internal(OopClosure*,
>     >     > CodeBlobClosure*, RegisterMap*, bool)+0x99
>     >     > V  [libjvm.so+0xf68b17]  JavaThread::oops_do(OopClosure*,
>     >     > CodeBlobClosure*)+0x187
>     >     > V  [libjvm.so+0xd190be]
>      ThreadRootsTask::do_it(GCTaskManager*,
>     >     > unsigned int)+0x6e
>     >     > V  [libjvm.so+0x7f422b]  GCTaskThread::run()+0x1eb
>     >     > V  [libjvm.so+0xf707fd]  Thread::call_run()+0x10d
>     >     > V  [libjvm.so+0xc875b7]  thread_native_entry(Thread*)+0xe7
>     >     >
>     >     > JavaThread 0x00007fb8f4036800 (nid = 60927) was being
>     processed
>     >     > Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
>     >     > v  ~RuntimeStub::_new_array_Java
>     >     > J 58520 c2
>     >     >
>     >
>      ch.qos.logback.classic.spi.ThrowableProxy.<init>(Ljava/lang/Throwable;)V
>     >
>     >     > (207 bytes) @ 0x00007fca5fd23dec
>     >     [0x00007fca5fd1dbc0+0x000000000000622c]
>     >     > J 66864 c2
>     webservice.exception.ExceptionLoggingWrapper.execute()V
>     >     > (1004 bytes) @ 0x00007fca60c02588
>     >     [0x00007fca60bffce0+0x00000000000028a8]
>     >     > J 58224 c2
>     >     >
>     >
>      webservice.exception.mapper.AbstractExceptionMapper.toResponse(Lbeans/exceptions/mapper/V3ErrorCode;Ljava/lang/Exception;)Ljavax/ws/rs/core/Response;
>     >
>     >     > (105 bytes) @ 0x00007fca5f59bad8
>     >     [0x00007fca5f59b880+0x0000000000000258]
>     >     > J 69992 c2
>     >     >
>     >
>      webservice.exception.mapper.JediRequestBlockedExceptionMapper.toResponse(Ljava/lang/Exception;)Ljavax/ws/rs/core/Response;
>     >
>     >     > (9 bytes) @ 0x00007fca5e1019f4
>     >     [0x00007fca5e101940+0x00000000000000b4]
>     >     > J 55265 c2
>     >     >
>     >
>      webservice.filters.ResponseSerializationWorker.processException()Ljava/io/InputStream;
>     >
>     >     > (332 bytes) @ 0x00007fca5f6f58e0
>     >     [0x00007fca5f6f5700+0x00000000000001e0]
>     >     > J 483122 c2
>     >  webservice.filters.ResponseSerializationWorker.execute()Z
>     >     > (272 bytes) @ 0x00007fca622fc2b4
>     >     [0x00007fca622fbc80+0x0000000000000634]
>     >     > J 15811% c2
>     >     >
>     >
>      com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Lcom/lafaspot/common/concurrent/internal/WorkerManagerState;
>     >
>     >     > (486 bytes) @ 0x00007fca5c108794
>     >     [0x00007fca5c1082a0+0x00000000000004f4]
>     >     > j
>     >     >
>     >
>       com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Ljava/lang/Object;+1
>     >     > J 4586 c1 java.util.concurrent.FutureTask.run()V
>     >     java.base at 13.0.1 (123
>     >     > bytes) @ 0x00007fca54d27184
>     [0x00007fca54d26b00+0x0000000000000684]
>     >     > J 7550 c1
>     >     >
>     >
>      java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V
>     >
>     >     > java.base at 13.0.1 (187 bytes) @ 0x00007fca54fbb6d4
>     >     > [0x00007fca54fba8e0+0x0000000000000df4]
>     >     > J 7549 c1
>     java.util.concurrent.ThreadPoolExecutor$Worker.run()V
>     >     > java.base at 13.0.1 (9 bytes) @ 0x00007fca5454b93c
>     >     > [0x00007fca5454b8c0+0x000000000000007c]
>     >     > J 4585 c1 java.lang.Thread.run()V java.base at 13.0.1 (17
>     bytes) @
>     >     > 0x00007fca54d250f4 [0x00007fca54d24fc0+0x0000000000000134]
>     >     > v  ~StubRoutines::call_stub
>     >     >
>     >     > siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL),
>     si_addr:
>     >     > 0x0000000000000000
>     >     >
>     >     > Does JDK11 and 13 have different code for GC. Do you think
>     >     > downgrading(JDK11 stable)/upgrading(JDK-13.0.2) might help
>     here?
>     >
>     >     You should at least move to 13.0.2, to get the latest bug
>     >     fixes/patches.
>     >
>     >     There has been a lot of changes in all areas of the JVM
>     between 11
>     >     and
>     >     13. We don't yet know the root cause of this crash, and I can't
>     >     say if
>     >     this is caused by new changes or not. Have you or anyone
>     filed a bug
>     >     report for this?
>     >
>     >     > Any insight to debug this will be helpful.
>     >
>     >     Did you try my previous suggestion to run with
>     -XX:+VerifyBeforeGC
>     >     and
>     >     -XX:+VerifyAfterGC? If you can tolerate the longer GC times it
>     >     introduces, then you could try to run with
>     >     -XX:+UnlockDiagnosticVMOptions -XX:+VerifyBeforeGC
>     >     -XX:+VerifyAfterGC .
>     >
>     >     Cheers,
>     >     StefanK
>     >
>     >     >
>     >     > TIA
>     >     > Sundar
>     >     >
>     >     > On Tue, Feb 4, 2020 at 5:47 AM Stefan Karlsson
>     >     > <stefan.karlsson at oracle.com
>     <mailto:stefan.karlsson at oracle.com>
>     <mailto:stefan.karlsson at oracle.com
>     <mailto:stefan.karlsson at oracle.com>>
>     >     <mailto:stefan.karlsson at oracle.com
>     <mailto:stefan.karlsson at oracle.com>
>     >     <mailto:stefan.karlsson at oracle.com
>     <mailto:stefan.karlsson at oracle.com>>>> wrote:
>     >     >
>     >     >     Hi Sundar,
>     >     >
>     >     >     The GC crashes when it encounters something bad on the
>     stack:
>     >     >      > V  [libjvm.so+0xc6bf0b] OopMapSet::oops_do(frame
>     const*,
>     >     >     RegisterMap
>     >     >      > const*, OopClosure*)+0x2eb
>     >     >      > V  [libjvm.so+0x765489]
>     frame::oops_do_internal(OopClosure*,
>     >     >
>     >     >     This is probably not a GC bug. It's more likely that
>     this is
>     >     >     caused by
>     >     >     the JIT compiler. I see in your hotspot-runtime-dev
>     thread,
>     >     that you
>     >     >     also get crashes in other compiler related areas.
>     >     >
>     >     >     If you want to rule out the GC, you can run with
>     >     >     -XX:+VerifyBeforeGC and
>     >     >     -XX:+VerifyAfterGC, and see if this asserts before the GC
>     >     has started
>     >     >     running.
>     >     >
>     >     >     StefanK
>     >     >
>     >     >     On 2020-02-04 04:38, Sundara Mohan M wrote:
>     >     >     > Hi,
>     >     >     >     I am seeing following crashes frequently on our
>     servers
>     >     >     > #
>     >     >     > # A fatal error has been detected by the Java Runtime
>     >     Environment:
>     >     >     > #
>     >     >     > #  SIGSEGV (0xb) at pc=0x00007fca3281d311, pid=103575,
>     >     tid=108299
>     >     >     > #
>     >     >     > # JRE version: OpenJDK Runtime Environment
>     (13.0.1+9) (build
>     >     >     13.0.1+9)
>     >     >     > # Java VM: OpenJDK 64-Bit Server VM (13.0.1+9, mixed
>     mode,
>     >     >     tiered, parallel
>     >     >     > gc, linux-amd64)
>     >     >     > # Problematic frame:
>     >     >     > # V  [libjvm.so+0xcd3311]
>     >     >     PCMarkAndPushClosure::do_oop(oopDesc**)+0x51
>     >     >     > #
>     >     >     > # No core dump will be written. Core dumps have been
>     disabled.
>     >     >     To enable
>     >     >     > core dumping, try "ulimit -c unlimited" before starting
>     >     Java again
>     >     >     > #
>     >     >     > # If you would like to submit a bug report, please
>     visit:
>     >     >     > #
>     https://github.com/AdoptOpenJDK/openjdk-build/issues
>     <https://urldefense.com/v3/__https://github.com/AdoptOpenJDK/openjdk-build/issues__;!!GqivPVa7Brio!PT70czeXdrqB3tGHjnmJA2Ds3bxId1GpNsjq5FfqX84mHS8aVWbikUqEuOqvwXQ0vVOn$>
>     >     >     > #
>     >     >     >
>     >     >     >
>     >     >     > ---------------  T H R E A D ---------------
>     >     >     >
>     >     >     > Current thread (0x00007fca2c051000): GCTaskThread "ParGC
>     >     >     Thread#8" [stack:
>     >     >     > 0x00007fca30277000,0x00007fca30377000] [id=108299]
>     >     >     >
>     >     >     > Stack: [0x00007fca30277000,0x00007fca30377000],
>     >     >     sp=0x00007fca30374890,
>     >     >     >   free space=1014k
>     >     >     > Native frames: (J=compiled Java code, A=aot compiled
>     Java
>     >     code,
>     >     >     > j=interpreted, Vv=VM code, C=native code)
>     >     >     > V  [libjvm.so+0xcd3311]
>     >     PCMarkAndPushClosure::do_oop(oopDesc**)+0x51
>     >     >     > V  [libjvm.so+0xc6bf0b] OopMapSet::oops_do(frame const*,
>     >     >     RegisterMap
>     >     >     > const*, OopClosure*)+0x2eb
>     >     >     > V  [libjvm.so+0x765489]
>     frame::oops_do_internal(OopClosure*,
>     >     >     > CodeBlobClosure*, RegisterMap*, bool)+0x99
>     >     >     > V  [libjvm.so+0xf68b17] JavaThread::oops_do(OopClosure*,
>     >     >     > CodeBlobClosure*)+0x187
>     >     >     > V  [libjvm.so+0xcce2f0]
>     >     >  ThreadRootsMarkingTask::do_it(GCTaskManager*,
>     >     >     > unsigned int)+0xb0
>     >     >     > V  [libjvm.so+0x7f422b] GCTaskThread::run()+0x1eb
>     >     >     > V  [libjvm.so+0xf707fd] Thread::call_run()+0x10d
>     >     >     > V  [libjvm.so+0xc875b7]
>     thread_native_entry(Thread*)+0xe7
>     >     >     >
>     >     >     > JavaThread 0x00007fb85c004800 (nid = 111387) was being
>     >     processed
>     >     >     > Java frames: (J=compiled Java code, j=interpreted,
>     Vv=VM code)
>     >     >     > v  ~RuntimeStub::_new_array_Java
>     >     >     > J 225122 c2
>     >     >     >
>     >     >
>     >
>       ch.qos.logback.classic.spi.ThrowableProxy.<init>(Ljava/lang/Throwable;)V
>     >     >     > (207 bytes) @ 0x00007fca21f1a5d8
>     >     >     [0x00007fca21f17f20+0x00000000000026b8]
>     >     >     > J 62342 c2
>     >     >  webservice.exception.ExceptionLoggingWrapper.execute()V (1004
>     >     >     > bytes) @ 0x00007fca20f0aec8
>     >     [0x00007fca20f07f40+0x0000000000002f88]
>     >     >     > J 225129 c2
>     >     >     >
>     >     >
>     >
>       webservice.exception.mapper.AbstractExceptionMapper.toResponse(Lbeans/exceptions/mapper/V3ErrorCode;Ljava/lang/Exception;)Ljavax/ws/rs/core/Response;
>     >     >     > (105 bytes) @ 0x00007fca1da512ac
>     >     >     [0x00007fca1da51100+0x00000000000001ac]
>     >     >     > J 131643 c2
>     >     >     >
>     >     >
>     >
>       webservice.exception.mapper.RequestBlockedExceptionMapper.toResponse(Ljava/lang/Exception;)Ljavax/ws/rs/core/Response;
>     >     >     > (9 bytes) @ 0x00007fca20ce6190
>     >     >     [0x00007fca20ce60c0+0x00000000000000d0]
>     >     >     > J 55114 c2
>     >     >     >
>     >     >
>     >
>       webservice.filters.ResponseSerializationWorker.processException()Ljava/io/InputStream;
>     >     >     > (332 bytes) @ 0x00007fca2051fe64
>     >     >     [0x00007fca2051f820+0x0000000000000644]
>     >     >     > J 57859 c2
>     >     >  webservice.filters.ResponseSerializationWorker.execute()Z
>     (272
>     >     >     > bytes) @ 0x00007fca1ef2ed18
>     >     [0x00007fca1ef2e140+0x0000000000000bd8]
>     >     >     > J 16114% c2
>     >     >     >
>     >     >
>     >
>       com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Lcom/lafaspot/common/concurrent/internal/WorkerManagerState;
>     >     >     > (486 bytes) @ 0x00007fca1ced465c
>     >     >     [0x00007fca1ced4200+0x000000000000045c]
>     >     >     > j
>     >     >     >
>     >     >
>     >
>        com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Ljava/lang/Object;+1
>     >     >     > J 11639 c2 java.util.concurrent.FutureTask.run()V
>     >     >     java.base at 13.0.1 (123
>     >     >     > bytes) @ 0x00007fca1cd00858
>     >     [0x00007fca1cd007c0+0x0000000000000098]
>     >     >     > J 7560 c1
>     >     >     >
>     >     >
>     >
>       java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V
>     >     >     > java.base at 13.0.1 (187 bytes) @ 0x00007fca15b23f54
>     >     >     > [0x00007fca15b23160+0x0000000000000df4]
>     >     >     > J 5143 c1
>     >     java.util.concurrent.ThreadPoolExecutor$Worker.run()V
>     >     >     > java.base at 13.0.1 (9 bytes) @ 0x00007fca15b39abc
>     >     >     > [0x00007fca15b39a40+0x000000000000007c]
>     >     >     > J 4488 c1 java.lang.Thread.run()V java.base at 13.0.1 (17
>     >     bytes) @
>     >     >     > 0x00007fca159fc174
>     [0x00007fca159fc040+0x0000000000000134]
>     >     >     > v  ~StubRoutines::call_stub
>     >     >     >
>     >     >     > siginfo: si_signo: 11 (SIGSEGV), si_code: 128
>     (SI_KERNEL),
>     >     si_addr:
>     >     >     > 0x0000000000000000
>     >     >     >
>     >     >     > Register to memory mapping:
>     >     >     > ...
>     >     >     >
>     >     >     > Can someone shed more info on when this can happen? I am
>     >     seeing
>     >     >     this on
>     >     >     > multiple servers with Java 13.0.1+9 on RHEL6 servers.
>     >     >     >
>     >     >     > There was another thread in hotspot runtime where
>     David Holmes
>     >     >     pointed this
>     >     >     >> siginfo: si_signo: 11 (SIGSEGV), si_code: 128
>     >     (SI_KERNEL), si_addr:
>     >     >     > 0x0000000000000000
>     >     >     >
>     >     >     >> This seems it may be related to:
>     >     >     >> https://bugs.openjdk.java.net/browse/JDK-8004124
>     >     >     >
>     >     >     > Just wondering if this is same or something to do
>     with GC
>     >     specific.
>     >     >     >
>     >     >     >
>     >     >     >
>     >     >     > TIA
>     >     >     > Sundar
>     >     >     >
>     >     >
>     >
>




More information about the hotspot-gc-dev mailing list