Parallel GC Thread crash
Stefan Karlsson
stefan.karlsson at oracle.com
Fri Feb 28 19:18:15 UTC 2020
Hi Sundar,
On 2020-02-28 19:57, Sundara Mohan M wrote:
> Hi Stefan,
> I tried running with -XX:+VerifyBeforeGC and -XX:+VerifyAfterGC,
> but it seems some of the operations are timing out (e.g. SSL connect;
> not sure if I have a very low timeout or if these flags increase the latency).
The flags increase the latencies, because they run extra verification
checks in the GC pauses.
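
For reference, a full invocation would look roughly like this (a sketch
only: the flags are the ones discussed in this thread and the heap sizes
are from your command line, but "MyApp" is a placeholder):

  java -XX:+UnlockDiagnosticVMOptions \
       -XX:+VerifyBeforeGC -XX:+VerifyAfterGC \
       -XX:+UseParallelGC -Xms64000m -Xmx64000m \
       MyApp

Since the verification runs inside every pause, every pause gets longer,
so if your connect timeouts are tight it might be safer to try this on a
canary instance rather than the full production fleet.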
>
> Also, from the crash error log I see the following:
> ...
> 0x00007f0588066000 GCTaskThread "ParGC Thread#20" [stack: 0x00007ef5c4d13000,0x00007ef5c4e13000] [id=87534]
> =>0x00007f0588068000 GCTaskThread "ParGC Thread#21" [stack: 0x00007ef5c4c12000,0x00007ef5c4d12000] [id=87535]
> 0x00007f0588069800 GCTaskThread "ParGC Thread#22" [stack: 0x00007ef5c4b11000,0x00007ef5c4c11000] [id=87536]
> ...
> Threads with active compile tasks:
>
> VM state:at safepoint (normal execution)
>
> VM Mutex/Monitor currently owned by a thread: ([mutex/lock_event])
> [0x00007f0588015750] Threads_lock - owner thread: 0x00007f0588112800
> [0x00007f0588016350] Heap_lock - owner thread: 0x00007ef4100f9000
>
> Does this mean the VM was at a safepoint, executing a GC operation or
> some other JVM activity that must run at a safepoint (i.e. not executing
> user code), when it crashed?
Yes, exactly. The Parallel GC does all work in a stop-the-world pause.
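
If you want to see how long those pauses are (and how much the
verification flags add to them), unified logging shows it; a minimal
sketch, assuming JDK 13's -Xlog syntax, with vm.log as a placeholder
file name:

  java -Xlog:gc*,safepoint:file=vm.log:time,uptime ... MyApp

The safepoint entries in vm.log tell you how long all Java threads were
stopped, which is exactly the window in which your connects can time out.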
StefanK
> I am trying to see whether library or application code is causing this
> behaviour.
>
> Thanks
> Sundar
>
> On Mon, Feb 10, 2020 at 3:13 PM Stefan Karlsson
> <stefan.karlsson at oracle.com> wrote:
>
> On 2020-02-10 20:53, Sundara Mohan M wrote:
> > Hi Stefan,
> > Yes, we are trying to move to 13.0.2. I wanted to verify whether anyone
> > else has seen this, or whether upgrading will really solve the problem.
> >
> > Can you share how to file a bug report for this? I don't have access
> > to https://bugs.openjdk.java.net/
>
> There are directions in the hs_err crash file that point you to the web
> page where you can file a bug.
>
> You seem to be running AdoptOpenJDK builds, so your bug reports would
> end up in their system:
> > > # If you would like to submit a bug report, please visit:
> > > #   https://github.com/AdoptOpenJDK/openjdk-build/issues
>
>
> If you were running with Oracle binaries, you would get lines like this:
> # If you would like to submit a bug report, please visit:
> # https://bugreport.java.com/bugreport/crash.jsp
>
> >
> > I will try to run with -XX:+VerifyBeforeGC and -XX:+VerifyAfterGC to
> > get more information.
>
> OK. Hopefully this gives us more information.
>
> StefanK
> >
> >
> > Thanks
> > Sundar
> >
> > On Mon, Feb 10, 2020 at 2:42 PM Stefan Karlsson
> > <stefan.karlsson at oracle.com> wrote:
> >
> > Hi Sundar,
> >
> > On 2020-02-10 19:32, Sundara Mohan M wrote:
> > > Hi Stefan,
> > > We started seeing more crashes on JDK 13.0.1+9.
> > >
> > > Since it shows up on a GC task thread, I assumed it is related to GC.
> >
> > As I said in my previous mail, I don't think this is caused by GC
> > code. More below.
> >
> > >
> > > # Problematic frame:
> > > # V  [libjvm.so+0xd183c0]  PSRootsClosure<false>::do_oop(oopDesc**)+0x30
> > >
> > > Command Line: -XX:+AlwaysPreTouch -Xms64000m -Xmx64000m -XX:NewSize=40000m -XX:+DisableExplicitGC -Xnoclassgc -XX:+UseParallelGC -XX:ParallelGCThreads=40 -XX:ConcGCThreads=5 ...
> > >
> > > Host: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz, 48 cores, 125G, Red Hat Enterprise Linux Server release 6.10 (Santiago)
> > > Time: Fri Feb 7 11:15:04 2020 UTC elapsed time: 286290 seconds (3d 7h 31m 30s)
> > >
> > > ---------------  T H R E A D  ---------------
> > >
> > > Current thread (0x00007f0588066000 is elsewhere; here 0x00007fca6c074000):  GCTaskThread "ParGC Thread#28" [stack: 0x00007fba72ff1000,0x00007fba730f1000] [id=56530]
> > >
> > > Stack: [0x00007fba72ff1000,0x00007fba730f1000],  sp=0x00007fba730ee850,  free space=1014k
> > > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
> > > V  [libjvm.so+0xd183c0]  PSRootsClosure<false>::do_oop(oopDesc**)+0x30
> > > V  [libjvm.so+0xc6bf0b]  OopMapSet::oops_do(frame const*, RegisterMap const*, OopClosure*)+0x2eb
> > > V  [libjvm.so+0x765489]  frame::oops_do_internal(OopClosure*, CodeBlobClosure*, RegisterMap*, bool)+0x99
> > > V  [libjvm.so+0xf68b17]  JavaThread::oops_do(OopClosure*, CodeBlobClosure*)+0x187
> > > V  [libjvm.so+0xd190be]  ThreadRootsTask::do_it(GCTaskManager*, unsigned int)+0x6e
> > > V  [libjvm.so+0x7f422b]  GCTaskThread::run()+0x1eb
> > > V  [libjvm.so+0xf707fd]  Thread::call_run()+0x10d
> > > V  [libjvm.so+0xc875b7]  thread_native_entry(Thread*)+0xe7
> > >
> > > JavaThread 0x00007fb8f4036800 (nid = 60927) was being processed
> > > Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> > > v  ~RuntimeStub::_new_array_Java
> > > J 58520 c2 ch.qos.logback.classic.spi.ThrowableProxy.<init>(Ljava/lang/Throwable;)V (207 bytes) @ 0x00007fca5fd23dec [0x00007fca5fd1dbc0+0x000000000000622c]
> > > J 66864 c2 webservice.exception.ExceptionLoggingWrapper.execute()V (1004 bytes) @ 0x00007fca60c02588 [0x00007fca60bffce0+0x00000000000028a8]
> > > J 58224 c2 webservice.exception.mapper.AbstractExceptionMapper.toResponse(Lbeans/exceptions/mapper/V3ErrorCode;Ljava/lang/Exception;)Ljavax/ws/rs/core/Response; (105 bytes) @ 0x00007fca5f59bad8 [0x00007fca5f59b880+0x0000000000000258]
> > > J 69992 c2 webservice.exception.mapper.JediRequestBlockedExceptionMapper.toResponse(Ljava/lang/Exception;)Ljavax/ws/rs/core/Response; (9 bytes) @ 0x00007fca5e1019f4 [0x00007fca5e101940+0x00000000000000b4]
> > > J 55265 c2 webservice.filters.ResponseSerializationWorker.processException()Ljava/io/InputStream; (332 bytes) @ 0x00007fca5f6f58e0 [0x00007fca5f6f5700+0x00000000000001e0]
> > > J 483122 c2 webservice.filters.ResponseSerializationWorker.execute()Z (272 bytes) @ 0x00007fca622fc2b4 [0x00007fca622fbc80+0x0000000000000634]
> > > J 15811% c2 com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Lcom/lafaspot/common/concurrent/internal/WorkerManagerState; (486 bytes) @ 0x00007fca5c108794 [0x00007fca5c1082a0+0x00000000000004f4]
> > > j  com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Ljava/lang/Object;+1
> > > J 4586 c1 java.util.concurrent.FutureTask.run()V java.base@13.0.1 (123 bytes) @ 0x00007fca54d27184 [0x00007fca54d26b00+0x0000000000000684]
> > > J 7550 c1 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V java.base@13.0.1 (187 bytes) @ 0x00007fca54fbb6d4 [0x00007fca54fba8e0+0x0000000000000df4]
> > > J 7549 c1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V java.base@13.0.1 (9 bytes) @ 0x00007fca5454b93c [0x00007fca5454b8c0+0x000000000000007c]
> > > J 4585 c1 java.lang.Thread.run()V java.base@13.0.1 (17 bytes) @ 0x00007fca54d250f4 [0x00007fca54d24fc0+0x0000000000000134]
> > > v  ~StubRoutines::call_stub
> > >
> > > siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr: 0x0000000000000000
> > >
> > > Do JDK 11 and 13 have different GC code? Do you think downgrading
> > > (to stable JDK 11) or upgrading (to JDK 13.0.2) might help here?
> >
> > You should at least move to 13.0.2 to get the latest bug fixes/patches.
> >
> > There have been a lot of changes in all areas of the JVM between 11
> > and 13. We don't yet know the root cause of this crash, and I can't say
> > whether it is caused by new changes or not. Have you or anyone else
> > filed a bug report for this?
> >
> > > Any insight into debugging this would be helpful.
> >
> > Did you try my previous suggestion to run with -XX:+VerifyBeforeGC
> > and -XX:+VerifyAfterGC? If you can tolerate the longer GC times they
> > introduce, then you could try to run with -XX:+UnlockDiagnosticVMOptions
> > -XX:+VerifyBeforeGC -XX:+VerifyAfterGC.
> >
> > Cheers,
> > StefanK
> >
> > >
> > > TIA
> > > Sundar
> > >
> > > On Tue, Feb 4, 2020 at 5:47 AM Stefan Karlsson
> > > <stefan.karlsson at oracle.com> wrote:
> > >
> > > Hi Sundar,
> > >
> > > The GC crashes when it encounters something bad on the stack:
> > > > V  [libjvm.so+0xc6bf0b]  OopMapSet::oops_do(frame const*, RegisterMap const*, OopClosure*)+0x2eb
> > > > V  [libjvm.so+0x765489]  frame::oops_do_internal(OopClosure*,
> > >
> > > This is probably not a GC bug. It's more likely that this is caused
> > > by the JIT compiler. I see in your hotspot-runtime-dev thread that
> > > you also get crashes in other compiler-related areas.
> > >
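> > > Another way to test the compiler theory (a sketch, not a verified
> > > fix) would be to exclude the topmost compiled method in the crashing
> > > Java stack from JIT compilation and see if the crashes stop or move:
> > >
> > >   -XX:CompileCommand=exclude,ch.qos.logback.classic.spi.ThrowableProxy::<init>
> > >
> > > (Quote the option in your shell because of the angle brackets.) That
> > > method then runs interpreted, which costs some performance but keeps
> > > everything else compiled.
> > >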
> > > If you want to rule out the GC, you can run with -XX:+VerifyBeforeGC
> > > and -XX:+VerifyAfterGC, and see if this asserts before the GC has
> > > started running.
> > >
> > > StefanK
> > >
> > > On 2020-02-04 04:38, Sundara Mohan M wrote:
> > > > Hi,
> > > > I am seeing the following crashes frequently on our servers:
> > > > #
> > > > # A fatal error has been detected by the Java Runtime Environment:
> > > > #
> > > > # SIGSEGV (0xb) at pc=0x00007fca3281d311, pid=103575, tid=108299
> > > > #
> > > > # JRE version: OpenJDK Runtime Environment (13.0.1+9) (build 13.0.1+9)
> > > > # Java VM: OpenJDK 64-Bit Server VM (13.0.1+9, mixed mode, tiered, parallel gc, linux-amd64)
> > > > # Problematic frame:
> > > > # V  [libjvm.so+0xcd3311]  PCMarkAndPushClosure::do_oop(oopDesc**)+0x51
> > > > #
> > > > # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> > > > #
> > > > # If you would like to submit a bug report, please visit:
> > > > #   https://github.com/AdoptOpenJDK/openjdk-build/issues
> > > > #
> > > >
> > > > ---------------  T H R E A D  ---------------
> > > >
> > > > Current thread (0x00007fca2c051000):  GCTaskThread "ParGC Thread#8" [stack: 0x00007fca30277000,0x00007fca30377000] [id=108299]
> > > >
> > > > Stack: [0x00007fca30277000,0x00007fca30377000],  sp=0x00007fca30374890,  free space=1014k
> > > > Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
> > > > V  [libjvm.so+0xcd3311]  PCMarkAndPushClosure::do_oop(oopDesc**)+0x51
> > > > V  [libjvm.so+0xc6bf0b]  OopMapSet::oops_do(frame const*, RegisterMap const*, OopClosure*)+0x2eb
> > > > V  [libjvm.so+0x765489]  frame::oops_do_internal(OopClosure*, CodeBlobClosure*, RegisterMap*, bool)+0x99
> > > > V  [libjvm.so+0xf68b17]  JavaThread::oops_do(OopClosure*, CodeBlobClosure*)+0x187
> > > > V  [libjvm.so+0xcce2f0]  ThreadRootsMarkingTask::do_it(GCTaskManager*, unsigned int)+0xb0
> > > > V  [libjvm.so+0x7f422b]  GCTaskThread::run()+0x1eb
> > > > V  [libjvm.so+0xf707fd]  Thread::call_run()+0x10d
> > > > V  [libjvm.so+0xc875b7]  thread_native_entry(Thread*)+0xe7
> > > >
> > > > JavaThread 0x00007fb85c004800 (nid = 111387) was being processed
> > > > Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> > > > v  ~RuntimeStub::_new_array_Java
> > > > J 225122 c2 ch.qos.logback.classic.spi.ThrowableProxy.<init>(Ljava/lang/Throwable;)V (207 bytes) @ 0x00007fca21f1a5d8 [0x00007fca21f17f20+0x00000000000026b8]
> > > > J 62342 c2 webservice.exception.ExceptionLoggingWrapper.execute()V (1004 bytes) @ 0x00007fca20f0aec8 [0x00007fca20f07f40+0x0000000000002f88]
> > > > J 225129 c2 webservice.exception.mapper.AbstractExceptionMapper.toResponse(Lbeans/exceptions/mapper/V3ErrorCode;Ljava/lang/Exception;)Ljavax/ws/rs/core/Response; (105 bytes) @ 0x00007fca1da512ac [0x00007fca1da51100+0x00000000000001ac]
> > > > J 131643 c2 webservice.exception.mapper.RequestBlockedExceptionMapper.toResponse(Ljava/lang/Exception;)Ljavax/ws/rs/core/Response; (9 bytes) @ 0x00007fca20ce6190 [0x00007fca20ce60c0+0x00000000000000d0]
> > > > J 55114 c2 webservice.filters.ResponseSerializationWorker.processException()Ljava/io/InputStream; (332 bytes) @ 0x00007fca2051fe64 [0x00007fca2051f820+0x0000000000000644]
> > > > J 57859 c2 webservice.filters.ResponseSerializationWorker.execute()Z (272 bytes) @ 0x00007fca1ef2ed18 [0x00007fca1ef2e140+0x0000000000000bd8]
> > > > J 16114% c2 com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Lcom/lafaspot/common/concurrent/internal/WorkerManagerState; (486 bytes) @ 0x00007fca1ced465c [0x00007fca1ced4200+0x000000000000045c]
> > > > j  com.lafaspot.common.concurrent.internal.WorkerManagerOneThread.call()Ljava/lang/Object;+1
> > > > J 11639 c2 java.util.concurrent.FutureTask.run()V java.base@13.0.1 (123 bytes) @ 0x00007fca1cd00858 [0x00007fca1cd007c0+0x0000000000000098]
> > > > J 7560 c1 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V java.base@13.0.1 (187 bytes) @ 0x00007fca15b23f54 [0x00007fca15b23160+0x0000000000000df4]
> > > > J 5143 c1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V java.base@13.0.1 (9 bytes) @ 0x00007fca15b39abc [0x00007fca15b39a40+0x000000000000007c]
> > > > J 4488 c1 java.lang.Thread.run()V java.base@13.0.1 (17 bytes) @ 0x00007fca159fc174 [0x00007fca159fc040+0x0000000000000134]
> > > > v  ~StubRoutines::call_stub
> > > >
> > > > siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr: 0x0000000000000000
> > > >
> > > > Register to memory mapping:
> > > > ...
> > > >
> > > > Can someone shed more light on when this can happen? I am seeing
> > > > this on multiple RHEL 6 servers with Java 13.0.1+9.
> > > >
> > > > There was another thread on hotspot-runtime-dev where David Holmes
> > > > pointed this out:
> > > >> siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr: 0x0000000000000000
> > > >
> > > >> This seems it may be related to:
> > > >> https://bugs.openjdk.java.net/browse/JDK-8004124
> > > >
> > > > Just wondering if this is the same issue or something GC-specific.
> > > >
> > > >
> > > >
> > > > TIA
> > > > Sundar
> > > >
> > >
> >
>