Crash when using java debugger and kafka
Christopher Ng
cng1067 at gmail.com
Thu Oct 31 10:16:23 UTC 2019
Hmm...not great news. I pulled the latest build from
https://builds.shipilev.net/openjdk-shenandoah-jdk8/
(openjdk-shenandoah-jdk8-latest-linux-x86_64-release.tar.xz
38M 2019-Oct-30 15:14).
The bug does not occur when running jdb/mvn (as in the test case), but it
still occurs when debugging from IntelliJ IDEA. It's slightly better: now
only 1 core locks up :). Disabling class unloading does seem to stop the
error from occurring.
This is the -verbose:gc log from one instance when it did freeze (and then
I kill -9'd it):
Consider -XX:+ClassUnloadingWithConcurrentMark if large pause times are observed on class-unloading sensitive workloads
Heuristics ergonomically sets -XX:+ExplicitGCInvokesConcurrent
Heuristics ergonomically sets -XX:+ShenandoahImplicitGCInvokesConcurrent
Connected to the target VM, address: '127.0.0.1:0', transport: 'socket'
| 2019-10-31 10:13:19,016 INFO org.facboy.KafkaTestServer: Starting kafka server.
| 2019-10-31 10:13:19,032 INFO org.facboy.KafkaTestServer: ZooKeeper instance is successfully started on port 35216
Trigger: Metadata GC Threshold
[Concurrent reset 104M->104M(784M), 0.297 ms]
[Pause Init Mark (process weakrefs) (unload classes), 2.922 ms]
[Concurrent marking (process weakrefs) (unload classes) 105M->105M(784M), 6.062 ms]
[Concurrent precleaning 105M->105M(784M), 0.679 ms]
Disconnected from the target VM, address: '127.0.0.1:0', transport: 'socket'
Process finished with exit code 137 (interrupted by signal 9: SIGKILL)
[Pause Final Mark (process weakrefs) (unload classes)
On Thu, Oct 31, 2019 at 8:10 AM Aleksey Shipilev <shade at redhat.com> wrote:
> On 10/30/19 10:09 PM, Aleksey Shipilev wrote:
> > On 10/30/19 8:42 PM, Roman Kennke wrote:
> >> Our recent jdk8 development introduced the new barrier model (LRB). I
> >> tried your testcase with a recent build and it does not happen for me.
> >> Can you try a suitable build from here:
> >>
> >> https://builds.shipilev.net/openjdk-shenandoah-jdk8/
> >
> > I actually tried that, and the hang is gone.
> >
> > I can reproduce the hang with 8u232, it indeed livelocks during Final Mark:
> >
> > Heuristics ergonomically sets -XX:+ShenandoahImplicitGCInvokesConcurrent
> > Trigger: Metadata GC Threshold
> > 2.361: [Concurrent reset 88128K->88128K(2016M), 2.955 ms]
> > 2.364: [Pause Init Mark (process weakrefs) (unload classes), 6.880 ms]
> > 2.371: [Concurrent marking (process weakrefs) (unload classes) 88128K->89152K(2016M), 5.414 ms]
> > 2.376: [Concurrent precleaning 89152K->89152K(2016M), 1.677 ms]
> > 2.378: [Pause Final Mark (process weakrefs) (unload classes)
> >
> > ...when doing JvmtiTagMap::weak_oops_do. I might study how that happens in 8u232, to make sure head
> > sh/jdk8 does not work by accident.
>
> Found it. We seem to be entering JvmtiTagMap::do_weak_oops by all GC threads when evacuating the
> roots, which breaks in all sorts of weird ways when it tries to resize the underlying hash table
> racily.
>
> Here is the fix for 8u232:
>
> diff -r 309b496da750 src/share/vm/gc_implementation/shenandoah/shenandoahRootProcessor.cpp
> --- a/src/share/vm/gc_implementation/shenandoah/shenandoahRootProcessor.cpp  Thu Oct 10 18:16:48 2019 +0100
> +++ b/src/share/vm/gc_implementation/shenandoah/shenandoahRootProcessor.cpp  Thu Oct 31 09:06:51 2019 +0100
> @@ -257,11 +257,11 @@
>    if (blobs != NULL) {
>      ShenandoahWorkerTimingsTracker timer(worker_times, ShenandoahPhaseTimings::CodeCacheRoots, worker_id);
>      _coderoots_cset_iterator.possibly_parallel_blobs_do(blobs);
>    }
>
> -  if (_evacuation_tasks->is_task_claimed(SHENANDOAH_EVAC_jvmti_oops_do)) {
> +  if (!_evacuation_tasks->is_task_claimed(SHENANDOAH_EVAC_jvmti_oops_do)) {
>      ShenandoahForwardedIsAliveClosure is_alive;
>      ShenandoahWorkerTimingsTracker timer(worker_times, ShenandoahPhaseTimings::JVMTIRoots, worker_id);
>      JvmtiExport::weak_oops_do(&is_alive, oops);
>    }
>  }
>
> It was *accidentally* fixed with LRB backport:
>
>
> https://hg.openjdk.java.net/shenandoah/jdk8/hotspot/file/e9d60bdac4b5/src/share/vm/gc_implementation/shenandoah/shenandoahRootProcessor.cpp#l296
>
> Bottom line:
> 1) 8u232 is broken with lots of JVMTI tags (which is the case for debugging);
> 2) 8u232 workaround is to treat all JVMTI tags as strongly reachable (-XX:-ClassUnloading)
> 3) Current sh/jdk8 is immune to this, by happy accident, and would continue to be immune.
>
> Christopher, does this work for you?
>
> --
> Thanks,
> -Aleksey
>
>
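
For anyone following along, here is a minimal sketch of the claim-once guard that the quoted 8u232 patch appears to restore. It is not HotSpot code: ClaimOnceTask and count_entries are hypothetical names made up for illustration, and the sketch assumes is_task_claimed() returns false only to the first caller and true to everyone after. Workers are simulated one after another in a single thread, so it only shows how many of them would enter the guarded section, not the race itself.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for a claim-once flag; NOT the HotSpot class.
class ClaimOnceTask {
 public:
  // Assumed semantics: returns false for the first caller (who thereby
  // claims the task) and true for every later caller.
  bool is_task_claimed() {
    return _claimed.exchange(true);
  }

 private:
  std::atomic<bool> _claimed{false};
};

// Counts how many of `workers` callers would run the guarded section.
// inverted == true mimics the condition removed by the quoted patch,
// inverted == false mimics the condition it adds.
std::size_t count_entries(std::size_t workers, bool inverted) {
  ClaimOnceTask task;
  std::size_t entered = 0;
  for (std::size_t i = 0; i < workers; ++i) {
    const bool claimed = task.is_task_claimed();
    const bool enter = inverted ? claimed : !claimed;
    if (enter) {
      ++entered;
    }
  }
  return entered;
}
```

Under those assumptions, count_entries(4, false) gives 1 (only the claiming worker runs the section), while count_entries(4, true) gives 3 (every worker except the claimer runs it), which would be consistent with several GC threads ending up in the JVMTI tag map code at once.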