JVM hanging when using G1GC on JDK8 b78 or b79 (Linux 32 bit)
John Cuthbertson
john.cuthbertson at oracle.com
Wed Mar 6 18:04:16 UTC 2013
Hi Everyone,
All:
I've looked at the bug report (haven't tried to reproduce it yet) and
Bengt's analysis is correct. The concurrent mark thread is entering the
synchronization protocol in a marking step call. That code is waiting
for some non-existent workers to terminate before proceeding. Normally
we shouldn't be entering that code but I think we overflowed the global
marking stack (I updated the CR at ~1am my time with that conjecture). I
think I missed a set_phase() call to tell the parallel terminator that
we only have one thread and it's picking up the number of workers that
executed the remark parallel task.
Thomas: you were on the right track with your comment about the marking
stack size.
David:
Thanks for helping out here. The stack trace you mentioned was for one of
the refinement threads - a concurrent GC thread. When a concurrent GC
thread "joins" the suspendible thread set, it means that it will observe
and participate in safepoint operations, i.e. the thread will notice
that it should reach a safepoint and the safepoint synchronizer code
will wait for it to block. When we wish a concurrent GC thread to not
observe safepoints, that thread leaves the suspendible thread set. I
think the name could be a bit better and Tony, before he left, had a
change that used a scoped object to join and leave the STS that hasn't
been integrated yet. IIRC Tony wasn't happy with the name he chose for
that also.
Uwe:
Thanks for bringing this up and my apologies for not replying sooner. I
will have a fix fairly soon. If I'm correct about it being caused by
overflowing the marking stack you can work around the issue by
increasing the MarkStackSize. You could try increasing it to 2M or 4M
entries (which is the current max size).
Cheers,
JohnC
On 3/6/2013 5:43 AM, Thomas Schatzl wrote:
> Hi,
>
> On Wed, 2013-03-06 at 13:49 +0100, Uwe Schindler wrote:
>> Hi Bengt,
>>
>> That was fast! We are happy that you were able to analyze the bug and will fix it soon. To not make our Jenkins server get stuck in the tests, I will disable G1GC until a new update is installed. We will then only test the other garbage collectors with Lucene.
>>
>> Do you have an idea why this bug is not appearing on 64 bit? It might be caused by other GC behavior as the word size is different (the Lucene tests use -Xmx512M, so it's fixed in 32 and 64 bit at the moment). I just want to understand this! I can run the test suite with 64 bit JDK over and over, it never hangs. But when running with 32 bit it hangs in all cases.
> one possible reason is that the default mark stack size is much larger
> on 64 bit, so no mark stack overflow occurs.
>
> E.g. in globals.hpp:
>
> product(uintx, MarkStackSizeMax, NOT_LP64(4*M) LP64_ONLY(512*M), \
>
> You may want to try to set MarkStackSizeMax to 4M on 64 bit too to test
> this.
>
> This is just a hunch though.
>
> Thomas
>
>