VM deadlock between VM shutdown and G1

Tue Nov 13 12:57:20 UTC 2018

Hi Krystal,

On Thu, 2018-09-13 at 01:57 -0700, Krystal Mok wrote:
> Hi HotSpot devs,
> 
> I'd like to report a VM deadlock situation we've experienced in
> production a few weeks ago on JDK8. I checked the latest jdk/jdk code
> and the same race condition is still there.
> 
> I can file a JBS ticket if this looks like a valid problem.
> 
> tl;dr: there's a race condition between the VM shutdown and G1, in
> that the VM shutdown sequence may have just terminated all the
> concurrent GC threads, and at the same time G1 has just started an
> incremental collection cycle and is waiting for the concurrent marker
> to finish. Since there's no concurrent GC thread alive to update the
> status anymore, G1 waits indefinitely while holding a safepoint,
> causing the VM to deadlock.

  I looked through the sources, and I think the situation you
describe can't happen in the jdk/jdk sources.

The analysis assumes that the concurrent mark threads are actually
stopped and shut down properly at the point of the
CollectedHeap::stop() call. However, this is not the case:
CollectedHeap::stop() does not stop the marking threads at all (which
is a different issue, depending on your POV). I.e. the root region scan
will finish eventually.
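
To make the two cases concrete, here is a minimal Java sketch of the
waiting pattern under discussion. It is not HotSpot code: the class,
the flag and the method names are all hypothetical, and the timeout
exists only so that the sketch terminates (the wait described in the
report is unbounded). If the "marker" thread is alive it clears the flag
and the waiter proceeds; if it was never started or was already gone,
nobody clears the flag and the waiter would hang.

```java
import java.util.concurrent.TimeUnit;

// Illustrative only: not HotSpot code, all names are hypothetical.
public class RootRegionWaitSketch {
    private final Object lock = new Object();
    private boolean scanInProgress = true;

    // Called by the "marker" thread once its scan is done.
    void scanFinished() {
        synchronized (lock) {
            scanInProgress = false;
            lock.notifyAll();
        }
    }

    // Returns true if the scan finished, false if the timeout expired
    // (or the waiter was interrupted) while the flag was still set.
    boolean waitUntilScanFinished(long timeoutMillis) {
        long deadline = System.nanoTime()
                + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            while (scanInProgress) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    return false; // an unbounded wait would hang here
                }
                try {
                    TimeUnit.NANOSECONDS.timedWait(lock, remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    // markerAlive == false models "the thread that should clear the
    // flag no longer exists".
    static boolean run(boolean markerAlive) {
        RootRegionWaitSketch s = new RootRegionWaitSketch();
        if (markerAlive) {
            new Thread(s::scanFinished, "marker-sketch").start();
        }
        return s.waitUntilScanFinished(500);
    }

    public static void main(String[] args) {
        System.out.println("marker alive:   " + run(true));
        System.out.println("marker stopped: " + run(false));
    }
}
```

With the marker running, run(true) should return true almost
immediately; run(false) returns false only because of the artificial
500 ms timeout.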

Also the concurrent mark threads should show up in your stack traces -
unfortunately the stack traces you posted did not include them.

After some digging, I think that what you are experiencing on JDK8u
is JDK-8148992 and/or JDK-8151336. Both were fixed in 9
but simply never backported to 8u.

I am likely to close the issue I opened (JDK-8210706) as a duplicate of
JDK-8148992 unless new evidence shows up, as I also did not manage to
reproduce the issue.

We can try to get JDK-8148992 and JDK-8151336 backported instead if it
is a significant issue to you.

Thanks,
  Thomas