RFR: [8u] OOME in SurrogateLockerThread deadlocks the GC cycle
Aleksey Shipilev
shade at redhat.com
Tue Oct 3 07:44:47 UTC 2017
http://cr.openjdk.java.net/~shade/shenandoah/8u-pll-deadlock/webrev.01/
This bug is 8u-only, and it leads to deadlocks, especially with newer, more aggressive tests like the
ones in the new batch of backport changes.
It goes like this. There is a SurrogateLockerThread (SLT) that runs in the VMOperation prolog and
epilog to lock the ReferenceProcessor::pending_list_lock (PLL). The trouble is that SLT is a
JavaThread, and it has to perform a write barrier (WB) on the PLL oop before locking it. That barrier
may run into OOME, which *blocks* the SLT from returning, and thus effectively blocks the VMOperation
in its prolog, deadlocking the GC cycle and the entire VM with it.
We have mitigated this earlier by pre-evacuating the PLL oop at final mark. However, this is not
sufficient, because that pre-evac may itself fail (!), and then the deadlock unfolds again. We can
break the deadlock by letting SLT continue into Full GC at that point. Also, moving the PLL evac
earlier makes the failure less likely (though it still happens in tests).
This machinery is gone in 9+, and thus only 8u is affected.
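
To make the cycle easier to see, here is a toy model in plain Java. It is not HotSpot code, and every
name in it (SltDeadlockSketch, runProlog, the latches) is hypothetical. It assumes that a failed
barrier leaves the helper thread waiting for something that only the finished GC cycle can provide,
while the prolog in turn waits for the helper. The second flag models the proposed way out, where the
helper continues and merely records that a Full GC is needed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model only: none of these names exist in HotSpot.
final class SltDeadlockSketch {

    enum Outcome { COMPLETED, COMPLETED_NEEDS_FULL_GC, DEADLOCKED }

    /**
     * Simulates one VM operation prolog that waits for a helper thread
     * (standing in for the SLT) to take the lock.
     *
     * @param barrierFails       models the barrier on the lock object running out of memory
     * @param continueIntoFullGC models the fix: on failure, proceed and request a Full GC
     */
    static Outcome runProlog(boolean barrierFails, boolean continueIntoFullGC) {
        CountDownLatch pllLocked = new CountDownLatch(1);    // helper -> prolog: "lock is taken"
        CountDownLatch gcCycleDone = new CountDownLatch(1);  // prolog -> helper: "cycle finished"
        AtomicBoolean fullGCRequested = new AtomicBoolean(false);

        Thread helper = new Thread(() -> {
            try {
                if (barrierFails) {
                    if (continueIntoFullGC) {
                        // Fixed behavior: do not block, let the cycle degrade to Full GC.
                        fullGCRequested.set(true);
                    } else {
                        // Assumed buggy behavior: wait for the GC cycle, which is
                        // itself waiting for this thread. Nobody can make progress.
                        gcCycleDone.await();
                    }
                }
                pllLocked.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "toy-slt");
        helper.setDaemon(true);
        helper.start();

        try {
            // A real VM would wait forever here; the timeout only makes the hang observable.
            boolean locked = pllLocked.await(500, TimeUnit.MILLISECONDS);
            if (!locked) {
                helper.interrupt();  // clean up the stuck helper in the simulation
                return Outcome.DEADLOCKED;
            }
            gcCycleDone.countDown();
            return fullGCRequested.get() ? Outcome.COMPLETED_NEEDS_FULL_GC : Outcome.COMPLETED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for helper", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runProlog(false, false)); // barrier succeeds
        System.out.println(runProlog(true, false));  // barrier fails, helper blocks
        System.out.println(runProlog(true, true));   // barrier fails, helper continues
    }
}
```

In this model only the (true, false) combination hangs until the timeout; letting the helper continue
turns the hang into a completed prolog with a pending Full GC request.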
Testing: hotspot_gc_shenandoah, {fastdebug|release} (+ backport tests that used to reliably time out
due to deadlocks)
Thanks,
-Aleksey