RFR (S) 8137099: OoME with G1 GC before doing a full GC
Siebenborn, Axel
axel.siebenborn at sap.com
Thu Sep 24 15:13:11 UTC 2015
Hi,
we regularly see OutOfMemoryErrors with G1 in our stress tests.
The same tests, run with the same heap size, do not show this problem
with ParallelGC or CMS.
The stress tests are based on real world application code with a lot of
threads.
Scenario:
We have an application with a lot of threads that spend time in critical
native sections.
1. An evacuation failure happens during a GC.
2. After clean-up work, the safepoint is left.
3. Another thread can't allocate and triggers a new incremental GC.
4. A thread that can't allocate after an incremental GC triggers a
full GC. However, the GC doesn't start because another thread
started an incremental GC, the GCLocker is active, or the
GCLocker-initiated GC has not yet been performed.
If an incremental GC doesn't succeed due to the GCLocker, and if
this happens more often than GCLockerRetryAllocationCount (=2), an OOME
is thrown.
Without critical native code, we would try to trigger a full GC until we
succeed. In that case there is just a performance issue, but not an OOME.
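
The retry behaviour described above can be modelled roughly as follows.
This is only a simplified sketch in Java, not the HotSpot code: the class,
the enum names and the allocate() helper are made up for illustration, and
the exact comparison against the retry limit is an assumption based on the
wording "more often than GCLockerRetryAllocationCount". Only stalls caused
by the GCLocker count towards the limit; stalls caused by another thread's
GC are simply retried.

```java
class GcLockerRetrySketch {
    // Default value of GCLockerRetryAllocationCount as quoted in this mail.
    static final int GC_LOCKER_RETRY_ALLOCATION_COUNT = 2;

    // Hypothetical reasons why a requested GC did not run.
    enum Stall { GC_LOCKER, OTHER_THREAD_GC }

    enum Outcome { ALLOCATED, OUT_OF_MEMORY }

    // Models one allocating thread. Each element of 'stalls' is one failed
    // attempt to get a GC started; once the stalls are used up, the GC is
    // assumed to run and to free enough memory for the allocation.
    static Outcome allocate(Stall... stalls) {
        int gcLockerRetries = 0;
        for (Stall stall : stalls) {
            if (stall == Stall.GC_LOCKER) {
                gcLockerRetries++;
                // Assumption: exceeding the limit gives up with an OOME.
                if (gcLockerRetries > GC_LOCKER_RETRY_ALLOCATION_COUNT) {
                    return Outcome.OUT_OF_MEMORY;
                }
            }
            // OTHER_THREAD_GC: no limit, just try again.
        }
        return Outcome.ALLOCATED;
    }

    public static void main(String[] args) {
        // Three GCLocker stalls in a row exceed the limit of 2.
        System.out.println(allocate(Stall.GC_LOCKER, Stall.GC_LOCKER, Stall.GC_LOCKER));
        // Without critical native code the thread keeps retrying.
        System.out.println(allocate(Stall.OTHER_THREAD_GC, Stall.OTHER_THREAD_GC,
                                    Stall.OTHER_THREAD_GC, Stall.OTHER_THREAD_GC));
    }
}
```

In this model, any number of OTHER_THREAD_GC stalls only costs time, while
the third GC_LOCKER stall ends in an OOME even though a full GC was never
attempted.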
In contrast to other GCs, the safepoint is left after an evacuation failure.
The proposed fix is to start a full GC before leaving the safepoint.
Bug:
https://bugs.openjdk.java.net/browse/JDK-8137099
Webrev:
http://cr.openjdk.java.net/~asiebenborn/8137099/webrev/
Thanks,
Axel