RFR: 8048556: Unnecessary GCLocker-initiated young GCs
Kim Barrett
kim.barrett at oracle.com
Tue Jul 30 17:59:19 UTC 2019
RFR: 8048556: Unnecessary GCLocker-initiated young GCs
[Fairoz and I have agreed that I should take over this bug, in order
to take things in a somewhat different direction.]
Please review this change to the handling of GCLocker collections, to
prevent unnecessary back-to-back GCs with the second having a nearly
empty younggen to work with.
There are two parts to this change.
Part 1:
When unlocking the last JNI critical section, we record the current
number of collections for use when the associated _gc_locker
collection is considered for VMOp enqueuing.
Each of the STW collectors handles a _gc_locker cause specially,
checking that the current number of collections is the same as when
the JNI critical section ended. This allows us to discard the
_gc_locker request if there has been some other GC since the critical
section ended, eliminating a race that could otherwise result in the
_gc_locker collection immediately following some other collection. If
that test passes, it's still possible for the _gc_locker VMOp to be
discarded because of another GC VMOp being ahead of it; that's as
intended.
This change is addressing the first problem discussed in Tony's
analysis, the race after the GCLocker becomes inactive. This change
is sufficient to eliminate unnecessary _gc_locker collections
immediately following another collection for G1. It reduces, but does
not eliminate the problem for the other STW collectors.
Note that the (G1-specific) fix for JDK-7143858 is probably wrong.
The GCLocker stalling should probably only be done if
GCLocker::is_active_and_needs_gc() (like the other STW collectors),
rather than just GCLocker::needs_gc(). What was done there is, I
think, sort of an incomplete version of one of Tony's "Fix 1", and has
some similar problems, as well as being insufficient on its own, and
no longer needed with the changes here. Reconsidering JDK-7143858 can
be addressed later.
Part 2:
Fix confusion over full/nonfull collections in the non-G1 STW
collectors. (G1 doesn't have this problem.) ParallelGC and the GCH
collectors were initializing VM_GC_Operation::_full to true, always,
and then have some kludgy workaround attempts for some GCCauses.
Instead, _full should be set appropriately for the cause (making
skip_operation work properly), and used appropriately elsewhere.
There remains some confusion about "full" collections around
GTH::do_collection, but it doesn't seem to lead to unnecessary
_gc_locker collections or other problems. That can be addressed
later.
This change addresses the second problem discussed in Tony's analysis,
the treatment of _gc_locker collections as sort of full and sort of
not. This change reduced but did not eliminate unnecessary _gc_locker
collections for ParallelGC and GCH collectors. However, used in
conjunction with the Part 1 changes, they no longer seem to occur.
Running the attached stress test for 10 minutes produced none, where
the baseline without these changes might produce several hundred per
minute for the GCH collectors.
There are no changes to ZGC needed here; ZGC handles _gc_locker in an
entirely different fashion that doesn't have the described problem.
There are no changes to Shenandoah here; Shenandoah doesn't use the
GCLocker mechanism, instead using object pinning.
There are no changes to Epsilon here; Epsilon just ignores _gc_locker
GC requests.
The attached stress test is based on JNICriticalStressTest2.java from
the bug report, converted into a jtreg test. The functional test is
run as a subprocess inheriting framework supplied options, with the
associated GC log checked for GCLocker collections with small from
space sizes; the test fails if there are any. The test relies on GC
option rotation by the testing framework to provide coverage of the
various collectors.
CR:
https://bugs.openjdk.java.net/browse/JDK-8048556
Webrev:
http://cr.openjdk.java.net/~kbarrett/8048556/open.00/
Testing:
mach5 tier1-5
locally ran new test for 30 minutes with each STW collector.
More information about the hotspot-gc-dev
mailing list