RFR (L): 7145569 G1: optimize nmethods scanning
Thomas Schatzl
thomas.schatzl at oracle.com
Wed Aug 14 07:30:07 PDT 2013
Hi all,
can I have some reviews for the latest version for the 7145569 patch?
If possible from one person from the compiler team too.
This RFR is a continuation of a thread from June [1], where John
Cuthbertson proposed this change for the first time.
We performed many stability and performance tests with it in the
meantime, and found a bug - hence the re-review request.
Summarizing the original issue:
We found cases where G1 code cache scan to be the dominator of pause
time. An additional problem has been that entire code cache scan has
been performed by a single thread, other threads waiting for it. Since
G1 can only determine that an nmethod does not need to to be scanned any
more during full gc, this also resulted in linearly increasing gc times
until the next full gc.
The original change attempts to fix this by keeping a per-region list of
nmethods (compiled code) that have references into that particular
regions. The lists are updated when the state of the compiled code
changes. During pause time, at normal gc, instead of scanning the entire
code cache, the gc only scans the ones of regions that are affected by
the evacuation. During initial mark we still scan all regions, but on a
per-region basis, and during full gc the per-region lists are simply
rebuilt.
This change can be found at [2].
The problem with the original patch has been in removing nmethods from
the regions while code cache memory is reclaimed: this update has not
been synchronized with the appropriate lock (the CodeCache_lock), so it
happened that these lists got corrupted.
The fix for this can be found at [3]: it adds this synchronization. This
fix is somewhat complicated by the fact that while holding the
Patching_lock (where the original call to the deregistration was
located) we cannot grab the CodeCache_lock, as this would potentially
cause a deadlock.
So the change remembers that we need to unregister the nmethod later,
and does it later.
Other changes are mostly concerned with adding or fixing asserts that
fail now because we try to iterate over the oops of the nmethod later
than expected.
There is a webrev containing the complete patch at [4].
Testing result summary:
- no performance changes
- negligible overhead of the additional per region nmethod lists
("remembered sets")
- very small code cache scanning times during regular gc
- much more balanced code scan times during initial mark pause. There is
still imbalance due to sometimes heavily biased distribution of nmethods
across regions. This will probably need to be fixed in a follow-up CR.
Bugs.sun.com link:
http://bugs.sun.com/view_bug.do?bug_id=7145569
Testing (in addition to the one performed using the original change):
Specjbb2013, JPRT, Weblogic, nashorn javascript benchmark suite with
reduced code cache size, internal test suites
Thanks,
Thomas
[1]
http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-June/007591.html
[2] http://cr.openjdk.java.net/~tschatzl/7145569/webrev.johnc/
[3] http://cr.openjdk.java.net/~tschatzl/7145569/webrev.1.updates/
[4] http://cr.openjdk.java.net/~tschatzl/7145569/webrev.1.complete/
More information about the hotspot-compiler-dev
mailing list