RFR (L): 7145569 G1: optimize nmethods scanning
Hi all, can I have some reviews for the latest version for the 7145569 patch? If possible from one person from the compiler team too. This RFR is a continuation of a thread from June [1], where John Cuthbertson proposed this change for the first time. We performed many stability and performance tests with it in the meantime, and found a bug - hence the re-review request. Summarizing the original issue: We found cases where G1 code cache scan to be the dominator of pause time. An additional problem has been that entire code cache scan has been performed by a single thread, other threads waiting for it. Since G1 can only determine that an nmethod does not need to to be scanned any more during full gc, this also resulted in linearly increasing gc times until the next full gc. The original change attempts to fix this by keeping a per-region list of nmethods (compiled code) that have references into that particular regions. The lists are updated when the state of the compiled code changes. During pause time, at normal gc, instead of scanning the entire code cache, the gc only scans the ones of regions that are affected by the evacuation. During initial mark we still scan all regions, but on a per-region basis, and during full gc the per-region lists are simply rebuilt. This change can be found at [2]. The problem with the original patch has been in removing nmethods from the regions while code cache memory is reclaimed: this update has not been synchronized with the appropriate lock (the CodeCache_lock), so it happened that these lists got corrupted. The fix for this can be found at [3]: it adds this synchronization. This fix is somewhat complicated by the fact that while holding the Patching_lock (where the original call to the deregistration was located) we cannot grab the CodeCache_lock, as this would potentially cause a deadlock. So the change remembers that we need to unregister the nmethod later, and does it later. Other changes are mostly concerned with adding or fixing asserts that fail now because we try to iterate over the oops of the nmethod later than expected. There is a webrev containing the complete patch at [4]. Testing result summary: - no performance changes - negligible overhead of the additional per region nmethod lists ("remembered sets") - very small code cache scanning times during regular gc - much more balanced code scan times during initial mark pause. There is still imbalance due to sometimes heavily biased distribution of nmethods across regions. This will probably need to be fixed in a follow-up CR. Bugs.sun.com link: http://bugs.sun.com/view_bug.do?bug_id=7145569 Testing (in addition to the one performed using the original change): Specjbb2013, JPRT, Weblogic, nashorn javascript benchmark suite with reduced code cache size, internal test suites Thanks, Thomas [1] http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-June/007591.html [2] http://cr.openjdk.java.net/~tschatzl/7145569/webrev.johnc/ [3] http://cr.openjdk.java.net/~tschatzl/7145569/webrev.1.updates/ [4] http://cr.openjdk.java.net/~tschatzl/7145569/webrev.1.complete/
participants (1)
-
Thomas Schatzl