CRR (M): 7132029: G1: mixed GC phase lasts for longer than it should

Tony Printezis tony.printezis at oracle.com
Tue Feb 14 15:18:29 UTC 2012


Hi all,

I'd like a couple of (quick please!) code reviews for this change:

http://cr.openjdk.java.net/~tonyp/7132029/webrev.0/

The policy that was choosing old regions for collection during mixed GCs 
was buggy and it could exhibit many strange behaviors, one of which was 
to go into mixed GC mode, do "mixed GCs" (which they did not actually 
collect any old regions), and not get out of it for a while while 
preventing subsequent marking cycles not to start. This frequently 
caused evac failures and Full GCs.

The logic on which old regions to add to the CSet was spread to many 
places. I simplified that and put it mainly in the loop in the 
finalize_cset() method (renamed from choose_collection_set(), "finalize" 
is more descriptive on what the method does). I think the new version is 
easier to follow.

Description of the changes:

* I have introduced min and max number of old regions to be added to the 
CSet of each mixed GC. Max number is calculated as a percentage over the 
heap size (default: 10% - thanks to Jesper for suggesting to use a heap 
percentage for this) and ensures that collections will not get super 
long. Min number is calculated based on a desired mixed GC num after a 
marking cycle (default: 4) and ensures that each mixed GC will make some 
progress in collecting old regions (so that the candidate old regions 
are collected in a timely fashion).

* We now don't add any regions with live percentage over a threshold 
(default: 95%) to the CSet chooser and we do not consider them for 
collection.

* I stopped using the cache in the CSet chooser class (it was used to 
resort regions according to their latest GC efficiency), since it's not 
necessary any more: we'll now go through all the candidate old regions 
in the CSet chooser class and we don't have a heuristic of when to stop 
mixed GCs based on GC efficiency.

* I introduced the notion of "reclaimable bytes" on HeapRegions, which 
not only includes the predicted garbage bytes on each region, but also 
the unused space in the area [top,end) which will also be reclaimed. The 
CSet chooser class now keeps track of the total reclaimable bytes of all 
the regions that it tracks. If that falls under a certain threshold 
(default: 1% of the heap) we stop doing mixed GCs as we'll reclaim very 
little out of the remaining candidate old regions.

* I eliminated the case where a mixed GC starts and picks no old regions 
to collect (I hope!). Now the information on the CSet chooser (remaining 
region num / remaining reclaimable bytes) can tell us whether we want to 
do more mixed GCs or not. If we do, it's guaranteed that there will be 
old regions to collect. Because of that, I removed the 
_should_revert_to_young_gcs flag as it's not needed any more. It was 
used so that the CSet choice code could flag that we should stop doing 
mixed GCs at the end of the GC. Now, we can decide that by just looking 
at the information on the CSet chooser (the heuristic is encapsulated in 
the next_gc_should_be_mixed()).

* I also changed the policy in the case where a fixed young gen is used. 
Before, we were a bit arbitrary: during mixed GCs we'd cut the young gen 
size in half and fill up the rest with old regions. I thought that 
trying to re-use the "desired mixed GC number" heuristic for this made 
sense. So, I now add the min number of old regions to each mixed GC (as 
long as we don't go over the max). I think this is a more reasonable and 
less arbitrary heuristic to what we had before and it's more consistent 
with the non-fixed young gen policy. I didn't know whether I should 
decrease the young gen size for each mixed GC, like how we did before, 
and by how much. So, I now leave it unchanged (the user decided they 
want a fixed young gen, they will now always get it!).

* Updated all related ergo output to print the new information.

Like the marking changes, this change leaves a fair amount of unused 
code that we can remove. I had to draw the line somewhere on how much I 
should remove now and how much I should leave for later. I opened a new 
CR for the remaining cleanup:

7145441: G1: collection set chooser-related cleanup

Many thanks to Charlie for doing some last-minute performance testing 
with my workspace. He was the one who discovered the problem with the 
never-ending mixed GCs and this fix eliminated the problem. 
Correctness-testing-wise, I've run overnight with the GC test suite.

Thanks,

Tony





More information about the hotspot-gc-dev mailing list