CRR (M): 7132029: G1: mixed GC phase lasts for longer than it should

Wed Feb 15 15:19:21 UTC 2012

Hi all,

Thanks to John and Bengt for the super prompt code reviews! The latest 
webrev is here:

http://cr.openjdk.java.net/~tonyp/7132029/webrev.1/webrev.all/

The changes based on John's comments are here:

http://cr.openjdk.java.net/~tonyp/7132029/webrev.1/webrev.1.G1ContMixedJohn/

The changes based on Bengt's comments are here:

http://cr.openjdk.java.net/~tonyp/7132029/webrev.1/webrev.2.G1ContMixedBengt/

Bengt,

Quick follow-up on the duplicated ergo messages... I simplified the code 
a bit and now next_gc_should_be_mixed() generates all the ergo messages 
with the correct action string in each case. Before, the messages looked 
like this:

  4.046: [G1Ergonomics (Mixed GCs) next GC should be mixed, reason: 
appropriate old regions available, phase: end of young GC, remaining old 
regions: 15 regions, reclaimable: 9791544 bytes (29.18 %), threshold: 
1.00 %]
  4.046: [G1Ergonomics (Mixed GCs) start mixed GCs, reason: appropriate 
old regions available]
  22M->17M(32M), 0.0309070 secs]
...
  4.204: [G1Ergonomics (Mixed GCs) next GC should be mixed, reason: 
appropriate old regions available, phase: end of mixed GC, remaining old 
regions: 11 regions, reclaimable: 5725080 bytes (17.06 %), threshold: 
1.00 %]
  4.204: [G1Ergonomics (Mixed GCs) do not end mixed GCs, reason: 
appropriate old regions available]
  22M->14M(32M), 0.0369350 secs]
...
  4.597: [G1Ergonomics (Mixed GCs) next GC should not be mixed, reason: 
reclaimable percentage lower than threshold, phase: end of mixed GC, 
remaining old regions: 3 regions, reclaimable: 328208 bytes (0.98 %), 
threshold: 1.00 %]
  4.597: [G1Ergonomics (Mixed GCs) end mixed GCs, reason: appropriate 
old regions not available]
  17M->10M(32M), 0.0523420 secs]

Now, they look like this (I also changed the output a bit, I didn't like 
the negative "do not end mixed GCs" if in the case where we carried on 
doing mixed GCs; and I replaced "appropriate" with "candidate"):

  4.131: [G1Ergonomics (Mixed GCs) start mixed GCs, reason: candidate 
old regions available, candidate old regions: 13 regions, reclaimable: 
9528680 bytes (28.40 %), threshold: 1.00 %]
...
  4.348: [G1Ergonomics (Mixed GCs) continue mixed GCs, reason: candidate 
old regions available, candidate old regions: 9 regions, reclaimable: 
5463560 bytes (16.28 %), threshold: 1.00 %]
  23M->15M(32M), 0.0320210 secs]
...
  4.781: [G1Ergonomics (Mixed GCs) do not continue mixed GCs, reason: 
reclaimable percentage lower than threshold, candidate old regions: 1 
regions, reclaimable: 52464 bytes (0.16 %), threshold: 1.00 %]

I had to do a minor change in the ergo output framework to allow the 
action string to be a variable instead of a hard-coded string, but it 
was pretty pretty trivial. Thanks for the suggestion to simplify this.

Tony

On 02/15/2012 08:47 AM, Tony Printezis wrote:
> Bengt,
>
> Thanks for looking at this at such short notice! Inline.
>
> On 02/15/2012 03:46 AM, Bengt Rutisson wrote:
>>
>> Hi Tony,
>>
>> I really like this cleanup! Thanks for fixing it.
>>
>> Some comments:
>>
>> collectionSetChooser.hpp:
>>
>> _numMarkedRegions is kind of a confusing name to me. It does not 
>> contain all the marked regions. 
>
> Well, I left it as it was. And it's also not called 
> _numAllMarkedRegions. :-) How about I just call it _length (and 
> _curMarkedIndex -> _curr_index)?
>
>> It contains all the old regions that we would like to include in the 
>> collection set. How about _num_selected_regions or even 
>> _num_selected_old_regions?
>>
>> I think the remove(HeapRegion* hr) method is kind of strange. I 
>> understand the need for peek() from finalize_cset(), but I think it 
>> is strange that the remove() dependes on the code first calling 
>> peek(). I'd prefer if we could make this more iterator style or at 
>> least change remove() to someting like pop() that will return the 
>> next HeapRegion instead of taking it as a parameter.
>
> Well, peek() and remove() are only called from the loop in 
> finalize_cset(), so they are tailored for that. I originally had 
> remove() with no parameters that just removed the current region. I 
> then thought of passing the region as a parameter mainly to do sanity 
> testing (i.e., the region that it's being removed is the one we think 
> it should be removed). And I then realized that, given we already have 
> the region, there's no point in actually re-reading it from the array. 
> Si we . I could generalize remove() to accept hr optionally, but given 
> that we only use it when we know what the region to be removed is, I 
> don't see the point. Also, I'm not sure that using HeapRegion* pop() 
> is appropriate either, given that the caller doesn't need to be given 
> the region that just got removed (it already knows which one it is).
>
> Also note that the peek() / remove() pattern is similar to the 
> ListIterator class in Java which has a E next() (equivalent to peek()) 
> and void remove(). I just added extra information to the remove() 
> method. (What I'm trying to say is: there's precedence to such a 
> pattern.)
>
>>
>> collectionSetChooser.cpp:
>>
>> Why is the method CollectionSetChooser::updateAfterFullCollection() 
>> needed? Can't G1CollectorPolicy::record_full_collection_end() do 
>> _collectionSetChooser->clearMarkedHeapRegions() just like 
>> G1CollectorPolicy::record_concurrent_mark_cleanup_end() does?
>
> I'm OK either way. Maybe we used to do more in 
> updateAfterFullCollection() once upon the time but we don't do any 
> more. I'll remove it and replace it with a direct call to 
> clearMarkedHeapRegions() as you suggested.
>
>> g1_globals.hpp:
>>
>> It seems to me that at least the G1MaxMixedGCNum and the 
>> G1OldCSetRegionThresholdPercent can be things that end users might 
>> need to tweak until we have the heuristics more thoroughly tested 
>> out. Would it be ok to make them available in product builds? Maybe 
>> as experiemental flags?
>
> Actually, it might be helpful to expose all four develop flags I added 
> to the user so that they can control different aspects of the heuristics:
>
> a) G1OldCSetRegionLiveThresholdPercent and 
> G1OldReclaimableThresholdPercent: control how much free space to 
> "sacrifice" in order to decrease mixed GC times (BTW, it'd be nice if 
> we could somehow combine these two.)
>
> b) G1MaxMixedGCNum: control how promptly G1 will collect the candidate 
> old regions: more promptly -> more free space is made available 
> quicker, less promptly -> each mixed GC might be shorter.
>
> c) G1OldCSetRegionThresholdPercent: control how aggressive G1 will be 
> in collecting old regions during each mixed GC: more aggressive -> 
> more free space is made available quicker, less promptly -> each mixed 
> GC might be shorter.
>
> (come to think about it, maybe we should think of combining the last 
> two too!)
>
> But before we expose these we need to a) get a better understanding of 
> what effect each has on G1's performance and b) find better names for 
> them. :-) (I didn't try very hard, as I assumed that they will be 
> develop parameters.)
>
>> g1CollectorPolicy.cpp:
>>
>>       if (next_gc_should_be_mixed("end of young GC")) {
>>         ergo_verbose0(ErgoMixedGCs,
>>                       "start mixed GCs",
>>                       ergo_format_reason("appropriate old regions 
>> available"));
>>         set_gcs_are_young(false);
>>       } else {
>>         ergo_verbose0(ErgoMixedGCs,
>>                       "do not start mixed GCs",
>>                       ergo_format_reason("appropriate old regions 
>> available"));
>>       }
>>
>> I think the ergo_verbose message in the else case at lines 1490-1492 
>> should say "appropriate old regions not available", right?
>
> Correct, thanks. John also caught that.
>
>> We log the ergo decisions in next_gc_should_be_mixed(). Will it not 
>> be duplicate information to log it in 
>> G1CollectorPolicy::record_collection_pause_end() as well?
>
> That's a good point. This is what the output looks like right now:
>
>  3.708: [G1Ergonomics (Mixed GCs) next GC should be mixed, reason: 
> appropriate old regions available, phase: end of young GC, remaining 
> old regions: 16 regions, reclaimable: 9803640 bytes (29.22 %), 
> threshold: 1.00 %]
>  3.708: [G1Ergonomics (Mixed GCs) start mixed GCs, reason: appropriate 
> old regions available]
>  23M->18M(32M), 0.0411940 secs]
>
> ...
>
>  3.938: [G1Ergonomics (Mixed GCs) next GC should be mixed, reason: 
> appropriate old regions available, phase: end of mixed GC, remaining 
> old regions: 12 regions, reclaimable: 5823976 bytes (17.36 %), 
> threshold: 1.00 %]
>  3.938: [G1Ergonomics (Mixed GCs) do not end mixed GCs, reason: 
> appropriate old regions available]
>  23M->15M(32M), 0.0322850 secs]
>
> I'll see how easy will be to parametarize next_gc_should_be_mixed() 
> with the "do" / "do not" action strings (instead of the phase_str).
>
> I'll send out a new webrev in a bit.
>
> Tony
>
>> Bengt
>>
>>
>> On 2012-02-14 16:18, Tony Printezis wrote:
>>> Hi all,
>>>
>>> I'd like a couple of (quick please!) code reviews for this change:
>>>
>>> http://cr.openjdk.java.net/~tonyp/7132029/webrev.0/
>>>
>>> The policy that was choosing old regions for collection during mixed 
>>> GCs was buggy and it could exhibit many strange behaviors, one of 
>>> which was to go into mixed GC mode, do "mixed GCs" (which they did 
>>> not actually collect any old regions), and not get out of it for a 
>>> while while preventing subsequent marking cycles not to start. This 
>>> frequently caused evac failures and Full GCs.
>>>
>>> The logic on which old regions to add to the CSet was spread to many 
>>> places. I simplified that and put it mainly in the loop in the 
>>> finalize_cset() method (renamed from choose_collection_set(), 
>>> "finalize" is more descriptive on what the method does). I think the 
>>> new version is easier to follow.
>>>
>>> Description of the changes:
>>>
>>> * I have introduced min and max number of old regions to be added to 
>>> the CSet of each mixed GC. Max number is calculated as a percentage 
>>> over the heap size (default: 10% - thanks to Jesper for suggesting 
>>> to use a heap percentage for this) and ensures that collections will 
>>> not get super long. Min number is calculated based on a desired 
>>> mixed GC num after a marking cycle (default: 4) and ensures that 
>>> each mixed GC will make some progress in collecting old regions (so 
>>> that the candidate old regions are collected in a timely fashion).
>>>
>>> * We now don't add any regions with live percentage over a threshold 
>>> (default: 95%) to the CSet chooser and we do not consider them for 
>>> collection.
>>>
>>> * I stopped using the cache in the CSet chooser class (it was used 
>>> to resort regions according to their latest GC efficiency), since 
>>> it's not necessary any more: we'll now go through all the candidate 
>>> old regions in the CSet chooser class and we don't have a heuristic 
>>> of when to stop mixed GCs based on GC efficiency.
>>>
>>> * I introduced the notion of "reclaimable bytes" on HeapRegions, 
>>> which not only includes the predicted garbage bytes on each region, 
>>> but also the unused space in the area [top,end) which will also be 
>>> reclaimed. The CSet chooser class now keeps track of the total 
>>> reclaimable bytes of all the regions that it tracks. If that falls 
>>> under a certain threshold (default: 1% of the heap) we stop doing 
>>> mixed GCs as we'll reclaim very little out of the remaining 
>>> candidate old regions.
>>>
>>> * I eliminated the case where a mixed GC starts and picks no old 
>>> regions to collect (I hope!). Now the information on the CSet 
>>> chooser (remaining region num / remaining reclaimable bytes) can 
>>> tell us whether we want to do more mixed GCs or not. If we do, it's 
>>> guaranteed that there will be old regions to collect. Because of 
>>> that, I removed the _should_revert_to_young_gcs flag as it's not 
>>> needed any more. It was used so that the CSet choice code could flag 
>>> that we should stop doing mixed GCs at the end of the GC. Now, we 
>>> can decide that by just looking at the information on the CSet 
>>> chooser (the heuristic is encapsulated in the 
>>> next_gc_should_be_mixed()).
>>>
>>> * I also changed the policy in the case where a fixed young gen is 
>>> used. Before, we were a bit arbitrary: during mixed GCs we'd cut the 
>>> young gen size in half and fill up the rest with old regions. I 
>>> thought that trying to re-use the "desired mixed GC number" 
>>> heuristic for this made sense. So, I now add the min number of old 
>>> regions to each mixed GC (as long as we don't go over the max). I 
>>> think this is a more reasonable and less arbitrary heuristic to what 
>>> we had before and it's more consistent with the non-fixed young gen 
>>> policy. I didn't know whether I should decrease the young gen size 
>>> for each mixed GC, like how we did before, and by how much. So, I 
>>> now leave it unchanged (the user decided they want a fixed young 
>>> gen, they will now always get it!).
>>>
>>> * Updated all related ergo output to print the new information.
>>>
>>> Like the marking changes, this change leaves a fair amount of unused 
>>> code that we can remove. I had to draw the line somewhere on how 
>>> much I should remove now and how much I should leave for later. I 
>>> opened a new CR for the remaining cleanup:
>>>
>>> 7145441: G1: collection set chooser-related cleanup
>>>
>>> Many thanks to Charlie for doing some last-minute performance 
>>> testing with my workspace. He was the one who discovered the problem 
>>> with the never-ending mixed GCs and this fix eliminated the problem. 
>>> Correctness-testing-wise, I've run overnight with the GC test suite.
>>>
>>> Thanks,
>>>
>>> Tony
>>>
>>>
>>