Request for review (m) - 8008508: CMS does not correctly reduce heap size after a Full GC

Mon Mar 4 23:18:19 UTC 2013

Ramki,

I think that originally CMSGen and TenuredGen shared
nearly the same policy but didn't share enough code.

The existing CMS policy is

1) if the incremental collection failed, expand to reserved space.

else

2) use MinHeapFreeRatio to decide if generation
should be expanded and by how much

else

3) don't know how to shrink so don't

I think that the CardGeneration::compute_new_size()
(formerly the TenuredGeneration::compute_new_size())
does the same as 2) and that for a full GC  we
know how to shrink the generation and the
CardGeneration::compute_new_size() does that
when CMS does a full GC.

So for the CMS compute_new_size(), keep 1) and
replace 2) and 3) with a call to CardGeneration::compute_new_size().

I'll go back and verify that the original
CMS compute_new_size() uses MinHeapFreeRatio
the same way as CardGeneration::compute_new_size()
does.  If they are the same, I'll try to refactor it so
that they actually share more code.

A user that wants more head room in the CMS generation would
have to specify a MinHeapFreeRatio and that value would
be used by CardGeneration::compute_new_size() with my
change just as it is used in the original CMS compute_new_size().

Jon

On 03/04/13 11:02, Srinivas Ramakrishna wrote:
> KInd of.... What I am saying is that the sizing policy should be based
> on what CMS wants, not
> what CardGeneration might want. Besides how does card generation know
> what specific policy
> a concrete subclass would want. The policy should be determined by the
> "leaf class" (or suitably
> over-ridden as needed) -- in this case by CMS. I see that you have
> here decided that the policy
> provided by CardGeneration is sufficient for when a concurrent mode
> failure happens. I was not sure
> if that would necessarily be the case, and perhaps more thought is
> needed to come up with
> a good _policy_ specific for CMS, and use the _mechanics_ provided by
> CardGeneration to
> affect that change. CMS _should_ distinguish between whether a
> concurrent mode failure occurred
> or not to determine  how its heap should be sized, but it should
> probably be a policy that is
> cognizant of the historical metrics of CMS, which may or may not be
> used by or available
> to CardGeneration, and make decisions specific to CMS, which
> CardGeneration does not know about.
>
> May be I am splitting hairs here, but this is basically where
> mechanism, if common, might be
> provided by superclasses but policy by concrete subclasses (or
> suitably overridden when
> appropriate -- may be in this case you decided that when there is a
> compacting collection,
> the policy provided by the superclass is good enough; I was not sure
> if that was a deliberate
> decision -- I have not looked closely at either policy to see if it
> makes sense. Consider, for example,
> that CMS usually has larger free space needs than contiguous space
> collectors, so it makes
> sense to me that its sizing policies would be different and keep more
> free space available based
> on the application's historical behaviour, than would contiguously
> allocating collectors.)
>
> Does my concern make sense?
>
> -- ramki
>
> On Mon, Mar 4, 2013 at 10:21 AM, Jon Masamitsu<jon.masamitsu at oracle.com>  wrote:
>> Ramki,
>>
>> Thanks for taking a look.  Let's talk about the high level
>> comment before the details.
>>
>> Are you saying that the amount of desired free space in the
>> cms generation after a collection should be independent of
>> whether the collection was a concurrent collection or a
>> full STW collection?
>>
>> Jon
>>
>>
>>
>> On 03/02/13 00:42, Srinivas Ramakrishna wrote:
>>> Hi Jon --
>>>
>>> I'll give a couple of code-level comments, but there's a high-level
>>> comment at the end, which
>>> might be worth pondering over.
>>>
>>> On Tue, Feb 26, 2013 at 3:47 PM, Jon Masamitsu<jon.masamitsu at oracle.com>
>>> wrote:
>>>> I've updated the webrev for review comments (thanks, JohnCu)
>>>>
>>>> http://cr.openjdk.java.net/~jmasa/8008508/webrev.01/
>>>>
>>> concurrent MarkSweepGeneration.cpp:
>>>
>>>    949   // compute expansion delta needed for reaching desired free
>>> percentage
>>>    950   if (free_percentage<   desired_free_percentage) {
>>>    951     size_t desired_capacity = (size_t)(used() / ((double) 1 -
>>> desired_free_percentage));
>>>    952     assert(desired_capacity>= capacity(), "invalid expansion size");
>>>    953     size_t expand_bytes = MAX2(desired_capacity - capacity(),
>>> MinHeapDeltaBytes);
>>>
>>>    954     if (expand_bytes>   0) {
>>>
>>> The if test at 954 (in your changed code) will always be true because
>>> MinHeapDeltaBytes>   0
>>> and we are taking the max at line 953.
>>>
>>> On reading your shrinking code further below at lines 989-994, it
>>> struck me that there's a small
>>> existing bug in the shrink/expand code here. It's probably more of a
>>> factor in the shrink code though
>>> because we can live with more free space than we'd ideally want, but
>>> not less than we ideally want,
>>> and we certainly do not want to chatter around the desired free
>>> percentage (hence the delta).
>>>
>>> Your shrinking arm looks like this:-
>>>
>>>    989   } else {
>>>    990     size_t desired_capacity = (size_t)(used() / ((double) 1 -
>>> desired_free_percentage));
>>>    991     assert(desired_capacity<= capacity(), "invalid expansion size");
>>>    992     size_t shrink_bytes = MAX2(capacity() - desired_capacity,
>>> MinHeapDeltaBytes);
>>>    993     shrink_free_list_by(shrink_bytes);
>>>    994   }
>>>
>>> I'd modify it to:-
>>>
>>>    989   } else {
>>>    990     size_t desired_capacity = (size_t)(used() / ((double) 1 -
>>> desired_free_percentage));
>>>    991     assert(desired_capacity<= capacity(), "invalid expansion size");
>>>               size_t shrink_bytes = capacity() - desired_capacity;
>>>               // Don't shrink unless the delta is greater than the
>>> minimum shrink we want
>>>    992     if (shrink_bytes>= MinHeapDeltaBytes) {
>>>    993         shrink_free_list_by(shrink_bytes);
>>>               }
>>>    994   }
>>>
>>> Note the asymmetry between expansion and shrinking, where we expand by
>>> the min if we feel we are below the desired free, but we don't shrink
>>> even if we are above desired free unless the gap exceeds the min
>>> delta.
>>>
>>> This seems to be the correct policy.  (Of course none of that is moot
>>> until shrink_free_list_by() starts doing shrinking, but probably best
>>> to fix the policy now, so that it doesn't cause chatter later.
>>>
>>> concurrentMarkSweepGeneration.hpp:
>>>
>>> 1298   virtual void compute_new_size();
>>> 1299   void compute_new_size_free_list();
>>>
>>> I think it would be good to add a 1-line doc comment for each of the
>>> above methods.
>>>
>>> vmStructs.cpp:
>>>
>>> I think you also have to adjust the fields _capacity_at_prologue and
>>> _used_at_prologue into the superclass. I'd also move the decls up to
>>> CardGeneration, since that's how the original code is laid out.
>>>
>>> Finally, a high level comment. While I see the reason for these
>>> mechanics, I worry a little bit
>>> about the somewhat schizophrenic expansion/shrinkage policy, as it
>>> swings between one policy
>>> used when we are presumably doing well and keeping up with the
>>> application's
>>> allocation/promotion rates and not running into concurrent mode
>>> failure, but then suddenly switching
>>> to a completely different expansion/shrinking policy when we do a
>>> compaction (presumably because
>>> we had a concurrent mode failure). I'd much rather have divorced the
>>> two, by using a more uniform
>>> policy specifically for CMS, perhaps predicated by whether this was
>>> happening after a
>>> compaction or not, but keeping it separate from the policy that might
>>> be used for stop-world
>>>    kind of collectors (hence TenuredGeneraion).
>>>
>>> I'd appreciate it if you could give some thought to this final high
>>> level comment regarding the
>>> general policy approach taken here. (Also it is somewhat troubling to
>>> push policy so high up
>>> into the class hierarchy, so that every subclass must live with it;
>>> I'd much rather the policy
>>> were separated from the mechanics of what would be done when an
>>> expansion or shrinkage
>>> was needed.)
>>>
>>> -- ramki