CMS: wrong max_eden_size for check_gc_overhead_limit

Jon Masamitsu jon.masamitsu at oracle.com
Mon Nov 2 22:29:30 UTC 2015


Volker and Ivan,

Sorry for dropping the ball on this.  There was a patch for CMS.
There was also some discussion about ParallelGC and a patch,
but that ParallelGC patch should not be pushed, right?

Jon



On 10/26/2015 07:17 AM, Volker Simonis wrote:
> Hi Jon, Ivan,
>
> what's the status of this bug?
>
> Are there still any changes required?
>
> @Jon: as far as I understood, you wanted to sponsor this fix. Are
> there still any changes required from your side?
>
> Thank you and best regards,
> Volker
>
>
> On Mon, Sep 14, 2015 at 7:41 PM, Galkin, Ivan <ivan.galkin at sap.com> wrote:
>> Hello Jon,
>>
>>
>>
>> thank you for the explanation. I was worried about the possible underflow of
>> the following subtraction:
>>
>>
>>
>>            size_t max_eden_size = max_young_size -
>>              young_gen->from_space()->capacity_in_bytes() -
>>              young_gen->to_space()->capacity_in_bytes();
>>
>>
>>
>> Is it correct that max_young_size >= young_gen->max_size() is always true?
>> (requires that old_gen can grow, but cannot shrink)
>>
>> Otherwise there is a chance of underflow if SurvivorRatio==1.
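[The unsigned underflow Ivan is worried about is easy to demonstrate in isolation. A minimal sketch with made-up sizes, not the HotSpot code itself: if the two survivor capacities together exceed max_young_size, the size_t subtraction wraps to a huge value instead of going negative.]

```cpp
#include <cassert>
#include <cstddef>

// Mirrors the shape of the subtraction in PSScavenge::invoke_no_policy().
// All sizes are hypothetical; size_t arithmetic wraps on underflow.
size_t eden_after_subtraction(size_t max_young, size_t from_cap, size_t to_cap) {
    return max_young - from_cap - to_cap;
}
```

With max_young = 64 and from = to = 33 the result is not -2 but SIZE_MAX - 1, so a downstream check sees an enormous eden rather than failing.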
>>
>>
>>
>> Best regards,
>>
>> Ivan
>>
>>
>>
>> From: Jon Masamitsu [mailto:jon.masamitsu at oracle.com]
>> Sent: Freitag, 11. September 2015 18:58
>> To: Galkin, Ivan
>> Cc: hotspot-gc-dev at openjdk.java.net
>>
>>
>> Subject: Re: CMS: wrong max_eden_size for check_gc_overhead_limit
>>
>>
>>
>>
>>
>> On 9/10/2015 3:54 AM, Galkin, Ivan wrote:
>>
>> Hello Jon,
>>
>>
>>
>> thank you for your review and your support.
>>
>>
>>
>> Besides CMS the function check_gc_overhead_limit() is called only in
>> PSMarkSweep, PSParallelCompact and PSScavenge:
>>
>>
>>
>> *  PSMarkSweep and PSParallelCompact use the formula “young_gen->max_size() -
>> young_gen->from_space()->capacity_in_bytes() -
>> young_gen->to_space()->capacity_in_bytes()”, which is correct because
>> PSYoungGen::max_size() == “size in bytes reserved for young generation”
>>
>>
>>
>> *  PSScavenge uses a similar calculation; however, both max_young_size and
>> survivor_limit are recalculated because of MinHeapFreeRatio &
>> MaxHeapFreeRatio
>>
>>
>>
>> // START SNIPPET FROM PSScavenge::invoke_no_policy(), see
>> hotspot/src/share/vm/gc/parallel/psScavenge.cpp
>>
>>          size_t max_young_size = young_gen->max_size();
>>          ...
>>          if (MinHeapFreeRatio != 0 || MaxHeapFreeRatio != 100) {
>>            max_young_size = MIN2(old_gen->capacity_in_bytes() / NewRatio,
>>                                  young_gen->max_size());
>>          }
>>          ...
>>          size_t survivor_limit =
>>            size_policy->max_survivor_size(max_young_size);
>>          ...
>>            assert(young_gen->max_size() >
>>              young_gen->from_space()->capacity_in_bytes() +
>>              young_gen->to_space()->capacity_in_bytes(),
>>              "Sizes of space in young gen are out-of-bounds");
>>
>> This is a bit curious, although given the way the survivor spaces
>> grow and shrink with ParallelGC, I can imagine that it is
>> there to catch really bad calculations of the survivor
>> sizes.  If the survivors grow larger than the current
>> eden size, it is not necessarily a bug; it's just not
>> useful (part of the survivor space would just never
>> be used).  It's a fuzzy assertion, but a safe one.
>> See my comment below on your alternative.
>>
>>          ...
>>
>>            size_t max_eden_size = max_young_size -
>>              young_gen->from_space()->capacity_in_bytes() -
>>              young_gen->to_space()->capacity_in_bytes();
>>
>>
>> The survivor spaces can grow small (in principle down to
>> 1 page), so I don't think survivor_limit (which is their maximum
>> size relative to some young gen size) would give a maximum
>> eden size.
>>
>>
>> // END SNIPPET FROM PSScavenge::invoke_no_policy()
>>
>>
>>
>> I’m not able to say whether there is an error in the calculation of
>> max_eden_size; however, I’m wondering a) why the recalculated survivor_limit is
>> not used and b) what the marked assertion is good for.
>>
>> In my opinion the calculation should look as follows. I didn’t test the
>> change; it’s only deduced, so any comments would be welcome.
>>
>>
>>
>> diff -r f74b3ce62e1f src/share/vm/gc/parallel/psScavenge.cpp
>>
>> --- a/src/share/vm/gc/parallel/psScavenge.cpp  Fri Sep 04 17:33:56 2015 -0700
>> +++ b/src/share/vm/gc/parallel/psScavenge.cpp  Thu Sep 10 12:30:19 2015 +0200
>>
>> @@ -560,18 +560,14 @@
>>           if (UseAdaptiveGenerationSizePolicyAtMinorCollection &&
>>               (AdaptiveSizePolicy::should_update_eden_stats(gc_cause))) {
>>             // Calculate optimal free space amounts
>> -          assert(young_gen->max_size() >
>> -            young_gen->from_space()->capacity_in_bytes() +
>> -            young_gen->to_space()->capacity_in_bytes(),
>> +          assert(max_young_size > 2 * survivor_limit,
>>
>>
>> survivor_limit is calculated based on max_young_size
>>
>> 535         size_t survivor_limit =
>> 536           size_policy->max_survivor_size(max_young_size);
>>
>> so "max_young_size > 2*survivor_limit"  shouldn't fail
>> unless MinSurvivorRatio is broken.  Do you agree?
>>
>> I'd be willing to delete the assertion if you like.
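[Jon's claim can be checked with a toy model. This sketch assumes max_survivor_size(gen_size) is essentially gen_size / MinSurvivorRatio (the real implementation also aligns the result), which is a simplification:]

```cpp
#include <cassert>
#include <cstddef>

// Toy model: survivor_limit derived from max_young_size, assuming
// max_survivor_size(gen_size) ~= gen_size / MinSurvivorRatio.
bool assertion_holds(size_t max_young_size, size_t min_survivor_ratio) {
    size_t survivor_limit = max_young_size / min_survivor_ratio;
    return max_young_size > 2 * survivor_limit;
}
```

With the default MinSurvivorRatio of 3 the assertion holds (2/3 of max_young_size is strictly less than max_young_size); a ratio of 2 would already make it fail at the boundary, i.e. a "broken" MinSurvivorRatio.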
>>
>> Jon
>>
>>
>>               "Sizes of space in young gen are out-of-bounds");
>>             size_t young_live = young_gen->used_in_bytes();
>>             size_t eden_live = young_gen->eden_space()->used_in_bytes();
>>             size_t cur_eden = young_gen->eden_space()->capacity_in_bytes();
>>             size_t max_old_gen_size = old_gen->max_gen_size();
>> -          size_t max_eden_size = max_young_size -
>> -            young_gen->from_space()->capacity_in_bytes() -
>> -            young_gen->to_space()->capacity_in_bytes();
>> +          size_t max_eden_size = max_young_size - 2 * survivor_limit;
>>             // Used for diagnostics
>>             size_policy->clear_generation_free_space_flags();
>>
>>
>>
>> Best regards,
>>
>> Ivan
>>
>>
>>
>> P.S. Your questions about contributor agreement and a bug report were
>> answered by my colleague Volker Simonis
>>
>>
>>
>>
>>
>> From: hotspot-gc-dev [mailto:hotspot-gc-dev-bounces at openjdk.java.net] On
>> Behalf Of Jon Masamitsu
>> Sent: Mittwoch, 9. September 2015 18:22
>> To: hotspot-gc-dev at openjdk.java.net
>> Subject: Re: CMS: wrong max_eden_size for check_gc_overhead_limit
>>
>>
>>
>> Ivan,
>>
>> You're correct about this bug.
>>
>> Is it correct that you're covered by the SAP contributor
>> agreement?
>>
>> Have you filed a bug report on this?
>>
>> Have you checked the other GC's for this problem?
>>
>> I can sponsor this fix.
>>
>> Thanks.
>>
>> Jon
>>
>>
>>
>> On 9/8/2015 2:21 AM, Galkin, Ivan wrote:
>>
>> Hello all,
>>
>>
>>
>> I believe the calculation of max_eden_size which is needed to check the GC
>> overhead in CMS is incorrect. Namely in concurrentMarkSweepGeneration.cpp
>> the following expression is used:
>>
>>
>>
>> size_t max_eden_size = _young_gen->max_capacity() -
>> _young_gen->to()->capacity() - _young_gen->from()->capacity();
>>
>>
>>
>> According to the implementation of DefNewGeneration::max_capacity() the
>> expression can be unfolded as:
>>
>>
>>
>>                  size_t max_eden_size = (“reserved size of young gen” – “size
>> of survivor space”) – “size of survivor space” – “size of survivor space”;
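[With concrete (made-up) numbers and -XX:SurvivorRatio=1, the triple subtraction in the unfolded expression zeroes eden out entirely. The sketch assumes DefNewGeneration::max_eden_size() amounts to the reserved young size minus both survivor spaces:]

```cpp
#include <cassert>
#include <cstddef>

// SurvivorRatio=1: eden and each survivor space are the same size, so a
// 96 MB young gen is 32 MB eden + 32 MB from + 32 MB to (illustrative).
size_t buggy_max_eden(size_t reserved_young, size_t survivor) {
    size_t max_capacity = reserved_young - survivor;  // DefNewGeneration::max_capacity()
    return max_capacity - survivor - survivor;        // survivor subtracted 3 times total
}

size_t fixed_max_eden(size_t reserved_young, size_t survivor) {
    return reserved_young - 2 * survivor;             // what max_eden_size() should yield
}
```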
>>
>>
>>
>> So the value becomes too small (the survivor spaces are subtracted three
>> times!), which can lead to the following problems:
>>
>>
>>
>> 1.       max_eden_size is too small, and GC overhead is wrongly reported
>> too early
>>
>> 2.       max_eden_size == 0 (all young spaces have the same size;
>> -XX:SurvivorRatio=1), and GC overhead is never detected (see the
>> implementation of AdaptiveSizePolicy::check_gc_overhead_limit:
>>
>> a.       max_eden_size == 0 leads to mem_free_eden_limit == 0;
>>
>> b.      free_in_eden < mem_free_eden_limit is always false, since both are
>> unsigned integers and mem_free_eden_limit is 0)
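[Point 2 can be illustrated directly. The sketch below assumes the limit is computed as a GCHeapFreeLimit percentage of max_eden_size, as in AdaptiveSizePolicy::check_gc_overhead_limit; the parameter names are illustrative:]

```cpp
#include <cassert>
#include <cstddef>

// Sketch of the eden part of the overhead check. With max_eden_size == 0
// the limit is 0, and an unsigned value can never be < 0, so the
// "free space too small" condition can never fire.
bool eden_free_below_limit(size_t free_in_eden, size_t max_eden_size,
                           size_t gc_heap_free_limit_percent) {
    size_t mem_free_eden_limit = max_eden_size * gc_heap_free_limit_percent / 100;
    return free_in_eden < mem_free_eden_limit;
}
```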
>>
>>
>>
>> I would therefore suggest the following fix (DefNewGeneration::max_eden_size()
>> already contains the correctly calculated capacity of eden):
>>
>>
>>
>> diff -r f74b3ce62e1f src/share/vm/gc/cms/concurrentMarkSweepGeneration.cpp
>>
>> --- a/src/share/vm/gc/cms/concurrentMarkSweepGeneration.cpp  Fri Sep 04 17:33:56 2015 -0700
>> +++ b/src/share/vm/gc/cms/concurrentMarkSweepGeneration.cpp  Mon Sep 07 18:08:39 2015 +0200
>>
>> @@ -1563,9 +1563,7 @@
>>     do_compaction_work(clear_all_soft_refs);
>>
>>     // Has the GC time limit been exceeded?
>> -  size_t max_eden_size = _young_gen->max_capacity() -
>> -                         _young_gen->to()->capacity() -
>> -                         _young_gen->from()->capacity();
>> +  size_t max_eden_size = _young_gen->max_eden_size();
>>     GCCause::Cause gc_cause = gch->gc_cause();
>>     size_policy()->check_gc_overhead_limit(_young_gen->used(),
>>                                            _young_gen->eden()->used(),
>>
>>
>>
>> Could anybody please review and sponsor the change?
>>
>>
>>
>> Thank you in advance,
>>
>> Ivan
>>
>>
>>
>>
