CMS: wrong max_eden_size for check_gc_overhead_limit

Galkin, Ivan ivan.galkin at sap.com
Tue Nov 3 18:30:38 UTC 2015


Hello Jon,

CMS: You’ve recognized the patch as a correct fix and kindly offered to sponsor it. Please see my first email or https://bugs.openjdk.java.net/browse/JDK-8135318 for the patch.
ParallelOld: I had some suspicions, but you’ve explained the code, so there is no need to patch anything.

Thank you in advance,
Ivan

From: Jon Masamitsu [mailto:jon.masamitsu at oracle.com]
Sent: Monday, 2 November 2015 23:30
To: Volker Simonis; Galkin, Ivan
Cc: hotspot-gc-dev at openjdk.java.net
Subject: Re: CMS: wrong max_eden_size for check_gc_overhead_limit

Volker and Ivan,

Sorry for dropping the ball on this.  There was a patch for CMS.
There was also some discussion about ParallelGC and a patch,
but that ParallelGC patch should not be pushed, right?

Jon


On 10/26/2015 07:17 AM, Volker Simonis wrote:

Hi Jon, Ivan,

what's the status of this bug? Are there still any changes required?

@Jon: as far as I understood, you wanted to sponsor this fix. Are
there still any changes required from your side?

Thank you and best regards,
Volker

On Mon, Sep 14, 2015 at 7:41 PM, Galkin, Ivan <ivan.galkin at sap.com> wrote:

Hello Jon,

thank you for the explanation. I was worried about the possible underflow of
the following subtraction:

          size_t max_eden_size = max_young_size -
            young_gen->from_space()->capacity_in_bytes() -
            young_gen->to_space()->capacity_in_bytes();

Is it correct that max_young_size >= young_gen->max_size() is always true?
(requires that old_gen can grow, but cannot shrink)

Otherwise there is a chance of underflow if SurvivorRatio==1.
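
For illustration, here is a minimal standalone sketch of the wrap-around I am
worried about (the sizes are made up and not taken from a real run):

// Hypothetical numbers, only to demonstrate the unsigned wrap-around.
#include <cstdio>
#include <cstddef>

int main() {
  // Assume the MinHeapFreeRatio/MaxHeapFreeRatio branch shrank max_young_size
  // below the sum of the current survivor capacities (with SurvivorRatio == 1
  // eden and both survivor spaces have the same size).
  size_t max_young_size = 2u * 1024 * 1024;  // 2M after recalculation
  size_t from_capacity  = 2u * 1024 * 1024;  // 2M
  size_t to_capacity    = 2u * 1024 * 1024;  // 2M

  // size_t is unsigned, so the subtraction wraps around to a huge value
  // instead of becoming negative:
  size_t max_eden_size = max_young_size - from_capacity - to_capacity;
  printf("max_eden_size = %zu\n", max_eden_size);
  return 0;
}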







Best regards,
Ivan

From: Jon Masamitsu [mailto:jon.masamitsu at oracle.com]
Sent: Friday, 11 September 2015 18:58
To: Galkin, Ivan
Cc: hotspot-gc-dev at openjdk.java.net
Subject: Re: CMS: wrong max_eden_size for check_gc_overhead_limit

On 9/10/2015 3:54 AM, Galkin, Ivan wrote:

Hello Jon,

thank you for your review and your support.

Besides CMS, the function check_gc_overhead_limit() is called only in
PSMarkSweep, PSParallelCompact and PSScavenge:

*  PSMarkSweep and PSParallelCompact use the formula “young_gen->max_size() -
young_gen->from_space()->capacity_in_bytes() -
young_gen->to_space()->capacity_in_bytes()”, which is correct because
PSYoungGen::max_size() == “size in bytes reserved for the young generation”

*  PSScavenge uses a similar calculation; however, both max_young_size and
survivor_limit are recalculated because of MinHeapFreeRatio &
MaxHeapFreeRatio

// START SNIPPET FROM PSScavenge::invoke_no_policy(), see
// hotspot/src/share/vm/gc/parallel/psScavenge.cpp

        size_t max_young_size = young_gen->max_size();
        ...
        if (MinHeapFreeRatio != 0 || MaxHeapFreeRatio != 100) {
          max_young_size = MIN2(old_gen->capacity_in_bytes() / NewRatio,
                                young_gen->max_size());
        }
        ...
        size_t survivor_limit =
          size_policy->max_survivor_size(max_young_size);
        ...
          assert(young_gen->max_size() >
            young_gen->from_space()->capacity_in_bytes() +
            young_gen->to_space()->capacity_in_bytes(),
            "Sizes of space in young gen are out-of-bounds");





This is a bit curious, although given the way the survivor spaces
grow and shrink with ParallelGC, I can imagine that it is
there to catch really bad calculations of the survivor
sizes.  If the survivors grow larger than the current
eden size, it is not necessarily a bug, it's just not
useful (part of the survivor space would simply never
be used).  It's a fuzzy assertion but a safe one.
See my comment below on your alternative.



        ...
          size_t max_eden_size = max_young_size -
            young_gen->from_space()->capacity_in_bytes() -
            young_gen->to_space()->capacity_in_bytes();

The survivor spaces can shrink (in principle down to
1 page), so I don't think survivor_limit (which is their maximum
size relative to some young gen size) would give a maximum
eden size.

// END SNIPPET FROM PSScavenge::invoke_no_policy()
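
To put numbers on that (the sizes below are made up): if the survivors have
shrunk well below survivor_limit, eden may legitimately be much larger than
max_young_size - 2 * survivor_limit would suggest:

// Made-up numbers illustrating why 2 * survivor_limit overstates
// the space actually taken by the survivors.
#include <cstdio>
#include <cstddef>

int main() {
  size_t max_young_size = 64u * 1024 * 1024;   // 64M
  size_t survivor_limit = max_young_size / 3;  // ~21M upper bound per space
  size_t from_capacity  = 1u * 1024 * 1024;    // survivors shrunk to 1M each
  size_t to_capacity    = 1u * 1024 * 1024;

  size_t eden_possible   = max_young_size - from_capacity - to_capacity; // 62M
  size_t eden_from_limit = max_young_size - 2 * survivor_limit;          // ~21M
  printf("possible eden: %zu, limit-based eden: %zu\n",
         eden_possible, eden_from_limit);
  return 0;
}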







I’m not able to say whether there is an error in the calculation of
max_eden_size; however, I’m wondering a) why the recalculated survivor_limit
is not used and b) what the marked assertion is good for.

In my opinion the calculation should look as follows. I didn’t test the
change; it’s only deduced, so every comment would be welcome.







diff -r f74b3ce62e1f src/share/vm/gc/parallel/psScavenge.cpp
--- a/src/share/vm/gc/parallel/psScavenge.cpp Fri Sep 04 17:33:56 2015 -0700
+++ b/src/share/vm/gc/parallel/psScavenge.cpp Thu Sep 10 12:30:19 2015 +0200
@@ -560,18 +560,14 @@
         if (UseAdaptiveGenerationSizePolicyAtMinorCollection &&
             (AdaptiveSizePolicy::should_update_eden_stats(gc_cause))) {
           // Calculate optimal free space amounts
-          assert(young_gen->max_size() >
-            young_gen->from_space()->capacity_in_bytes() +
-            young_gen->to_space()->capacity_in_bytes(),
+          assert(max_young_size > 2 * survivor_limit,

survivor_limit is calculated based on max_young_size

535         size_t survivor_limit =
536           size_policy->max_survivor_size(max_young_size);

so "max_young_size > 2*survivor_limit" shouldn't fail
unless MinSurvivorRatio is broken.  Do you agree?
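
For what it's worth, here is a standalone sketch of that reduction. It
assumes max_survivor_size() is essentially gen_size / MinSurvivorRatio
rounded down (my reading of AdaptiveSizePolicy); the sizes are arbitrary:

// Sketch only: assumes survivor_limit ~ max_young_size / MinSurvivorRatio.
#include <cassert>
#include <cstddef>

int main() {
  const size_t MinSurvivorRatio = 3;          // current default value
  size_t max_young_size = 64u * 1024 * 1024;  // arbitrary example size
  size_t survivor_limit = max_young_size / MinSurvivorRatio;
  // Holds for any positive max_young_size as long as MinSurvivorRatio >= 3;
  // it could only fail if MinSurvivorRatio were forced below 3.
  assert(max_young_size > 2 * survivor_limit);
  return 0;
}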



I'd be willing to delete the assertion if you like.

Jon





             "Sizes of space in young gen are out-of-bounds");



           size_t young_live = young_gen->used_in_bytes();
           size_t eden_live = young_gen->eden_space()->used_in_bytes();
           size_t cur_eden = young_gen->eden_space()->capacity_in_bytes();
           size_t max_old_gen_size = old_gen->max_gen_size();
-          size_t max_eden_size = max_young_size -
-            young_gen->from_space()->capacity_in_bytes() -
-            young_gen->to_space()->capacity_in_bytes();
+          size_t max_eden_size = max_young_size - 2 * survivor_limit;
           // Used for diagnostics
           size_policy->clear_generation_free_space_flags();







Best regards,
Ivan

P.S. Your questions about the contributor agreement and a bug report were
answered by my colleague Volker Simonis.







From: hotspot-gc-dev [mailto:hotspot-gc-dev-bounces at openjdk.java.net] On
Behalf Of Jon Masamitsu
Sent: Wednesday, 9 September 2015 18:22
To: hotspot-gc-dev at openjdk.java.net
Subject: Re: CMS: wrong max_eden_size for check_gc_overhead_limit

Ivan,

You're correct about this bug.

Is it correct that you're covered by the SAP contributor
agreement?

Have you filed a bug report on this?

Have you checked the other GCs for this problem?

I can sponsor this fix.

Thanks.

Jon







On 9/8/2015 2:21 AM, Galkin, Ivan wrote:



Hello all,

I believe the calculation of max_eden_size, which is needed to check the GC
overhead in CMS, is incorrect. Namely, in concurrentMarkSweepGeneration.cpp
the following expression is used:

size_t max_eden_size = _young_gen->max_capacity() -
_young_gen->to()->capacity() - _young_gen->from()->capacity();

According to the implementation of DefNewGeneration::max_capacity() the
expression can be unfolded as:

                size_t max_eden_size = ("reserved size of young gen" - "size
of survivor space") - "size of survivor space" - "size of survivor space";

So the value becomes too small (the survivor spaces are accounted for 3
times!), which can lead to the following problems:

1.  max_eden_size is too small and GC overhead is wrongfully recognized
too early

2.  max_eden_size == 0 (all young spaces have the same size;
-XX:SurvivorRatio=1) and the GC overhead limit is never triggered (see the
implementation of AdaptiveSizePolicy::check_gc_overhead_limit:

a.  max_eden_size == 0 leads to mem_free_eden_limit == 0;

b.  free_in_eden < mem_free_eden_limit is always false, since both are
unsigned integers and mem_free_eden_limit is 0)
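
To make the triple counting concrete, here is a small standalone sketch with
made-up sizes (SurvivorRatio == 1, so eden and both survivor spaces are equal):

// Made-up sizes, only to illustrate the triple counting described above.
#include <cstdio>
#include <cstddef>

int main() {
  size_t reserved = 12u * 1024 * 1024;  // reserved young gen: 12M
  size_t survivor =  4u * 1024 * 1024;  // each space 4M (SurvivorRatio == 1)

  // DefNewGeneration::max_capacity() == reserved minus one survivor space
  size_t max_capacity = reserved - survivor;                  // 8M

  // the CMS expression subtracts both survivor spaces on top of that:
  size_t max_eden_size = max_capacity - survivor - survivor;  // 0!

  // With max_eden_size == 0, mem_free_eden_limit == 0 as well, and
  // "free_in_eden < mem_free_eden_limit" can never be true for unsigned
  // values, so the overhead limit is never hit.
  printf("max_eden_size = %zu (expected %zu)\n",
         max_eden_size, reserved - 2 * survivor);             // 0 vs 4M
  return 0;
}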







I would therefore suggest the following fix (DefNewGeneration::max_eden_size()
already contains the correctly calculated capacity of eden):

diff -r f74b3ce62e1f src/share/vm/gc/cms/concurrentMarkSweepGeneration.cpp
--- a/src/share/vm/gc/cms/concurrentMarkSweepGeneration.cpp Fri Sep 04 17:33:56 2015 -0700
+++ b/src/share/vm/gc/cms/concurrentMarkSweepGeneration.cpp Mon Sep 07 18:08:39 2015 +0200
@@ -1563,9 +1563,7 @@
   do_compaction_work(clear_all_soft_refs);

   // Has the GC time limit been exceeded?
-  size_t max_eden_size = _young_gen->max_capacity() -
-                         _young_gen->to()->capacity() -
-                         _young_gen->from()->capacity();
+  size_t max_eden_size = _young_gen->max_eden_size();
   GCCause::Cause gc_cause = gch->gc_cause();
   size_policy()->check_gc_overhead_limit(_young_gen->used(),
                                          _young_gen->eden()->used(),



Could anybody please review and sponsor the change?

Thank you in advance,
Ivan