CMS: wrong max_eden_size for check_gc_overhead_limit
Galkin, Ivan
ivan.galkin at sap.com
Tue Nov 3 18:30:38 UTC 2015
Hello Jon,
CMS: You’ve recognized the patch as a correct fix and kindly proposed to sponsor it. Please see my first email or https://bugs.openjdk.java.net/browse/JDK-8135318 for the patch.
ParallelOld: I had some suspects but you’ve explained the code so there is no need to patch anything.
Thank you in advance,
Ivan
From: Jon Masamitsu [mailto:jon.masamitsu at oracle.com]
Sent: Montag, 2. November 2015 23:30
To: Volker Simonis; Galkin, Ivan
Cc: hotspot-gc-dev at openjdk.java.net
Subject: Re: CMS: wrong max_eden_size for check_gc_overhead_limit
Volker and Ivan,
Sorry for dropping the ball on this. There was a patch for CMS.
There was also some discussion about ParallelGC and a patch
but that ParallelGC patch should not be pushed, right?
Jon
On 10/26/2015 07:17 AM, Volker Simonis wrote:
Hi Jon, Ivan,
what's the status of this bug?
Are there still any changes required?
@Jon: as far as I understood, you wanted to sponsor this fix. Are
there still any changes required from your side?
Thank you and best regards,
Volker
On Mon, Sep 14, 2015 at 7:41 PM, Galkin, Ivan <ivan.galkin at sap.com><mailto:ivan.galkin at sap.com> wrote:
Hello Jon,
thank you for the explanation. I was worried about the possible underflow of
the following subtraction:
size_t max_eden_size = max_young_size -
young_gen->from_space()->capacity_in_bytes() -
young_gen->to_space()->capacity_in_bytes();
Is it correct that max_young_size >= young_gen->max_size() is always true?
(requires that old_gen can grow, but cannot shrink)
Otherwise there is a chance of underflow if SurvivorRatio==1.
Best regards,
Ivan
From: Jon Masamitsu [mailto:jon.masamitsu at oracle.com]
Sent: Freitag, 11. September 2015 18:58
To: Galkin, Ivan
Cc: hotspot-gc-dev at openjdk.java.net<mailto:hotspot-gc-dev at openjdk.java.net>
Subject: Re: CMS: wrong max_eden_size for check_gc_overhead_limit
On 9/10/2015 3:54 AM, Galkin, Ivan wrote:
Hello Jon,
thank you for your review and your support.
Besides CMS the function check_gc_overhead_limit() is called only in
PSMarkSweep, PSParallelCompact and PSScavenge:
* PSMarkSweep, PSParallelCompact use the formula “young_gen->max_size() -
young_gen->from_space()->capacity_in_bytes() -
young_gen->to_space()->capacity_in_bytes()”, which is correct because
PSYoungGen::max_size() == “size in bytes reserved for young generation”
* PSScavenge use a similar calculation, however both max_young_size and
survivor_limit are recalculated because of MinHeapFreeRation &
MaxHeapFreeRatio
// START SNIPPET FROM PSScavenge::invoke_no_policy(), see
hotspot/src/share/vm/gc/parallel/psScavenge.cpp
size_t max_young_size = young_gen->max_size();
...
if (MinHeapFreeRatio != 0 || MaxHeapFreeRatio != 100) {
max_young_size = MIN2(old_gen->capacity_in_bytes() / NewRatio,
young_gen->max_size());
}
...
size_t survivor_limit =
size_policy->max_survivor_size(max_young_size);
...
assert(young_gen->max_size() >
young_gen->from_space()->capacity_in_bytes() +
young_gen->to_space()->capacity_in_bytes(),
"Sizes of space in young gen are out-of-bounds");
This is a bit curious although the way the survivor spaces
grow and shrink with ParallelGC, I can imagine that it is
there to catch really bad calculations of the survivor
sizes. If the survivors grow larger than the current
eden size, it is not necessarily a bug, it's just not
useful (part of the survivor space would just never
be used). It's a fuzzy assertion but a safe one.
See my comment below on your alternative.
...
size_t max_eden_size = max_young_size -
young_gen->from_space()->capacity_in_bytes() -
young_gen->to_space()->capacity_in_bytes();
The survivor spaces can grow small (in principle down to
1 page) so I don't think survivor limit (which is their maximum
size relative to some young gen size) would give a maximum
eden size.
// END SNIPPET FROM PSScavenge::invoke_no_policy()
I’m not ably to say if there is an error in the calculation of
max_eden_size, however I’m wondering a) why recalculated survivor_limit is
not used and b) what is the marked assertion good for?
In my opinion the calculation should look as following. I didn’t test the
change. It’s only deduced, so every comment would be welcome.
diff -r f74b3ce62e1f src/share/vm/gc/parallel/psScavenge.cpp
--- a/src/share/vm/gc/parallel/psScavenge.cpp Fri Sep 04 17:33:56 2015 -0700
+++ b/src/share/vm/gc/parallel/psScavenge.cpp Thu Sep 10
12:30:19 2015 +0200
@@ -560,18 +560,14 @@
if (UseAdaptiveGenerationSizePolicyAtMinorCollection &&
(AdaptiveSizePolicy::should_update_eden_stats(gc_cause))) {
// Calculate optimal free space amounts
- assert(young_gen->max_size() >
- young_gen->from_space()->capacity_in_bytes() +
- young_gen->to_space()->capacity_in_bytes(),
+ assert(max_young_size > 2 * survivor_limit,
survivor_limit is calculated based on max_young_size
535 size_t survivor_limit =
536 size_policy->max_survivor_size(max_young_size);
so "max_young_size > 2*survivor_limit" shouldn't fail
unless MinSurvivorRatio is broken. Do you agree?
I'd be willing to delete the assertion if you like.
Jon
"Sizes of space in young gen are out-of-bounds");
size_t young_live = young_gen->used_in_bytes();
size_t eden_live = young_gen->eden_space()->used_in_bytes();
size_t cur_eden = young_gen->eden_space()->capacity_in_bytes();
size_t max_old_gen_size = old_gen->max_gen_size();
- size_t max_eden_size = max_young_size -
- young_gen->from_space()->capacity_in_bytes() -
- young_gen->to_space()->capacity_in_bytes();
+ size_t max_eden_size = max_young_size - 2 * survivor_limit;
// Used for diagnostics
size_policy->clear_generation_free_space_flags();
Best regard,
Ivan
P.S. Your questions about contributor agreement and a bug report were
answered by my colleague Volker Simonis
From: hotspot-gc-dev [mailto:hotspot-gc-dev-bounces at openjdk.java.net] On
Behalf Of Jon Masamitsu
Sent: Mittwoch, 9. September 2015 18:22
To: hotspot-gc-dev at openjdk.java.net<mailto:hotspot-gc-dev at openjdk.java.net>
Subject: Re: CMS: wrong max_eden_size for check_gc_overhead_limit
Ivan,
You're correct about this bug.
Is it correct that you're covered by the SAP contributor
agreement?
Have you filed a bug report on this?
Have you checked the other GC's for this problem?
I can sponsor this fix.
Thanks.
Jon
On 9/8/2015 2:21 AM, Galkin, Ivan wrote:
Hello all,
I believe the calculation of max_eden_size which is needed to check the GC
overhead in CMS is incorrect. Namely in concurrentMarkSweepGeneration.cpp
the following expression is used:
size_t max_eden_size = _young_gen->max_capacity() -
_young_gen->to()->capacity() - _young_gen->from()->capacity();
According to the implementation of DefNewGeneration::max_capacity() the
expression can be unfolded as:
size_t max_eden_size = (“reserved size of young gen” – “size
of survivor space”) – “size of survivor space” – “size of survivor space”;
So the value becomes too small (survival spaces are accounted 3 times!),
which can lead to the following problems:
1. max_eden_size is too small and GC overhead is wrongfully recognized
too early
2. max_eden_size == 0 (all young spaces have the same size;
-XX:SurvivorRatio=1) and GC overhead is never happens (see implementation of
AdaptiveSizePolicy::check_gc_overhead_limit:
a. max_eden_size == 0 leads to mem_free_eden_limit == 0;
b. free_in_eden < mem_free_eden_limit is always false, since both are
unsigned integers and mem_free_eden_limit is 0)
I would therefore suggest the following fix (DefNewGeneration::
max_eden_size() already contains the correctly calculated capacity of eden):
diff -r f74b3ce62e1f src/share/vm/gc/cms/concurrentMarkSweepGeneration.cpp
--- a/src/share/vm/gc/cms/concurrentMarkSweepGeneration.cpp Fri Sep
04 17:33:56 2015 -0700
+++ b/src/share/vm/gc/cms/concurrentMarkSweepGeneration.cpp Mon Sep 07
18:08:39 2015 +0200
@@ -1563,9 +1563,7 @@
do_compaction_work(clear_all_soft_refs);
// Has the GC time limit been exceeded?
- size_t max_eden_size = _young_gen->max_capacity() -
- _young_gen->to()->capacity() -
- _young_gen->from()->capacity();
+ size_t max_eden_size = _young_gen->max_eden_size();
GCCause::Cause gc_cause = gch->gc_cause();
size_policy()->check_gc_overhead_limit(_young_gen->used(),
_young_gen->eden()->used(),
Could anybody please review and sponsor the change?
Thank you in advance,
Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20151103/a53ecbba/attachment.htm>
More information about the hotspot-gc-dev
mailing list