RFR(S): 7185699: G1: Prediction model discrepancies
John Cuthbertson
john.cuthbertson at oracle.com
Wed Jul 25 18:39:32 UTC 2012
Hi Thomas,
On 7/25/2012 2:14 AM, Thomas Schatzl wrote:
> here are some notes, although I am not an official reviewer...
You don't need to be an official reviewer to review the code and send
feedback. A review from an official reviewer is needed in order to push
the code. Since you don't (yet) have an OpenJDK username I can't list
you as a reviewer - but your feedback is welcome.
> G1CollectedHeap::max_pending_card_num() just below of the fixed
> pending_card_num() seems to have the same issue.
>
Thanks. Good catch (but see below).
> I also think G1CollectorPolicy::predict_pending_cards() and
> G1CollectorPolicy::predict_pending_card_diff() can be removed, they do
> not seem to be referenced anywhere in the code.
>
Yes. They don't appear to be referenced anywhere, so I've removed them.
That in turn left some other routines, including max_pending_card_num(),
and some fields unused. I've removed those too.
>> As a result we were vastly overestimating
>> the base pause time. The second was in the code that calculates the GC
> I think the effects mostly cancelled themselves out long term due to the
> way the prediction is calculated: the time prediction is based on a
> prediction of pending_cards times a cost prediction per card.
> Pending_cards is too high by the oopSize as per your patch, but the cost
> prediction per card has been derived from a value based on the rset
> update time divided by the number of pending cards, which was too high
> by that same factor.
>
> (I haven't measured anything)
OK. Your analysis makes sense. In the instrumentation I added (and have
since removed), however, I see much smaller values for base_time_ms, but
the tests I ran don't come anywhere near "long term".
>> efficiency of a non-young region. The GC efficiency of a region is
>> calculated at the end of marking. The code, however, was using the wrong
>> routine to predict the number of cards for a given RSet length. It was
>> using the data collected for young GCs when it should have been using
>> the data collected for mixed GCs.
> I have one question about what may be a mix of predictors from
> different collection types. In particular, during mixed collections the
> predict_region_elapsed_time_ms() method is also called to predict the
> evacuation time of young generation regions, with the "young" parameter
> always set to true.
>
> Looking at the code (copied below for convenience), during that time the
> number of cards prediction for young gen regions is based on the
> prediction of mixed collection card lengths (as expected), but at the
> same time, the "other" time prediction is based on the time prediction
> for the young collections.
>
> This seems odd, because I'd expect both prediction components to be
> based on the same prior decision at all times.
>
> Maybe you could improve these predictions for the young gen regions by
> always basing them on the young collection predictors. This assumes
> that young gen region timing is independent of the type of collection -
> but at least it would then be consistent for both the card length and
> the "other" time.
> Wouldn't it be better to base the selection of predictions exclusively
> on the "young" parameter, and eventually pass gcs_are_young() as value
> for the "young" parameter appropriately?
>
> This would also automatically remove the need for the inlined code in
> HeapRegion::calc_gc_efficiency, which seems just a source for errors. I
> doubt anybody reads the "synchronize me" comments in heapRegion.cpp when
> changing the prediction code in g1CollectorPolicy.cpp. Additionally this
> calculation seems to be some policy that would seem perfect for putting
> the method into G1CollectorPolicy.
> At least then the "synchronize me" comment could be in the same file, or
> even factored out into a single method.
Good observation. I'm not an expert in this code but I believe the
following code is the reason:
  double young_other_time_ms = 0.0;
  if (young_cset_region_length() > 0) {
    young_other_time_ms =
      phase_times()->_recorded_young_cset_choice_time_ms +
      phase_times()->_recorded_young_free_cset_time_ms;
    _young_other_cost_per_region_ms_seq->add(young_other_time_ms /
                                             (double) young_cset_region_length());
  }

  double non_young_other_time_ms = 0.0;
  if (old_cset_region_length() > 0) {
    non_young_other_time_ms =
      phase_times()->_recorded_non_young_cset_choice_time_ms +
      phase_times()->_recorded_non_young_free_cset_time_ms;
    _non_young_other_cost_per_region_ms_seq->add(non_young_other_time_ms /
                                                 (double) old_cset_region_length());
  }

  double constant_other_time_ms = all_other_time_ms -
    (young_other_time_ms + non_young_other_time_ms);
  _constant_other_time_ms_seq->add(constant_other_time_ms);
It looks like we're keeping track of how long it takes to add a region
to the collection set and how long it takes to free a region in the
collection set (based upon the region type). Haven't you shown that
freeing old regions during mixed GCs takes a lot longer than freeing
young regions? So choosing the other time for a region (vs. the base
other time for the pause) is a function of the region's type and not the
GC type. Though I'm not sure why we pass it as a parameter. I'll play
around with refactoring the code to eliminate inlining the code into
calc_gc_efficiency().
Thanks,
JohnC