RFR (M): 8137082: Factor out G1 prediction code from G1CollectorPolicy and clean up
Thomas Schatzl
thomas.schatzl at oracle.com
Thu Oct 1 14:22:27 UTC 2015
Hi Kim,
thanks for looking at this.
On Wed, 2015-09-30 at 20:37 -0400, Kim Barrett wrote:
> On Sep 24, 2015, at 9:52 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> >
> > can I have reviews for this cleanup change?
> >
> > […]
> > So, summing it up, there is no functionality change in this CR, just
> > some cleanup.
> >
> > CR:
> > https://bugs.openjdk.java.net/browse/JDK-8137082
> > Webrev:
> > http://cr.openjdk.java.net/~tschatzl/8137082/webrev/
> > Testing:
> > jprt
>
> I started to look at this change, and quickly became confused. I feel
> like I need some clarifications about the prediction's confidence
> factor before proceeding.
>
> ------------------------------------------------------------------------------
> src/share/vm/gc/g1/g1Predictions.hpp
> 35 // A function that prevents us putting too much stock in small sample
> 36 // sets. Returns a number between 2.0 and 1.0, depending on the number
> 37 // of samples. 5 or more samples yields one; fewer scales linearly from
> 38 // 2.0 at 1 sample to 1.0 at 5.
> 39 double confidence_factor(int samples) const {
>
> It seems strange to have a confidence value that decreases with more
> samples. One would think that means that it is intended to be used as
> a divisor rather than a multiplier. But then I see the following:
It's not a confidence value of the kind typically expressed in percent,
but a factor, as the name implies.
Its use explains why it decays as the number of samples grows.
>
> 52 double get_new_prediction(TruncatedSeq const* seq) const{
> 53 return MAX2(seq->davg() + _sigma * seq->dsd(),
> 54 seq->davg() * confidence_factor(seq->num()));
> 55 }
>
> and I'm confused. So having few values increases the prediction?
> That seems like an odd policy.
It makes at least some sense to me. Let's break down the values of the
two terms and their use:
seq->davg() + _sigma * seq->dsd()
is intended to give a (conservative) prediction of a new value given
the decaying average and decaying standard deviation; I think this code
wants to add some kind of safety margin that depends on the variance of
the input sequence.
The second term
seq->davg() * confidence_factor(seq->num())
exists purely to get usable estimates in the absence of many samples.
With a small number of samples this term is high, meaning that G1 is
very uncertain about them (and in fact, with a single sample the
standard deviation is 0), so the values are more conservative. From
five samples on this term is not used any more (the factor is 1.0),
i.e. G1 relies entirely on the first term.
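To make this concrete, here is a standalone sketch with made-up numbers
(sigma, davg and dsd below are purely illustrative, and std::max stands
in for HotSpot's MAX2; only the shape of the formula matches the code
above):

  #include <algorithm>
  #include <cstdio>

  // Stand-ins for TruncatedSeq::davg()/dsd()/num(); values are made up.
  static double confidence_factor(double sigma, int samples) {
    if (samples >= 5) return 1.0;
    return 1.0 + sigma * (5 - samples) / 2.0;
  }

  static double prediction(double sigma, double davg, double dsd,
                           int samples) {
    return std::max(davg + sigma * dsd,                        // first term
                    davg * confidence_factor(sigma, samples)); // second term
  }

  int main() {
    double sigma = 0.5;
    // One sample: dsd is still 0.0, so the second term (davg * 2.0) wins.
    printf("1 sample:  %.1f\n", prediction(sigma, 100.0, 0.0, 1));  // 200.0
    // Five samples: the factor is 1.0, the variance-based term wins.
    printf("5 samples: %.1f\n", prediction(sigma, 100.0, 10.0, 5)); // 105.0
    return 0;
  }

So with a single sample the prediction is simply twice the average, and
once enough samples have arrived the stddev-based term takes over.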
> Also the introduction of the _sigma "confidence factor" into the
> calculation performed by confidence_factor leads me to question the
> description of the value as being linearly along 2.0 to 1.0.
The confidence_factor() method is very poorly named, and the code
structure does not help.
If you substitute the confidence_factor() expression into the second
term of get_new_prediction() (for fewer than five samples), you will
see that it has the same shape as the first term:

  seq->davg() * confidence_factor(seq->num())
    = seq->davg() * (1.0 + _sigma * (5 - #samples) / 2.0)
    = seq->davg() + _sigma * seq->davg() * (5 - #samples) / 2.0

so the whole prediction becomes

  prediction = MAX2(
    seq->davg() + _sigma * seq->dsd(),
    seq->davg() + _sigma * seq->davg() * (5 - #samples) / 2.0
  );
This reinforces the idea that the second term is just some magic term
that gives very conservative "predictions" at startup, working around
the problem that with too few samples the stddev is not representative.
I think the reason why a multiplicative form has been chosen is that it
makes the >= 5 special case easier to check, and you do not need to
pass the sequence into it then. I rewrote this in a new webrev.
Please have a look.
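Just to double-check the substitution above, a small standalone assert
(sigma and davg are arbitrary; this is not code from the webrev):

  #include <cassert>
  #include <cmath>

  // Verify that the multiplicative form (davg * confidence_factor) and
  // the substituted additive form agree for any sample count.
  int main() {
    double sigma = 0.5;
    double davg  = 100.0;
    for (int n = 1; n <= 10; n++) {
      double factor = (n >= 5) ? 1.0 : 1.0 + sigma * (5 - n) / 2.0;
      double multiplicative = davg * factor;
      double additive = (n >= 5) ? davg
                                 : davg + sigma * davg * (5 - n) / 2.0;
      assert(std::fabs(multiplicative - additive) < 1e-9);
    }
    return 0;
  }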
(And I have no idea how the "5" and "2.0" were derived. Actually,
looking at it in more detail, the accompanying documentation is just
wrong too. Let's keep them for now as they do not seem to be
outrageously wrong.
I think it is out of scope to discuss the prediction model in this CR.
It is imo a highly worthwhile problem to spend time on, but this CR is
intended as refactoring. There is a separate CR for that, JDK-8035560.)
> I think I might have a better understanding if there were some unit
> tests that demonstrated the expected results.
I added some unit tests, but I would prefer not to spend more time than
that on unit-testing a single addition and a multiplication. Most
values used in this calculation are magic numbers, and you cannot check
them for validity anyway.
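For reference, the checks boil down to something like this (a
hypothetical standalone version of the decay checks; the actual tests
in the webrev go through the G1Predictions class):

  #include <cassert>

  static double confidence_factor(double sigma, int samples) {
    return (samples >= 5) ? 1.0 : 1.0 + sigma * (5 - samples) / 2.0;
  }

  int main() {
    double sigma = 0.5;
    // With sigma = 0.5 the factor decays linearly from 2.0 at 1 sample...
    assert(confidence_factor(sigma, 1) == 2.0);
    assert(confidence_factor(sigma, 3) == 1.5);
    // ...and stays at 1.0 from 5 samples on.
    assert(confidence_factor(sigma, 5) == 1.0);
    assert(confidence_factor(sigma, 100) == 1.0);
    return 0;
  }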
> ------------------------------------------------------------------------------
> src/share/vm/gc/g1/g1Predictions.hpp
> 31 class G1Predictions : public StackObj {
>
> File is missing #include "memory/allocation.hpp".
>
> I think this class ought to be a ValueObj rather than a StackObj.
>
> ------------------------------------------------------------------------------
> src/share/vm/gc/g1/g1Predictions.hpp
> 43 return 1.0 + _sigma * ((double)(5 - samples))/2.0;
>
> Cast to double is unnecessary.
>
> ------------------------------------------------------------------------------
> src/share/vm/gc/g1/g1Predictions.hpp
> 52 double get_new_prediction(TruncatedSeq const* seq) const{
>
> Missing space after last "const".
>
> ------------------------------------------------------------------------------
I fixed the last three issues together with Jon's concerns.
New webrevs at
http://cr.openjdk.java.net/~tschatzl/8137082/webrev.1/ (full)
http://cr.openjdk.java.net/~tschatzl/8137082/webrev.0_to_1/ (diff)
(Still running through jprt, but I do not expect any issues here.)
Thanks,
Thomas