Request for review: 8008546: G1: Wrong G1CONFIDENCEPERCENT results in GUARANTEE(VARIANCE() > -1.0) FAILED

John Cuthbertson john.cuthbertson at oracle.com
Thu Feb 21 18:50:08 UTC 2013


Hi Bengt,

Thanks for the stack trace. It helps. The counter is obviously 
_marking_step_diffs_ms and the confidence value feeds into this via 
G1CollectorPolicy::get_new_prediction(). I ran jbb2005 with the flags 
you suggest and still no failure. I'm still a little confused though. If 
we look at TruncatedSeq:add() and AbsSeq::variance():

void TruncatedSeq::add(double val) {
   AbsSeq::add(val);

   // get the oldest value in the sequence...
   double old_val = _sequence[_next];
   // ...remove it from the sum and sum of squares
   _sum -= old_val;
   _sum_of_squares -= old_val * old_val;

   // ...and update them with the new value
   _sum += val;
   _sum_of_squares += val * val;

   // now replace the old value with the new one
   _sequence[_next] = val;
   _next = (_next + 1) % _length;

   // only increase it if the buffer is not full
   if (_num < _length)
     ++_num;

   guarantee( variance() > -1.0, "variance should be >= 0" );
}

the guarantee can only trip if the result of variance() is -1.0 or less. 
Correct? What value is being returned by variance()?

Now look at AbsSeq::variance():

double AbsSeq::variance() const {
   if (_num <= 1)
     return 0.0;

   double x_bar = avg();
   double result = _sum_of_squares / total() - x_bar * x_bar;
   if (result < 0.0) {
     // due to loss-of-precision errors, the variance might be negative
     // by a small bit

     //    guarantee(-0.1 < result && result < 0.0,
     //        "if variance is negative, it should be very small");
     result = 0.0;
   }
   return result;
}

How can this return a result -1.0 or less? If result is negative then 
the return value should be 0.0 or a small magnitude negative; not -1.0 
or less. If the value of result is close enough to 0.0 to make the 
"result < 0.0" evaluate to false, it shouldn't be able to result in 
"variance() > -1.0" evaluate to false.

Vladimir: I still think your change is good but I'm not sure it's a true 
fix for the problem.

Regards,

JohnC

On 2/21/2013 1:56 AM, Bengt Rutisson wrote:
>
> John,
>
> The bug report lists a guarantee in nuberSeq.cpp. You need a 
> concurrent cycle for the issue to reproduce.
>
> I can repro it with this command line using SPECjbb2005:
>
> java -XX:+UseG1GC -XX:G1ConfidencePercent=300 -Xmx256m 
> -XX:InitiatingHeapOccupancyPercent=1 -XX:+PrintGC -cp 
> jbb.jar:check.jar spec.jbb.JBBmain
>
> Here is the stack trace from the hs_err file i get:
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error 
> (/Users/brutisso/repos/hs-gc-test-deprecated/src/share/vm/utilities/numberSeq.cpp:166), 
> pid=34072, tid=16131
> #  guarantee(variance() > -1.0) failed: variance should be >= 0
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0-b68) (build 
> 1.8.0-ea-b68)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b15-internal-jvmg 
> mixed mode bsd-amd64 compressed oops)
> # Failed to write core dump. Core dumps have been disabled. To enable 
> core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.sun.com/bugreport/crash.jsp
> #
>
> ---------------  T H R E A D  ---------------
>
> Current thread (0x00007feda9038000):  ConcurrentGCThread [stack: 
> 0x0000000129587000,0x0000000129687000] [id=16131]
>
> Stack: [0x0000000129587000,0x0000000129687000], 
> sp=0x0000000129686190,  free space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, 
> C=native code)
> V  [libjvm.dylib+0xd48e80]  VMError::report(outputStream*)+0x1262
> V  [libjvm.dylib+0xd4a57b]  VMError::report_and_die()+0x88f
> V  [libjvm.dylib+0x5328a7]  report_vm_error(char const*, int, char 
> const*, char const*)+0xc7
> V  [libjvm.dylib+0xac6dd2]  TruncatedSeq::add(double)+0x170
> V  [libjvm.dylib+0x4d389a]  CMTask::do_marking_step(double, bool, 
> bool)+0x15ba
> V  [libjvm.dylib+0x4d9d5a]  CMConcurrentMarkingTask::work(unsigned 
> int)+0x192
> V  [libjvm.dylib+0xda8c8e]  GangWorker::loop()+0x666
> V  [libjvm.dylib+0xda7fc0]  GangWorker::run()+0x3a
> V  [libjvm.dylib+0xafef34]  java_start(Thread*)+0x1d4
> C  [libsystem_c.dylib+0x14742]  _pthread_start+0x147
> C  [libsystem_c.dylib+0x1181]  thread_start+0xd
>
>
> Bengt
>
>
> On 2/20/13 9:13 PM, John Cuthbertson wrote:
>> Hi Vladimir,
>>
>> The change looks good to me. But I have a question: which counter was 
>> tripping the guarantee? I just ran jvm98 with a confidence percent 
>> value of 200 and 300 and the guarantee didn't fire for me.
>>
>> I kind of agree with Bengt that moving this kind of error checking 
>> earlier would be better. But that can be done as a separate CR.
>>
>> Thanks,
>>
>> JohnC
>>
>> On 2/20/2013 7:27 AM, vladimir kempik wrote:
>>> Hi Bengt,
>>>
>>> Thanks for looking at this!
>>>
>>> Here is an updated webrev based on your feedback:
>>>
>>> http://cr.openjdk.java.net/~mcherkas/vladimir/8008546/webrev.01/
>>>
>>> I applied what you suggested.
>>>
>>> Thanks,
>>> Vladimir.
>>>
>>> On 20.02.2013 17:54, Bengt Rutisson wrote:
>>>>
>>>> Hi Vladimir,
>>>>
>>>> This looks very similar to how we treat G1ReservePercent, so I 
>>>> think it looks good. An alternative would have been to check this 
>>>> earlier in the initialization phase and update the flag 
>>>> G1ConfidencePercent so that PrintFlagsFinal would have printed the 
>>>> actual value. But for consistency I think it looks good this way.
>>>>
>>>> I think you can change G1ConfidencePercent to be an uintx instead 
>>>> of intx (in g1_globals.hpp). In that case you don't need the second 
>>>> if statment since it can't be less than 0. It is also more 
>>>> consistent with G1ReservePercent which is an uintx.
>>>>
>>>> Thanks,
>>>> Bengt
>>>>
>>>> On 2/20/13 2:31 PM, vladimir kempik wrote:
>>>>> Hi all,
>>>>>
>>>>> Could I have a couple of reviews for this change?
>>>>>
>>>>> http://cr.openjdk.java.net/~mcherkas/vladimir/8008546/webrev.00/
>>>>>
>>>>> Input value for G1CONFIDENCEPERCENT wasn't checked before using. 
>>>>> This results in crash sometimes if -XX:+UseG1GC 
>>>>> -XX:G1ConfidencePercent=200 flags are used. Now checking the value 
>>>>> same way as it was done for G1ReservePercent. Increase to 0 if 
>>>>> negative, decrease to 100 if more than 100.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>
>>>
>>
>




More information about the hotspot-gc-dev mailing list