Request for review: 8007764: Wrong initialized value of max_gc_pause_sec for an instance of class AdaptiveSizePolicy

Tao Mao tao.mao at oracle.com
Mon Feb 18 18:49:11 UTC 2013


I'll be working on it. OK?
Tao

On 2/18/2013 12:53 AM, Bengt Rutisson wrote:
> On 2/15/13 8:04 PM, Jon Masamitsu wrote:
>>
>>
>> On 02/15/13 00:06, Bengt Rutisson wrote:
>>> On 2/14/13 8:35 PM, Jon Masamitsu wrote:
>>>> I dropped the alias.
>>>>
>>>> On 2/13/2013 11:09 PM, Bengt Rutisson wrote:
>>>>>
>>>>> Jon and Tao,
>>>>>
>>>>> On 2/13/13 6:41 PM, Jon Masamitsu wrote:
>>>>>>
>>>>>> On 2/13/2013 6:13 AM, Bengt Rutisson wrote:
>>>>>>> Hi Tao,
>>>>>>>
>>>>>>> On 2/12/13 11:24 PM, Tao Mao wrote:
>>>>>>> > Hi Bengt,
>>>>>>> > Above all, please confirm with your final say about the change.
>>>>>>>
>>>>>>> I think you are correct that this change won't have any effect. 
>>>>>>> But in that
>>>>>>> case, is it still worth doing?
>>>>>>
>>>>>> Given the choices of fixing it now or adding it to the list for a 
>>>>>> general clean up
>>>>>> later, I would prefer that it be fixed now.  It's a small change 
>>>>>> and Tao already
>>>>>> has a fix.  Considering the small additional effort, why not push 
>>>>>> it?
>>>>>
>>>>> I don't want to block this change, but I just don't know how to 
>>>>> review it. I am not convinced that this is an improvement. Since I 
>>>>> don't know what the intended use case for 
>>>>> GenCollectorPolicy::_size_policy is I don't know for sure that it 
>>>>> is better to use MaxGCPauseMillis than MaxGCMinorPauseMillis. The 
>>>>> initialize_size_policy() method has always (in a mercurial sense) 
>>>>> been using MaxGCMinorPauseMillis. Maybe someone thought that if 
>>>>> you inherit from GenCollectorPolicy you should be adapting the 
>>>>> young gen size?
>>>>
>>>> I get now why you're hesitant.  I have first hand knowledge about 
>>>> this and
>>>> the intent was that MaxGCPauseMillis be used. MaxGCPauseMillis is the
>>>> pause limit that should generally be used. MaxGCMinorPauseMillis is
>>>> a refinement on the general policy.   In the early days of GC ergo 
>>>> there
>>>> was lots of concern that it was not practical to set a pause time goal
>>>> for all the pauses.  MaxGCMinorPauseMillis was never advertise but
>>>> was there in case it turned out that it is indeed impractical to set a
>>>> goal for all pauses.
>>>
>>> Thanks for this explanation, Jon! Now I feel more comfortable with 
>>> this change. I'll respond on the list to say that I'm fine with the 
>>> change.
>>>
>>> You say that people were worried that it was not good enough to have 
>>> one pause target for all pauses. Do you think this is still a valid 
>>> concern, or should we file an RFE to remove the distinction between 
>>> MaxGCPauseMillis and MaxGCMinorPauseMillis?
>>
>> I think that MaxGCMinorPauseMillis should be removed.
>
> Great! :)
>
> I filed a bug (RFE) for it:
> https://jbs.oracle.com/bugs/browse/JDK-8008368
>
> Bengt
>
>>
>> Jon
>>>
>>> Bengt
>>>
>>>>
>>>> Jon
>>>>
>>>>>
>>>>> Anyway, I won't object to this being pushed. But I would not like 
>>>>> to be listed as reviewer since I really don't know how to verify 
>>>>> that this is an improvement and not a regression.
>>>>>
>>>>> Thanks,
>>>>> Bengt
>>>>>
>>>>>
>>>>>>
>>>>>> Jon
>>>>>>
>>>>>>>
>>>>>>> A short while ago, Erik sent out a suggestion for how to 
>>>>>>> refactor the
>>>>>>> CollectorPolicy code. Maybe we should just file a bug for 
>>>>>>> cleaning this up
>>>>>>> properly? In that bug we could add a comment about this 
>>>>>>> particular change.
>>>>>>>
>>>>>>> I'm not sure what the best approach is. I kind of prefer 
>>>>>>> removing dead code
>>>>>>> paths rather than trying to fix them. Also, I don't know how to 
>>>>>>> verify that the
>>>>>>> fix is correct if it is not being used.
>>>>>>>
>>>>>>> Anybody else have any opinions on this?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Bengt
>>>>>>>
>>>>>>> >
>>>>>>> > Thanks.
>>>>>>> > Tao
>>>>>>> >
>>>>>>> > On 2/12/2013 12:03 PM, Tao Mao wrote:
>>>>>>> >> Hi all,
>>>>>>> >>
>>>>>>> >> After some investigation of the code, I figured out that the 
>>>>>>> only opportunity
>>>>>>> >> where my change will affect something is in MarkSweepPolicy, 
>>>>>>> which is
>>>>>>> >> supposed to support sizing policy (if any) for UseSerialGC [1].
>>>>>>> >>
>>>>>>> >> But, I think, SerialGC hasn't implemented any sizing policy 
>>>>>>> associated with
>>>>>>> >> flags MaxGC*PauseMillis (I'm not sure about it tho/). So, we 
>>>>>>> can conclude
>>>>>>> >> that if a customer uses SerialGC, they wouldn't set 
>>>>>>> MaxGC*PauseMillis anyway.
>>>>>>> >>
>>>>>>> >> (Bengt and Jon, please help verify the above conclusion.)
>>>>>>> >>
>>>>>>> >> Therefore, the change will fix the initialization bug in a 
>>>>>>> safe way.
>>>>>>> >>
>>>>>>> >> It did take me a while to figure out the class and method 
>>>>>>> hierarchy. I share
>>>>>>> >> the diagram as a pic below. Hopefully, we can use it to build 
>>>>>>> up a better
>>>>>>> >> class design if possible.
>>>>>>> >>
>>>>>>> >> Sorry for the handwriting and not being able to draw a 
>>>>>>> fancier diagram in text .
>>>>>>> >>
>>>>>>> >> Thanks.
>>>>>>> >> Tao
>>>>>>> >>
>>>>>>> >> [1] in Universe::initialize_heap()
>>>>>>> >>
>>>>>>> >> if (UseParallelGC) {
>>>>>>> >> #ifndef SERIALGC
>>>>>>> >>     Universe::_collectedHeap = new ParallelScavengeHeap();
>>>>>>> >> #else  // SERIALGC
>>>>>>> >>     fatal("UseParallelGC not supported in this VM.");
>>>>>>> >> #endif // SERIALGC
>>>>>>> >>
>>>>>>> >>   } else if (UseG1GC) {
>>>>>>> >> #ifndef SERIALGC
>>>>>>> >>     G1CollectorPolicy* g1p = new G1CollectorPolicy();
>>>>>>> >>     G1CollectedHeap* g1h = new G1CollectedHeap(g1p);
>>>>>>> >>     Universe::_collectedHeap = g1h;
>>>>>>> >> #else  // SERIALGC
>>>>>>> >>     fatal("UseG1GC not supported in java kernel vm.");
>>>>>>> >> #endif // SERIALGC
>>>>>>> >>
>>>>>>> >>   } else {
>>>>>>> >>     GenCollectorPolicy *gc_policy;
>>>>>>> >>
>>>>>>> >>     if (UseSerialGC) {
>>>>>>> >>       gc_policy = new MarkSweepPolicy();
>>>>>>> >>     } else if (UseConcMarkSweepGC) {
>>>>>>> >> #ifndef SERIALGC
>>>>>>> >>       if (UseAdaptiveSizePolicy) {
>>>>>>> >>         gc_policy = new ASConcurrentMarkSweepPolicy();
>>>>>>> >>       } else {
>>>>>>> >>         gc_policy = new ConcurrentMarkSweepPolicy();
>>>>>>> >>       }
>>>>>>> >> #else   // SERIALGC
>>>>>>> >>     fatal("UseConcMarkSweepGC not supported in this VM.");
>>>>>>> >> #endif // SERIALGC
>>>>>>> >>     } else { // default old generation
>>>>>>> >>       gc_policy = new MarkSweepPolicy();
>>>>>>> >>     }
>>>>>>> >>
>>>>>>> >>     Universe::_collectedHeap = new GenCollectedHeap(gc_policy);
>>>>>>> >>   }
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On 2/10/2013 1:34 PM, Bengt Rutisson wrote:
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>> Hi Tao,
>>>>>>> >>>
>>>>>>> >>> Thanks for your answers. Some comments inline.
>>>>>>> >>>
>>>>>>> >>> On 2/9/13 12:44 AM, Tao Mao wrote:
>>>>>>> >>>> I'm trying to answer these questions. Please see inline. 
>>>>>>> Thanks.
>>>>>>> >>>> Tao
>>>>>>> >>>>
>>>>>>> >>>> On 2/8/13 8:17 AM, Bengt Rutisson wrote:
>>>>>>> >>>>>
>>>>>>> >>>>> Hi Tao,
>>>>>>> >>>>>
>>>>>>> >>>>> I think the change looks good and makes sense.
>>>>>>> >>>>>
>>>>>>> >>>>> But I have a hard time estimating the implications of this 
>>>>>>> change.
>>>>>>> >>>>>
>>>>>>> >>>>> It seems like MaxGCMinorPauseMillis is only used by CMS 
>>>>>>> and ParallelScavenge.
>>>>>>> >>>> You are right. Before the fix, four occurrences below. (The 
>>>>>>> one in
>>>>>>> >>>> collectorPolicy.cpp is suspected to be wrong)
>>>>>>> >>>>
>>>>>>> >>>> MaxGCMinorPauseMillis is introduced in different kinds of 
>>>>>>> classes for CMS
>>>>>>> >>>> and PS: ConcurrentMarkSweep*Policy* and ParalleScavenge*Heap*.
>>>>>>> >>>>
>>>>>>> >>>> $ grep -nr "MaxGCMinorPauseMillis"
>>>>>>> >>>> 
>>>>>>> src/src/share/vm/gc_implementation/concurrentMarkSweep/cmsCollectorPolicy.cpp:87:
>>>>>>> >>>> double max_gc_minor_pause_sec = ((double) 
>>>>>>> MaxGCMinorPauseMillis)/1000.0;
>>>>>>> >>>> 
>>>>>>> src/share/vm/gc_implementation/parallelScavenge/parallelScavengeHeap.cpp:156:
>>>>>>> >>>> double max_gc_minor_pause_sec = ((double) 
>>>>>>> MaxGCMinorPauseMillis)/1000.0;
>>>>>>> >>>> src/share/vm/memory/collectorPolicy.cpp:170: const double
>>>>>>> >>>> max_gc_minor_pause_sec = ((double) 
>>>>>>> MaxGCMinorPauseMillis)/1000.0;
>>>>>>> >>>> src/share/vm/runtime/globals.hpp:2043: product(uintx,
>>>>>>> >>>> MaxGCMinorPauseMillis, max_uintx,                          \
>>>>>>> >>>>> For both of those the default for MaxGCMinorPauseMillis is 
>>>>>>> the same as
>>>>>>> >>>>> MaxGCPauseMillis so it should be fine.
>>>>>>> >>>> How did you infer that the default value for 
>>>>>>> MaxGCMinorPauseMillis is same
>>>>>>> >>>> as MaxGCPauseMillis? I didn't see such an indication in the 
>>>>>>> code.
>>>>>>> >>>
>>>>>>> >>> I actually ran with both collectors using 
>>>>>>> -XX:+PrintFlagsFinal and looked at
>>>>>>> >>> the values printed. But you find it in the code too. Both 
>>>>>>> flags are declared
>>>>>>> >>> in globals.hpp as having max_uintx as default value:
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>   product(uintx, MaxGCPauseMillis, 
>>>>>>> max_uintx,                               \
>>>>>>> >>>           "Adaptive size policy maximum GC pause time goal 
>>>>>>> in msec, "       \
>>>>>>> >>>           "or (G1 Only) the max. GC time per MMU time 
>>>>>>> slice")               \
>>>>>>> >>>
>>>>>>> >>> product(uintx, MaxGCMinorPauseMillis, 
>>>>>>> max_uintx,                          \
>>>>>>> >>>           "Adaptive size policy maximum GC minor pause time 
>>>>>>> goal in msec")  \
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>>>
>>>>>>> >>>>> But what happens if a customer is running with
>>>>>>> >>>>> -XX:MaxGCMinorPauseMillis=1000 but not setting 
>>>>>>> MaxGCPauseMillis on the
>>>>>>> >>>>> command line? What are the implications of your change for 
>>>>>>> such runs? I
>>>>>>> >>>>> don't think it is very common. Just thinking.
>>>>>>> >>>> For ParallelScavenge, customers should distinguish the two 
>>>>>>> flags.
>>>>>>> >>>
>>>>>>> >>> Yes, they should, but my point was that if a customer was 
>>>>>>> just setting one
>>>>>>> >>> of the flags their application might behave differently now. 
>>>>>>> And I was
>>>>>>> >>> trying to understand in what way it would behave different. 
>>>>>>> But I actually
>>>>>>> >>> think that your change will not affect ParallelScavenge 
>>>>>>> anyway. See below.
>>>>>>> >>>
>>>>>>> >>>> For CMS, I wonder whether we have fully implemented 
>>>>>>> adaptive sizing policy.
>>>>>>> >>>> Jon, can you comment on this?
>>>>>>> >>>> In either case, customers should be able to tell apart.
>>>>>>> >>>
>>>>>>> >>> You are correct. Implementing adaptive sizing was never 
>>>>>>> completed for CMS
>>>>>>> >>> so, it is turned off. Let's ignore CMS for now.
>>>>>>> >>>
>>>>>>> >>>>>
>>>>>>> >>>>> Also, a more common case is probably to run with only 
>>>>>>> -XX:MaxGCPauseMillis
>>>>>>> >>>>> set on the command line. How will the adaptive sizing 
>>>>>>> behave differently
>>>>>>> >>>>> compared to before?
>>>>>>> >>>> The fix only changed the value in the routine
>>>>>>> >>>> GenCollectorPolicy::initialize_size_policy, which is 
>>>>>>> believed to be called
>>>>>>> >>>> within
>>>>>>> >>>> the routine of GenCollectedHeap::post_initialize(). I'm not 
>>>>>>> sure whether
>>>>>>> >>>> this routine can change sizing behaviors somewhere.
>>>>>>> >>>
>>>>>>> >>> post_initialize() is called from universe_post_init() for 
>>>>>>> all collectors.
>>>>>>> >>> But only the CollectedHeaps that inherit from 
>>>>>>> GenCollectedHeap will be
>>>>>>> >>> affected by your change.
>>>>>>> >>>
>>>>>>> >>> ParallelScavengeHeap does not inherit from GenCollectedHeap 
>>>>>>> and it
>>>>>>> >>> implements its own post_initialize(), so I think it is 
>>>>>>> unaffected by your
>>>>>>> >>> change.
>>>>>>> >>>
>>>>>>> >>> So, that's why I came to the conclusion that the code that 
>>>>>>> your are changing
>>>>>>> >>> is actually not being used by anyone. If this is correct I 
>>>>>>> think your change
>>>>>>> >>> is safe and for the better. But I also think that we could 
>>>>>>> possibly clean up
>>>>>>> >>> the code and get rid of some code paths. This should 
>>>>>>> probably be done as a
>>>>>>> >>> separate change though.
>>>>>>> >>>
>>>>>>> >>> Bengt
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>>>
>>>>>>> >>>>> Looking closer at the code it looks like CMS and 
>>>>>>> ParallelScavenge are
>>>>>>> >>>>> actually using CMSAdaptiveSizePolicy and PSAdaptiveSizePolicy
>>>>>>> >>>>> respectively. Both of these already pass the correct value 
>>>>>>> to the
>>>>>>> >>>>> constructor of the super class AdaptiveSizePolicy. So, 
>>>>>>> this bug is
>>>>>>> >>>>> currently benign. The code that you are changing is 
>>>>>>> actually not being
>>>>>>> >>>>> used. Is this a correct conclusion?
>>>>>>> >>>> See above. Whether the code I'm changing is being used 
>>>>>>> depends upon whether
>>>>>>> >>>> GenCollectedHeap::post_initialize() is called anywhere. But 
>>>>>>> I can't
>>>>>>> >>>> determine that right now.
>>>>>>> >>>>
>>>>>>> >>>> Other than that, the fix should correct the bug while 
>>>>>>> preserving everything
>>>>>>> >>>> else.
>>>>>>> >>>>>
>>>>>>> >>>>> Thanks,
>>>>>>> >>>>> Bengt
>>>>>>> >>>>>
>>>>>>> >>>>> On 2/8/13 3:42 AM, Tao Mao wrote:
>>>>>>> >>>>>> 8007764: Wrong initialized value of max_gc_pause_sec for 
>>>>>>> an instance of
>>>>>>> >>>>>> class AdaptiveSizePolicy
>>>>>>> >>>>>> https://jbs.oracle.com/bugs/browse/JDK-8007764
>>>>>>> >>>>>>
>>>>>>> >>>>>> webrev:
>>>>>>> >>>>>> http://cr.openjdk.java.net/~tamao/8007764/webrev.00/
>>>>>>> >>>>>>
>>>>>>> >>>>>> changeset:
>>>>>>> >>>>>>
>>>>>>> >>>>>
>>>>>>> >>>>
>>>>>>> >>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>




More information about the hotspot-gc-dev mailing list