RFR (M): JDK-8068394: Trace event for concurrent GC phases

sangheon sangheon.kim at oracle.com
Tue Dec 15 06:27:06 UTC 2015


Hi Bengt and all,

Here's a next webrev.
Bengt and I had a discussion on other thread and he and Stefan Karlsson 
suggested simpler design.

http://cr.openjdk.java.net/~sangheki/8068394/webrev.01
http://cr.openjdk.java.net/~sangheki/8068394/webrev.01_to_00

This webrev includes:
1. Moving concurrent phase measurement into CMSPhaseAccounting class. So 
we can check all concurrent phases from CMS.

2. Renamed a new flag introduced from webrev.00 for general use, 
_concurrent_phase_started.

3. Added helper class for concurrent phase measurement, 
GCConcPhaseTimer. And used this class for all concurrent phases.

4. Flag checking is moved from 
G1CollectedHeap::register_concurrent_cycle_end() to ConcurrentMark::abort().

5. Changed switch() to if else statement at gcTraceSend.

Many thanks to Bengt and Stefan Karlsson for the discussion.

Thanks,
Sangheon


On 12/03/2015 12:03 AM, Bengt Rutisson wrote:
>
> Hi Sangheon,
>
> This all sounds good. Looking forward to the next webrev.
>
> Thanks,
> Bengt
>
> On 2015-12-02 19:08, sangheon.kim wrote:
>> Hi Bengt,
>>
>>
>> On 12/02/2015 01:00 AM, Bengt Rutisson wrote:
>>>
>>> Hi Sangheon,
>>>
>>> Sorry again for the late reply. Comments inline.
>> No not at all.
>> Thank you for taking time for this review!
>>
>>>
>>> On 2015-11-24 19:48, sangheon.kim wrote:
>>>> Hi Bengt,
>>>>
>>>> Thank you so much for reviewing this patch!
>>>>
>>>> On 11/24/2015 05:12 AM, Bengt Rutisson wrote:
>>>>>
>>>>> Hi Sangheon,
>>>>>
>>>>> Sorry for the very late reply to this review request.
>>>>>
>>>>> On 2015-10-22 01:40, sangheon.kim wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Can I get some reviews for this change of adding a trace event 
>>>>>> for concurrent phases on CMS and G1?
>>>>>>
>>>>>> Currently we only measure pause times for major phases.
>>>>>> But I want to add 'concurrent mark' and 'concurrent sweep' phases 
>>>>>> for CMS and 'concurrent mark' phase for G1.
>>>>>> To achieve this, I had to change ConcurrentGCTimer and related 
>>>>>> classes.
>>>>>>
>>>>>> This patch includes:
>>>>>> 1) A concurrent phase can be measured only from ConcurrentGCTimer 
>>>>>> and measuring an overlap between concurrent and pause is not 
>>>>>> allowed as currently we don't have that use case. And 
>>>>>> TimePartitions class(or related classes) will be simpler than an 
>>>>>> overlap allowed case.
>>>>>> 2) I removed PausePhase and ConcurrentPhase which are derived 
>>>>>> from GCPhase because I wanted to avoid heap allocation when 
>>>>>> adding to GrowableArray. Instead introduced 'type' member 
>>>>>> variable at GCPhase.
>>>>>>
>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8068394
>>>>>> Webrev: http://cr.openjdk.java.net/~sangheki/8068394/webrev.00/
>>>>>
>>>>> Nice work! It is great to get some timing information for the 
>>>>> concurrent phases.
>>>> Thanks!
>>>>
>>>>>
>>>>> A few questions/comments:
>>>>>
>>>>> CMS.
>>>>>
>>>>> You have added timing events for the concurrent phases mark and 
>>>>> sweep, but not for other concurrent phases (preclean, 
>>>>> abortable_preclean and reset_concurrent). I think that if you 
>>>>> moved your calls to _gc_timer_cm->register_gc_concurrent_start() 
>>>>> and _gc_timer_cm->register_gc_concurrent_end() into the 
>>>>> constructor and destructor of CMSPhaseAccounting you would 
>>>>> automatically get timing events for all concurrent phases.
>>>> I also considered about this idea but I was not clear whether it is 
>>>> good measurement especially for 'sweep'.
>>>> We are using CMSPhaseAccounting for 5 cases.
>>>> I think for 'mark' and 'preclean' they are okay. (there are 
>>>> non-product functions call before CMSPhaseAccounting. But they are 
>>>> okay).
>>>> For 'abortable-preclean' and 'reset', they are good to have that.
>>>> But when it comes to 'sweep', there are timer and 
>>>> CMSTokenSyncWithLocks related codes and I was not sure about these.
>>>> If you think they are also okay, I will change as you suggested.
>>>
>>> I think it is fine for the sweep part as well. If the 
>>> CMSPhaseAccounting instance is not measuring the correct part of the 
>>> work I think we should change it to measure what we want. That way 
>>> we improve the existing measurements too.
>>>
>>> But as you stated in your follow up email, I think the sweep code is 
>>> actually pretty much ok from this perspective.
>>>
>>> So, I would prefer to go with the CMSPhaseAccounting approach.
>> Okay, I already changed as you suggested. :)
>>
>>>
>>>
>>>>
>>>>>
>>>>>
>>>>> G1.
>>>>>
>>>>> I think the use of _concurrent_marking_from_roots is unfortunate. 
>>>>> It would be much cleaner if 
>>>>> ConcurrentMark::register_mark_from_roots_phase_end() called 
>>>>> register_gc_concurrent_end() directly. 
>>>> I wanted to avoid introducing the new flag but the reason of it 
>>>> was, as you know, to handle 'abort'.
>>>> When 'abort' happens we do need to end all timers before calling 
>>>> 'GCTracer::report_gc_end()'.
>>>> And at this point we need a way to know whether concurrent timer is 
>>>> started or not.
>>>
>>> As I tried to describe I think you can extend the scope of the 
>>> timing to cover the part until the marking thread discovers that the 
>>> marking was aborted. Basically moving the timing in to 
>>> ConcurrentMarkThread::run_service().
>> Oh, I mis-understood your point. Sorry.
>> I will try your recommendation.
>>
>>>
>>>>
>>>>> I realize that this would change the timing for when a concurrent 
>>>>> marking is aborted. The whole time for a full GC (or even several 
>>>>> full GCs) would be included in the concurrent marking phase. But 
>>>>> from a code perspective that is what happens, so maybe that is the 
>>>>> correct time to report? Also, I think the logging is reported that 
>>>>> way so if we want to make it easy to match the timing events with 
>>>>> the logging we might want to use about the same scope for timing.
>>>>>
>>>>> Why do we only measure the concurrent mark phase for G1? 
>>>> This is good question. :)
>>>> I was comparing CMS and G1 for major concurrent phases.
>>>> And the candidates were 'concurrent mark/sweep (CMS)' and 
>>>> 'concurrent mark(G1)'.
>>>>
>>>>> Wouldn't it be good to measure all concurrent phases?
>>>> Okay, Stefan Karlsson also pointed to have them as well so I filed 
>>>> a separate CR for this.
>>>> https://bugs.openjdk.java.net/browse/JDK-8143951
>>>
>>> Ok. Is there a reason why we want to split this up into two changes? 
>>> Why not add all the concurrent timing events at once?
>>>
>>> I'm ok with splitting it up into two changes, but then maybe the 
>>> title for JDK-8068394 should be changed to indicate that it only 
>>> adds events for the marking phase. Also, if we go with the 
>>> CMSPhaseAccounting approach for CMS, you get events for all phases 
>>> there. So, it will be a bit odd to have all concurrent phases 
>>> handled for CMS but only concurrent mark for G1.
>> You are correct and I agree with you.
>> The only reason of split was due to limited time for FC.
>> But as we have enough time, let me close JDK-8143951 and include 
>> concurrent mark for G1 as well.
>>
>>>
>>>>
>>>>>
>>>>> gcTraceSend.cpp
>>>>>
>>>>>  389   void visit(GCPhase* phase) {
>>>>>  390     switch (phase->type()) {
>>>>>  391       case GCPhase::PausePhaseType:
>>>>>  392         assert(phase->level() < PhasesStack::PHASE_LEVELS, 
>>>>> "Need more event types for PausePhase");
>>>>>  393
>>>>>  394         switch (phase->level()) {
>>>>>  395           case 0: send_phase<EventGCPhasePause>(phase); break;
>>>>>  396           case 1: send_phase<EventGCPhasePauseLevel1>(phase); 
>>>>> break;
>>>>>  397           case 2: send_phase<EventGCPhasePauseLevel2>(phase); 
>>>>> break;
>>>>>  398           case 3: send_phase<EventGCPhasePauseLevel3>(phase); 
>>>>> break;
>>>>>  399           default: /* Ignore sending this phase */ break;
>>>>>  400         }
>>>>>  401         break;
>>>>>  402
>>>>>  403       case GCPhase::ConcurrentPhaseType:
>>>>>  404         assert(phase->level() < 1, "There's only one level 
>>>>> for ConcurrentPhase");
>>>>>  405
>>>>>  406         switch (phase->level()) {
>>>>>  407           case 0: send_phase<EventGCPhaseConcurrent>(phase); 
>>>>> break;
>>>>>  408           default: /* Ignore sending this phase */ break;
>>>>>  409         }
>>>>>  410         break;
>>>>>  411     }
>>>>>  412   }
>>>>>  413 };
>>>>>
>>>>> Since there are only two values for GCPhase::PhaseType it seems a 
>>>>> bit odd to use a switch statement. I think I would prefer to 
>>>>> factor out the code for the different types a bit too. So, maybe 
>>>>> something like:
>>>>>
>>>>> void visit(GCPhase* phase) {
>>>>>   if (phase->type() == GCPhase::PausePhaseType) {
>>>>>     visit_pause(phase);
>>>>>   } else {
>>>>>     assert(phase->type() == GCPhase::ConcurrentPhaseType, "");
>>>>>     visit_concurrent(phase);
>>>>>   }
>>>>> }
>>>> This seems better. I will fix.
>>>
>>> Good. Thanks!
>> :)
>>
>> I will post next webrev when I'm ready, including JDK-8143951 part as 
>> well.
>>
>> Thanks,
>> Sangheon
>>
>>
>>>
>>> Bengt
>>>
>>>>
>>>> Thanks,
>>>> Sangheon
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Bengt
>>>>>
>>>>>> Testing: JPRT
>>>>>>
>>>>>> Thanks,
>>>>>> Sangheon
>>>>>
>>>>
>>>
>>
>




More information about the hotspot-gc-dev mailing list