RFR (M): JDK-8068394: Trace event for concurrent GC phases
Bengt Rutisson
bengt.rutisson at oracle.com
Wed Dec 2 09:00:08 UTC 2015
Hi Sangheon,
Sorry again for the late reply. Comments inline.
On 2015-11-24 19:48, sangheon.kim wrote:
> Hi Bengt,
>
> Thank you so much for reviewing this patch!
>
> On 11/24/2015 05:12 AM, Bengt Rutisson wrote:
>>
>> Hi Sangheon,
>>
>> Sorry for the very late reply to this review request.
>>
>> On 2015-10-22 01:40, sangheon.kim wrote:
>>> Hi all,
>>>
>>> Can I get some reviews for this change of adding a trace event for
>>> concurrent phases on CMS and G1?
>>>
>>> Currently we only measure pause times for major phases.
>>> But I want to add 'concurrent mark' and 'concurrent sweep' phases
>>> for CMS and 'concurrent mark' phase for G1.
>>> To achieve this, I had to change ConcurrentGCTimer and related classes.
>>>
>>> This patch includes:
>>> 1) A concurrent phase can be measured only from ConcurrentGCTimer
>>> and measuring an overlap between concurrent and pause is not allowed
>>> as currently we don't have that use case. And TimePartitions
>>> class(or related classes) will be simpler than an overlap allowed case.
>>> 2) I removed PausePhase and ConcurrentPhase which are derived from
>>> GCPhase because I wanted to avoid heap allocation when adding to
>>> GrowableArray. Instead introduced 'type' member variable at GCPhase.
>>>
>>> CR: https://bugs.openjdk.java.net/browse/JDK-8068394
>>> Webrev: http://cr.openjdk.java.net/~sangheki/8068394/webrev.00/
>>
>> Nice work! It is great to get some timing information for the
>> concurrent phases.
> Thanks!
>
>>
>> A few questions/comments:
>>
>> CMS.
>>
>> You have added timing events for the concurrent phases mark and
>> sweep, but not for other concurrent phases (preclean,
>> abortable_preclean and reset_concurrent). I think that if you moved
>> your calls to _gc_timer_cm->register_gc_concurrent_start() and
>> _gc_timer_cm->register_gc_concurrent_end() into the constructor and
>> destructor of CMSPhaseAccounting you would automatically get timing
>> events for all concurrent phases.
> I also considered about this idea but I was not clear whether it is
> good measurement especially for 'sweep'.
> We are using CMSPhaseAccounting for 5 cases.
> I think for 'mark' and 'preclean' they are okay. (there are
> non-product functions call before CMSPhaseAccounting. But they are okay).
> For 'abortable-preclean' and 'reset', they are good to have that.
> But when it comes to 'sweep', there are timer and
> CMSTokenSyncWithLocks related codes and I was not sure about these.
> If you think they are also okay, I will change as you suggested.
I think it is fine for the sweep part as well. If the CMSPhaseAccounting
instance is not measuring the correct part of the work I think we should
change it to measure what we want. That way we improve the existing
measurements too.
But as you stated in your follow up email, I think the sweep code is
actually pretty much ok from this perspective.
So, I would prefer to go with the CMSPhaseAccounting approach.
>
>>
>>
>> G1.
>>
>> I think the use of _concurrent_marking_from_roots is unfortunate. It
>> would be much cleaner if
>> ConcurrentMark::register_mark_from_roots_phase_end() called
>> register_gc_concurrent_end() directly.
> I wanted to avoid introducing the new flag but the reason of it was,
> as you know, to handle 'abort'.
> When 'abort' happens we do need to end all timers before calling
> 'GCTracer::report_gc_end()'.
> And at this point we need a way to know whether concurrent timer is
> started or not.
As I tried to describe I think you can extend the scope of the timing to
cover the part until the marking thread discovers that the marking was
aborted. Basically moving the timing in to
ConcurrentMarkThread::run_service().
>
>> I realize that this would change the timing for when a concurrent
>> marking is aborted. The whole time for a full GC (or even several
>> full GCs) would be included in the concurrent marking phase. But from
>> a code perspective that is what happens, so maybe that is the correct
>> time to report? Also, I think the logging is reported that way so if
>> we want to make it easy to match the timing events with the logging
>> we might want to use about the same scope for timing.
>>
>> Why do we only measure the concurrent mark phase for G1?
> This is good question. :)
> I was comparing CMS and G1 for major concurrent phases.
> And the candidates were 'concurrent mark/sweep (CMS)' and 'concurrent
> mark(G1)'.
>
>> Wouldn't it be good to measure all concurrent phases?
> Okay, Stefan Karlsson also pointed to have them as well so I filed a
> separate CR for this.
> https://bugs.openjdk.java.net/browse/JDK-8143951
Ok. Is there a reason why we want to split this up into two changes? Why
not add all the concurrent timing events at once?
I'm ok with splitting it up into two changes, but then maybe the title
for JDK-8068394 should be changed to indicate that it only adds events
for the marking phase. Also, if we go with the CMSPhaseAccounting
approach for CMS, you get events for all phases there. So, it will be a
bit odd to have all concurrent phases handled for CMS but only
concurrent mark for G1.
>
>>
>> gcTraceSend.cpp
>>
>> 389 void visit(GCPhase* phase) {
>> 390 switch (phase->type()) {
>> 391 case GCPhase::PausePhaseType:
>> 392 assert(phase->level() < PhasesStack::PHASE_LEVELS, "Need
>> more event types for PausePhase");
>> 393
>> 394 switch (phase->level()) {
>> 395 case 0: send_phase<EventGCPhasePause>(phase); break;
>> 396 case 1: send_phase<EventGCPhasePauseLevel1>(phase);
>> break;
>> 397 case 2: send_phase<EventGCPhasePauseLevel2>(phase);
>> break;
>> 398 case 3: send_phase<EventGCPhasePauseLevel3>(phase);
>> break;
>> 399 default: /* Ignore sending this phase */ break;
>> 400 }
>> 401 break;
>> 402
>> 403 case GCPhase::ConcurrentPhaseType:
>> 404 assert(phase->level() < 1, "There's only one level for
>> ConcurrentPhase");
>> 405
>> 406 switch (phase->level()) {
>> 407 case 0: send_phase<EventGCPhaseConcurrent>(phase); break;
>> 408 default: /* Ignore sending this phase */ break;
>> 409 }
>> 410 break;
>> 411 }
>> 412 }
>> 413 };
>>
>> Since there are only two values for GCPhase::PhaseType it seems a bit
>> odd to use a switch statement. I think I would prefer to factor out
>> the code for the different types a bit too. So, maybe something like:
>>
>> void visit(GCPhase* phase) {
>> if (phase->type() == GCPhase::PausePhaseType) {
>> visit_pause(phase);
>> } else {
>> assert(phase->type() == GCPhase::ConcurrentPhaseType, "");
>> visit_concurrent(phase);
>> }
>> }
> This seems better. I will fix.
Good. Thanks!
Bengt
>
> Thanks,
> Sangheon
>
>
>>
>> Thanks,
>> Bengt
>>
>>> Testing: JPRT
>>>
>>> Thanks,
>>> Sangheon
>>
>
More information about the hotspot-gc-dev
mailing list