RFR (M): JDK-8068394: Trace event for concurrent GC phases

Tue Nov 24 13:12:19 UTC 2015

Hi Sangheon,

Sorry for the very late reply to this review request.

On 2015-10-22 01:40, sangheon.kim wrote:
> Hi all,
>
> Can I get some reviews for this change of adding a trace event for 
> concurrent phases on CMS and G1?
>
> Currently we only measure pause times for major phases.
> But I want to add 'concurrent mark' and 'concurrent sweep' phases for 
> CMS and 'concurrent mark' phase for G1.
> To achieve this, I had to change ConcurrentGCTimer and related classes.
>
> This patch includes:
> 1) A concurrent phase can be measured only from ConcurrentGCTimer and 
> measuring an overlap between concurrent and pause is not allowed as 
> currently we don't have that use case. And TimePartitions class(or 
> related classes) will be simpler than an overlap allowed case.
> 2) I removed PausePhase and ConcurrentPhase which are derived from 
> GCPhase because I wanted to avoid heap allocation when adding to 
> GrowableArray. Instead introduced 'type' member variable at GCPhase.
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8068394
> Webrev: http://cr.openjdk.java.net/~sangheki/8068394/webrev.00/

Nice work! It is great to get some timing information for the concurrent 
phases.

A few questions/comments:

CMS.

You have added timing events for the concurrent phases mark and sweep, 
but not for other concurrent phases (preclean, abortable_preclean and 
reset_concurrent). I think that if you moved your calls to 
_gc_timer_cm->register_gc_concurrent_start() and 
_gc_timer_cm->register_gc_concurrent_end() into the constructor and 
destructor of CMSPhaseAccounting you would automatically get timing 
events for all concurrent phases.

G1.

I think the use of _concurrent_marking_from_roots is unfortunate. It 
would be much cleaner if 
ConcurrentMark::register_mark_from_roots_phase_end() called 
register_gc_concurrent_end() directly. I realize that this would change 
the timing for when a concurrent marking is aborted. The whole time for 
a full GC (or even several full GCs) would be included in the concurrent 
marking phase. But from a code perspective that is what happens, so 
maybe that is the correct time to report? Also, I think the logging is 
reported that way so if we want to make it easy to match the timing 
events with the logging we might want to use about the same scope for 
timing.

Why do we only measure the concurrent mark phase for G1? Wouldn't it be 
good to measure all concurrent phases?

gcTraceSend.cpp

  389   void visit(GCPhase* phase) {
  390     switch (phase->type()) {
  391       case GCPhase::PausePhaseType:
  392         assert(phase->level() < PhasesStack::PHASE_LEVELS, "Need 
more event types for PausePhase");
  393
  394         switch (phase->level()) {
  395           case 0: send_phase<EventGCPhasePause>(phase); break;
  396           case 1: send_phase<EventGCPhasePauseLevel1>(phase); break;
  397           case 2: send_phase<EventGCPhasePauseLevel2>(phase); break;
  398           case 3: send_phase<EventGCPhasePauseLevel3>(phase); break;
  399           default: /* Ignore sending this phase */ break;
  400         }
  401         break;
  402
  403       case GCPhase::ConcurrentPhaseType:
  404         assert(phase->level() < 1, "There's only one level for 
ConcurrentPhase");
  405
  406         switch (phase->level()) {
  407           case 0: send_phase<EventGCPhaseConcurrent>(phase); break;
  408           default: /* Ignore sending this phase */ break;
  409         }
  410         break;
  411     }
  412   }
  413 };

Since there are only two values for GCPhase::PhaseType it seems a bit 
odd to use a switch statement. I think I would prefer to factor out the 
code for the different types a bit too. So, maybe something like:

void visit(GCPhase* phase) {
   if (phase->type() == GCPhase::PausePhaseType) {
     visit_pause(phase);
   } else {
     assert(phase->type() == GCPhase::ConcurrentPhaseType, "");
     visit_concurrent(phase);
   }
}

Thanks,
Bengt

> Testing: JPRT
>
> Thanks,
> Sangheon