RFR: JDK-8152952: Allow G1 phase logging to use individual number of threads

Bengt Rutisson bengt.rutisson at oracle.com
Wed Mar 30 14:11:16 UTC 2016


Hi Thomas,

Thanks for looking at this!

On 2016-03-30 11:15, Thomas Schatzl wrote:
> Hi Bengt,
>
> On Tue, 2016-03-29 at 14:03 +0200, Bengt Rutisson wrote:
>> Hi everyone,
>>
>> Could I have a couple of reviews for this change?
>>
>> http://cr.openjdk.java.net/~brutisso/8152952/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8152952
>>
>> Currently if you run with UseDynamicNumberOfGCThreads you can
>> potentially get a different number of worker threads each GC. There
>> are improvements coming where we want to select a different number of
>> worker threads for individual phases. The G1GCPhaseTimes and
>> WorkerDataArray structures need to support this.
>>
>> The proposed patch sets all slots in the WorkerDataArray to an
>> uninitialized value and then only print any values that have actually
>> been set for that phase.
>>
>> The patch also extends the log message about the number for worker
>> threads to also say how many it could potentially have used. And it
>> also fixes a missing space in the level 3 and level 4 indentation.
>>
>> After applying this patch and running with -Xlog:gc*,phases*=trace
>> you get output like:
>> [0,581s][info][gc,task ] GC(0) GC Workers: using 2 out of 23
>> [0,588s][info][gc,phases] GC(0)   Evacuate Collection Set: 5,0ms
>> [0,588s][trace][gc,phases] GC(0)       GC Worker Start (ms):     Min:
>> 580,9, Avg: 580,9, Max: 580,9, Diff:  0,0
> It would be useful to have the information about the number of threads
> used for every top-level WorkerDataArray. That might differ for every
> phase in the future. Now it does not matter, because at the moment
> every thread at least sets the time spent to zero (i.e. is forced to),
> but that will not be the case later.
>
> Not working on something is different to taking "zero" time for it.
>
> My suggestion is to add a ", Workers: X" column to the summary output, like
>
>> [0,588s][debug][gc,phases     ] GC(0)     Ext Root Scanning (ms):
>> Min:  1,7, Avg:  1,7, Max:  1,8, Diff:  0,0, Sum:  3,5
> Min:  1,7, Avg:  1,7, Max:  1,8, Diff:  0,0, Sum:  3,5, Workers: 2

I added that. The log now looks like this:

[0,613s][debug][gc,phases     ] GC(0)     Ext Root Scanning (ms):   
Min:  2,4, Avg:  2,5, Max:  2,7, Diff:  0,3, Sum:  5,0, Workers: 2

>
>> [0,589s][trace][gc,phases,task] GC(0)
>> 1,8  1,7   -   -   -   - -   -   -   -   -   -   -   -   -   -   -
>> -
>
>> [0,588s][trace][gc,phases,task] GC(0)
>> 580,9 580,9   -   -   - -   -   -   -   -   -   -   -   -   -   -   -
>>     
> - g1GCPhaseTimes.cpp:109: typo in ASSERT_PHASE_UNINITILAIZED, should be
> ASSERT_PHASE_UNINITIALIZED.

Fixed. Thanks for catching that! :)

>
> - ignore the following if wanted: in g1GCPhaseTimes.cpp:127 the if
> -clause is structured to have only the asserts in the if-part. I would
> prefer if the code that performs some useful work would be first, i.e.
> the condition reversed. As mentioned, ymmv.

Done.

>
> - I would kind of prefer if ADD_WORKER_KNOWN time were a method instead
> of a macro. In case of an uninitialized value that method could just
> return 0.0, which would be okay for this addition.

Done.

>
> - in WorkerDataArray::WDAPrinter::details, the "unknown value" imo does
> not need to be padded out to five spaces or so. I think the main
> motivation for this has been to show the values of the different phases
> of the evacuation phase in the same column.
>
> The details() method for the size_t values is not aligned at all.
>
> The main reason is waste of space. Not sure here, in doubt keep it.

Removed the padding for both variants.

>
> - I would prefer if the #include "memory/resourceArea.hpp" were put
> next to the other includes, of course guarded by the define.

I prefer to keep it closer to the tests. If you have strong opinions I 
can move it up, but since it is only test related I prefer to have it there.

>
> - in workerDataArray.hpp, at the definitions of sum() and average() it
> might be useful to mention what impact on sum/average uninitialized
> values have.

Added a comment.

>
> - I would prefer, if a phase has no data, that this would be detected
> automatically and the phase either not printed, or (preferably)
> indicated that it has not been executed. (I could also live with a
> solution where the programmer can choose what happens when there is no
> data. That would probably also remove the dependencies on a lot of
> other components, since as of right now there is quite a bit of
> checking for particular circumstances in G1GCPhaseTimes::print()).
>
> I do not really like the solution based on this change presented for
> JDK-8152428 that the programmer is responsible for explicitly
> specifying that a phase has not been executed. This will be forgotten,
> and only causes unnecessary failures potentially long after a change
> has been committed.
>
> The code to print has to iterate over all elements to sum/average them
> up anyway, so it already knows that there is no data for a particular
> phase.
>
> The reason why I prefer an indication that a phase has not been
> executed is that instead of a missing line that a missing line easy to
> overlook, while a "not executed" line is much more visible.

I've changed to do mostly what you suggested. A skipped phase now logs:

[6,339s][trace][gc,phases      ] GC(79)       Parallel Preserve CM Refs 
(ms):: skipped

But you can't control that by a configuration on the WorkerDataArray. On 
the other hand you can control it just like you did before, but guarding 
the call in the print() menthod. Like for example StringDeduplication does.

Updated webrev:
http://cr.openjdk.java.net/~brutisso/8152952/webrev.02/

Diff compared to last version:
http://cr.openjdk.java.net/~brutisso/8152952/webrev.01-02.diff/

Thanks,
Bengt

>
> Thanks,
>    Thomas
>




More information about the hotspot-gc-dev mailing list