JEP 271: Unified GC Logging

Bengt Rutisson bengt.rutisson at oracle.com
Fri Sep 25 09:29:34 UTC 2015


Hi Kirk,

On 2015-09-24 18:18, Kirk Pepperdine wrote:
> Hi Bengt,
>
>
>> On Sep 24, 2015, at 2:08 PM, Bengt Rutisson <bengt.rutisson at oracle.com> wrote:
>>
>>
>>
>> On 2015-09-24 04:24, mark.reinhold at oracle.com wrote:
>>> New JEP Candidate: http://openjdk.java.net/jeps/271
>>>
>>> - Mark
>> Hi everyone,
>>
>> I've started implementing a prototype for this JEP and thought I'd provide some samples of what I have done so far. Below are a few examples. There is more to think about and discuss than what I have shown below, but I hope this is a good start. I will post more updates as I move forward with the prototype.
>>
>> Running with -XX:+PrintGC in the current code will give logging like this:
>>
>> #14: [GC pause (G1 Evacuation Pause) (young) 106M->63M(128M), 0,0089484 secs]
>>
>> This line is actually printed in several steps. At the start of the GC the first part is written, then at the end of the GC the second part is written (actually the last part is written in two steps, first the heap usage information and then the timing information).
>>
>> The unified logging framework does not support logging partial lines (for good reasons), so I'm replacing this single line with two lines logged on the "gc" tag at "info" level. That is, in my prototype running with -Xlog:gc will give the following log lines:
>>
>> [0,743s][info][gc] GC#14 start [young, G1 Evacuation Pause]
>> [0,756s][info][gc] GC#14 end [111,1M->100,2M(128,0M)] [12,808 ms]
>>
>> The exact formatting of these lines is not set in stone yet, and I'm happy to hear some suggestions on how to format this nicely, but some important things I'd like to point out:
>>
>> - Splitting the single log line up into two lines makes sure that we don't get other logging interleaved inside the GC start/end log lines.
>> - The timestamp decorations allows for nicer tracking of GC times than just the duration and a single timestamp (as if you would run with PrintGCTimestamps today).
>> - I'm logging the duration of the GC in milliseconds rather than seconds. Using seconds sets the wrong expectations IMHO.
>> - The GC number (GC#14 in the example above) is logged using the prefix support in the unified logging framework. Thus it will be added to all GC logging automatically.
> Could you not simply wait and then log when things are completed? I think this partial logging is what lead to the confusing formats we have. Granted the confusion would be spread across a number of lines which may offer some relief but I’d suggest that it would be better just to wait. Any analysis I do rarely relies on a single log record. Losing one generally isn’t a big deal especially if it’s at the end of the file.

For us to be able to debug crashes it is sometimes very helpful to have 
the log records up until the point of the crash. Thus, I would not like 
to wait with the logging until the end of the GC. On the contrary, I 
think it is important to log any (complete) information as soon as possible.

>>
>> For those that really just want one line per GC, I've added a "summary" tag. Running with -Xlog:gc+summary will give one line per GC that looks like this:
>>
>> [0,768s][info][gc,summary] GC#14 [young, G1 Evacuation Pause] [112,5M->100,6M(128,0M)] [15,697 ms]
>>
>> This line is printed at the *end* of the the GC. It will be logged with one single log statement, so no other logging can be interleaved with it.
> Will this work with concurrent phases and if so, how would that affect concurrent phases.

Maybe I am misunderstanding the question, but my intent is to log an GC 
summary entry at the end of the complete concurrent cycle.

>>
>>
>> Running with PrintGCDetails in the current code give some more information than with just PrintGC. In particular G1 is pretty verbose when run with PrintGCDetails. A typical G1 log section for a GC when run with PrintGCDetails looks like this:
>>
>> #14: [GC pause (G1 Evacuation Pause) (young), 0,0098534 secs]
>>    [Parallel Time: 3,2 ms, GC Workers: 23]
>>       [GC Worker Start (ms): Min: 530,9, Avg: 531,3, Max: 531,6, Diff: 0,7]
>>       [Ext Root Scanning (ms): Min: 0,0, Avg: 0,1, Max: 0,4, Diff: 0,4, Sum: 1,8]
>>       [Update RS (ms): Min: 0,0, Avg: 0,2, Max: 1,8, Diff: 1,8, Sum: 4,0]
>>          [Processed Buffers: Min: 0, Avg: 1,0, Max: 3, Diff: 3, Sum: 24]
>>       [Scan RS (ms): Min: 0,0, Avg: 0,0, Max: 0,1, Diff: 0,1, Sum: 0,2]
>>       [Code Root Scanning (ms): Min: 0,0, Avg: 0,0, Max: 0,0, Diff: 0,0, Sum: 0,0]
>>       [Object Copy (ms): Min: 0,0, Avg: 1,1, Max: 1,3, Diff: 1,2, Sum: 26,2]
>>       [Termination (ms): Min: 0,0, Avg: 0,6, Max: 0,6, Diff: 0,6, Sum: 13,1]
>>          [Termination Attempts: Min: 1, Avg: 5,5, Max: 7, Diff: 6, Sum: 127]
>>       [GC Worker Other (ms): Min: 0,0, Avg: 0,0, Max: 0,2, Diff: 0,2, Sum: 1,1]
>>       [GC Worker Total (ms): Min: 1,7, Avg: 2,0, Max: 2,4, Diff: 0,8, Sum: 46,5]
>>       [GC Worker End (ms): Min: 533,3, Avg: 533,3, Max: 533,5, Diff: 0,2]
>>    [Code Root Fixup: 0,0 ms]
>>    [Code Root Purge: 0,0 ms]
>>    [Clear CT: 2,4 ms]
>>    [Other: 4,3 ms]
>>       [Choose CSet: 0,0 ms]
>>       [Ref Proc: 2,4 ms]
>>       [Ref Enq: 0,0 ms]
>>       [Redirty Cards: 1,7 ms]
>>       [Humongous Register: 0,0 ms]
>>       [Humongous Reclaim: 0,0 ms]
>>       [Free CSet: 0,1 ms]
>>    [Eden: 44,0M(44,0M)->0,0B(45,0M) Survivors: 4096,0K->4096,0K Heap: 107,6M(128,0M)->63,8M(128,0M)]
>> [Times: user=0,07 sys=0,00 real=0,01 secs]
>>
>>
>>
>> I've split this logging up a bit over a few different tags to give the user more control over the output. This can be seen by running with -Xlog:gc*=debug. Here's what that looks like in my prototype today:
>>
>>
>> [0,782s][info   ][gc        ] GC#14 start [young, G1 Evacuation Pause]
>> [0,799s][debug  ][gc,phases ] GC#14    Parallel Time: 5,4 ms
>> [0,799s][debug  ][gc,phases ] GC#14       GC Worker Start:          Min: 782,6, Avg: 783,0, Max: 783,3, Diff:  0,7
>> [0,799s][debug  ][gc,phases ] GC#14       Ext Root Scanning:        Min:  0,0, Avg:  0,1, Max:  0,4, Diff:  0,4, Sum:  2,8
>> [0,799s][debug  ][gc,phases ] GC#14       Update RS:                Min:  0,0, Avg:  0,3, Max:  2,6, Diff:  2,6, Sum:  7,0
>> [0,799s][debug  ][gc,phases ] GC#14          Processed Buffers:        Min: 0, Avg:  1,1, Max: 3, Diff: 3, Sum: 25
>> [0,799s][debug  ][gc,phases ] GC#14       Scan RS:                  Min:  0,0, Avg:  0,0, Max:  0,0, Diff:  0,0, Sum:  0,1
>> [0,799s][debug  ][gc,phases ] GC#14       Code Root Scanning:       Min:  0,0, Avg:  0,0, Max:  0,0, Diff:  0,0, Sum: 0,0
>> [0,799s][debug  ][gc,phases ] GC#14       Object Copy:              Min:  1,1, Avg:  3,3, Max:  3,6, Diff:  2,5, Sum: 75,7
>> [0,799s][debug  ][gc,phases ] GC#14 Termination:              Min:  0,0, Avg:  0,1, Max:  0,2, Diff: 0,2, Sum:  2,9
>> [0,799s][debug  ][gc,phases ] GC#14          Termination Attempts:     Min: 1, Avg:  1,0, Max: 1, Diff: 0, Sum: 23
>> [0,799s][debug  ][gc,phases ] GC#14       GC Worker Other:          Min:  0,0, Avg:  0,1, Max:  0,4, Diff:  0,4, Sum: 3,1
>> [0,799s][debug  ][gc,phases ] GC#14       GC Worker Total:          Min:  3,6, Avg:  4,0, Max:  4,3, Diff:  0,7, Sum: 91,7
>> [0,799s][debug  ][gc,phases ] GC#14       GC Worker End:            Min: 786,9, Avg: 787,0, Max: 787,2, Diff:  0,4
>> [0,799s][debug  ][gc,phases ] GC#14    Code Root Fixup: 0,0 ms
>> [0,799s][debug  ][gc,phases ] GC#14    Code Root Purge: 0,0 ms
>> [0,799s][debug  ][gc,phases ] GC#14    Clear CT: 2,7 ms
>> [0,799s][debug  ][gc,phases ] GC#14    Other: 8,5 ms
>> [0,800s][debug  ][gc,phases ] GC#14       Choose CSet: 0,0 ms
>> [0,800s][debug  ][gc,phases ] GC#14       Ref Proc: 2,4 ms
>> [0,800s][debug  ][gc,phases ] GC#14       Ref Enq: 0,0 ms
>> [0,800s][debug  ][gc,phases ] GC#14       Redirty Cards: 2,0 ms
>> [0,800s][debug  ][gc,phases ] GC#14       Humongous Register: 0,0 ms
>> [0,800s][debug  ][gc,phases ] GC#14       Humongous Reclaim: 0,0 ms
>> [0,800s][debug  ][gc,phases ] GC#14       Free CSet: 3,5 ms
>> [0,800s][debug  ][gc,heap   ] GC#14 Eden:     Used before:   17,0M Used after:    0,0B Capacity:   17,0M
>> [0,800s][debug  ][gc,heap   ] GC#14 Survivor: Used before: 3072,0K Used after: 3072,0K Capacity: 3072,0K
>> [0,800s][debug  ][gc,heap   ] GC#14 Heap:     Used before: 110,7M Used after:   94,3M Capacity:  128,0M
>> [0,800s][info   ][gc,summary] GC#14 [young, G1 Evacuation Pause] [110,7M->94,3M(128,0M)] [16,562 ms]
>> [0,800s][info   ][gc        ] GC#14 end [110,7M->94,3M(128,0M)] [16,562 ms]
>> [0,800s][debug  ][gc        ] GC#14 [Times: user=0,10 sys=0,00 real=0,02 secs]
> Comments on the tags, would it be better to use gc,g1,…  I’m thinking that as you move PS you’d end up with gc,ps,… or gc,parnew,… and so on.

Since we only every have one GC active at any give time I don't really 
see the need for GC specific tags such as g1 or parnew.

>
>>
>>
>> For example, if you are not interested in the timing of the different phases, but still want to know the eden/survivor usage you can run with -Xlog:gc,gc+heap=debug and you get this information for each GC:
>>
>> [0,771s][info   ][gc     ] GC#14 start [young, G1 Evacuation Pause]
>> [0,788s][debug  ][gc,heap] GC#14 Eden:     Used before: 15,0M Used after:    0,0B Capacity:   14,0M
>> [0,788s][debug  ][gc,heap] GC#14 Survivor: Used before: 3072,0K Used after: 3072,0K Capacity: 3072,0K
>> [0,788s][debug  ][gc,heap] GC#14 Heap:     Used before: 111,2M Used after:   97,1M Capacity:  128,0M
>> [0,788s][info   ][gc     ] GC#14 end [111,2M->97,1M(128,0M)] [16,651 ms]
>>
>>
>>
>> My intent is to remove all Print*GC* and Trace*GC* flags that we currently have and replace them with tags for the unified logging framework. As shown with the PrintGCDetails example above it is not sure that there will be a simple mapping from one flag to one tag. But all information that is currently logged will be logged in the new implementation too. Since the decorations include the tags it is pretty easy to find out what tags interesting information is logged on and configure your logging to show that information.
> So the concern here is what ends up at debug at what ends up at other levels. And this is where I really wish levels had been left out of UL spec. I would comment that the later log should be informational and not debug.

Yes, I saw the discussions in the UL spec email thread.

>
>> I'm currently thinking that we should probably keep the PrintGC and PrintGCDetails flags since they are very widely used. We could deprecate them and just have them map to some -Xlog configuration. My suggestion would be to map PrintGC to -Xlog:gc and PrintGCDetails to -Xlog:gc*=debug.
> sensible approach if you don’t want to make a radical change in how logs are formatted.

Good.

Thanks,
Bengt

>
> Regards,
> Kirk
>




More information about the hotspot-gc-dev mailing list