RFE: -Xlog:gc to show pause/concurrent phases duration and heap occupancy

Per Liden per.liden at oracle.com
Tue Dec 12 09:22:55 UTC 2017


Hi Aleksey,

The logging code in ZGC is in the middle of a transitions from the "old" 
style to a "new" style. The transitions isn't complete yet, which is why 
it might not appear optimal at this time.

In the "old" style, we had pretty much what you propose, which is also 
sort of similar to what G1 and CMS is doing. However, we believe we have 
is a great opportunity to re-think and improve things, which is why we 
started this transition. Once completed, I think it will serve both 
users and GC devs better.

In the "new" style, we're aiming for the following:

* -Xlog:gc - Enables basic logging. Basically one single line per GC, 
non-verbose, giving you only the most important information. This is 
something you would typically enable when you just want to keep an eye 
on what the GC does, to get a feeling for the heap size, pause times, 
etc. It's not intended for anyone who wants to tune or dig deeper. 
Rather, something you might use once all tuning is completed, but you 
still want some information. A user that isn't very familiar with (or 
care) how ZGC works internally should still find this information 
useful. The line we print today is missing some time-related 
information. This line will in the future look something like this:

GC(0) Garbage Collection (Warmup) 1646M(10%)->596M(4%), 1.0ms/1.1ms/1.2ms

Where the last three numbers represent the pause times (mark start, mark 
end, reloc start) during that GC cycle. We might also want to add some 
notion of the total GC cycle time here.


* -Xlog:gc* - Enable verbose logging. This is the go to logging option 
when you want to tune, debug or dig deeper. Obviously much more verbose, 
making use of appropriate log tags so that the output can be 
refined/filtered as needed. Interpreting information here typically 
requires some level of understanding of ZGC. Many of the log tag 
combinations have additional information on the debug/trace level, 
normally only useful for GC devs debugging the ZGC itself. The output 
you see today in this mode is also in the middle of our transition, and 
hence incomplete. We plan to move much of the information currently 
printed in the gc+stats table in under more specific tags and let them 
be printed once per GC.


While there's some value in having different GCs do logging in a similar 
way, I see an even greater value in providing really good and useful 
logging. In both ZGC and Shenandoah we have the rare opportunity to make 
things better, without having to deal with old baggage or old ways of 
doing things, which we're trying to embrace.


cheers,
Per

On 2017-12-11 17:52, Aleksey Shipilev wrote:
> Current -Xlog:gc log with ZGC is rather opaque:
>
> [19.964s][info][gc] GC(0) Garbage Collection (Warmup) 1646M(10%)->596M(4%)
> [31.690s][info][gc] GC(1) Garbage Collection (Warmup) 3304M(20%)->384M(2%)
> [38.666s][info][gc] GC(2) Garbage Collection (Warmup) 5004M(31%)->1428M(9%)
> [46.724s][info][gc] GC(3) Garbage Collection (Allocation Rate) 9198M(56%)->8132M(50%)
> [49.010s][info][gc] GC(4) Garbage Collection (Allocation Rate) 8596M(52%)->9786M(60%)
> [51.373s][info][gc] GC(5) Garbage Collection (Allocation Rate) 10348M(63%)->9800M(60%)
> [53.698s][info][gc] GC(6) Garbage Collection (Allocation Rate) 10002M(61%)->10724M(65%)
> [55.434s][info][gc] GC(7) Garbage Collection (Allocation Rate) 10778M(66%)->7956M(49%)
> [57.446s][info][gc] GC(8) Garbage Collection (Allocation Rate) 8420M(51%)->10896M(67%)
>
> The real pause work hides under -Xlog:gc+phases:
>
> [197.362s][info][gc,phases] GC(108) Pause Mark Start 0.697ms
> [198.119s][info][gc,phases] GC(108) Concurrent Mark 757.226ms
> [198.135s][info][gc,phases] GC(108) Pause Mark End 0.802ms
> [198.136s][info][gc,phases] GC(108) Concurrent References Processing 1.461ms
> [198.149s][info][gc,phases] GC(108) Concurrent Reset Relocation Set 12.285ms
> [198.154s][info][gc,phases] GC(108) Concurrent Destroy Detached Pages 0.001ms
> [198.158s][info][gc,phases] GC(108) Concurrent Select Relocation Set 3.239ms
> [198.332s][info][gc,phases] GC(108) Concurrent Prepare Relocation Set 173.931ms
> [198.333s][info][gc,phases] GC(108) Pause Relocate Start 1.010ms
> [198.819s][info][gc,phases] GC(108) Concurrent Relocate 485.961ms
>
> Can/should we redo this to match what e.g. Shenandoah is doing: phases to be at "gc" tag, and
> concurrent phases to report how heap had changed while phases were running? This should help new
> users to understand what collector is doing better, and also allow better comparison against other
> OpenJDK collectors, that report heap occupancy and pauses under "gc" tag.
>
> So ZGC output would look like:
>
> [info][gc] GC(108) Pause Mark Start 0.697ms
> [info][gc] GC(108) Concurrent Mark xxxxM->xxxxM 757.226ms
> [info][gc] GC(108) Pause Mark End 0.802ms
> [info][gc] GC(108) Concurrent References Processing xxxxM->xxxxM 1.461ms
> [info][gc] GC(108) Concurrent Prepare Relocate xxxxM->xxxxM 12.285ms
> [info][gc] GC(108) Pause Relocate Start 1.010ms
> [info][gc] GC(108) Concurrent Relocate xxxxM->xxxxM 485.961ms
>
> "Prepare Relocate" combines all Relocation Set activities -- those can be under gc+phases if anyone
> wants it.
>
> Thanks,
> -Aleksey
>


More information about the zgc-dev mailing list