RFR (S): Percentile levels in -Xlog:gc+stats
Aleksey Shipilev
shade at redhat.com
Fri Jan 6 15:01:14 UTC 2017
Hi,
The non-normality in phase times makes average times in our gc+stats log
confusing. For example, can you trust this line?
Concurrent Marking Times = 18.18 s (avg = 142.02 ms) (num = 128, ...
You can't, because there were two very different phases in the workload's
lifetime: the initial burst of short concmarks while the app is initializing,
and then the steady-state concmarks on a stable LDS. To identify these cases in
the stats, we are better off reporting the n-quantile levels, which give an
immediate "feel" for the distribution we are looking at.
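To illustrate the idea (this is not the code from the webrev), here is a minimal sketch of computing levels at a 10% step from a set of recorded durations. The nearest-rank method, the function name decile_levels, and the sample values are all assumptions made for the example; the patch may compute its levels differently.

```python
def decile_levels(samples_us):
    """Return the 10%, 20%, ..., 100% levels of the samples.

    Uses the nearest-rank method (an assumption, not taken from the patch):
    the k-th level is the smallest sample that has at least k*10% of all
    samples at or below it. The last level is therefore the maximum.
    """
    if not samples_us:
        return []
    ordered = sorted(samples_us)
    n = len(ordered)
    levels = []
    for step in range(1, 11):
        # Integer ceiling of step * n / 10, so no floating-point rounding.
        rank = (step * n + 9) // 10
        levels.append(ordered[rank - 1])
    return levels


# Hypothetical durations in microseconds: many short phases, a few long ones.
samples = [800] * 90 + [2_500_000] * 10
avg = sum(samples) / len(samples)
levels = decile_levels(samples)
print(f"avg = {avg:.0f} us, lvls (10% step, us) = {levels}")
```

With made-up data like this, the average lands far from every level except the last one, which is the kind of mismatch the levels are meant to expose.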
Webrev:
http://cr.openjdk.java.net/~shade/shenandoah/stats-percentiles/webrev.01/
This is a full line in the patched version:
Concurrent Marking Times = 18.18 s (avg = 142018 us)
(num = 128, lvls (10% step, us) =
787, 858, 960, 2660, 4440, 4830, 5830, 7880, 9600, 2533512)
Notice the skew in the levels: nine of the ten are below 10 ms, while the last
one exceeds 2.5 s.
This line is more trustworthy:
Concurrent Marking Times = 15.16 s (avg = 63693 us)
(num = 238, lvls (10% step, us) =
291, 524, 615, 772, 1000, 1600, 186000, 197000, 199000, 228671)
And this looks very solid:
Concurrent Marking Times = 1.80 s (avg = 179735 us)
(num = 10, lvls (10% step, us) =
174000, 176000, 176000, 176000, 177000, 180000, 180000, 181000, ...
Switching from milliseconds to microseconds gives more fidelity for sub-ms
pause times.
Testing: hotspot_gc_shenandoah, selected benchmarks
Thanks,
-Aleksey