RFR (S): Percentile levels in -Xlog:gc+stats
Roman Kennke
rkennke at redhat.com
Mon Jan 9 10:48:08 UTC 2017
Sounds good in general.
Maybe instead of multiplying by 1000, measure more precisely?
Can't we have both max/SD and percentile stats? Or does it not make
sense at all to show max/SD?
Roman
Am Freitag, den 06.01.2017, 16:01 +0100 schrieb Aleksey Shipilev:
> Hi,
>
> The non-normality in phase times makes the average times in our
> gc+stats log confusing. For example, can you trust this line?
>
> Concurrent Marking Times = 18.18 s (avg = 142.02 ms) (num = 128, ...
>
> You can't, because there were two very different phases in the
> workload lifetime: the initial burst of short concmarks while the app
> is initializing, and then the steady-state concmarks on stable LDS.
> To identify these cases in the stats, we are better off reporting the
> n-quantile levels to get an immediate "feel" for the distribution we
> are looking at.
>
> Webrev:
> http://cr.openjdk.java.net/~shade/shenandoah/stats-percentiles/webrev.01/
>
> This is a full line in patched version:
>
> Concurrent Marking Times = 18.18 s (avg = 142018 us)
> (num = 128, lvls (10% step, us) =
> 787, 858, 960, 2660, 4440, 4830, 5830, 7880, 9600, 2533512)
>
> Notice the distribution skew in levels.
>
> This is a line that is more trustworthy:
>
> Concurrent Marking Times = 15.16 s (avg = 63693 us)
> (num = 238, lvls (10% step, us) =
> 291, 524, 615, 772, 1000, 1600, 186000, 197000, 199000, 228671)
>
> And this looks very solid:
>
> Concurrent Marking Times = 1.80 s (avg = 179735 us)
> (num = 10, lvls (10% step, us) =
> 174000, 176000, 176000, 176000, 177000, 180000, 180000, 181000, ...
>
> Switching to microseconds instead of milliseconds gives more fidelity
> for sub-ms pause times.
>
> Testing: hotspot_gc_shenandoah, selected benchmarks
>
> Thanks,
> -Aleksey
>
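For illustration, the quantile-levels reporting described above can be sketched roughly as follows. This is a hypothetical standalone sketch, not the actual HotSpot patch: the class and method names are invented, and the nearest-rank percentile method is an assumption (the webrev may select levels differently). It also shows why the plain average is misleading on a bimodal sample like the init-burst vs. steady-state concmark case.

```java
import java.util.Arrays;

// Hypothetical sketch: report sample values at 10% quantile steps,
// similar in spirit to the proposed gc+stats line.
public class QuantileLevels {
    // Returns the sample values at the 10%, 20%, ..., 100% levels (us),
    // using the nearest-rank method (an assumption, not the patch's code).
    static long[] levels(long[] samplesUs) {
        long[] sorted = samplesUs.clone();
        Arrays.sort(sorted);
        long[] lvls = new long[10];
        for (int i = 1; i <= 10; i++) {
            // Integer ceil(i * n / 10) - 1: nearest-rank index for the
            // (i * 10)-th percentile.
            int idx = (i * sorted.length + 9) / 10 - 1;
            lvls[i - 1] = sorted[idx];
        }
        return lvls;
    }

    public static void main(String[] args) {
        // Bimodal sample: short init-time phases plus long steady-state ones.
        long[] samples = {800, 900, 950, 1000, 1100,
                          180000, 185000, 190000, 195000, 200000};
        long avg = Arrays.stream(samples).sum() / samples.length;
        // The average lands between the two modes and describes neither.
        System.out.println("avg = " + avg + " us");
        // The levels expose the skew immediately.
        System.out.println("lvls (10% step, us) = "
                + Arrays.toString(levels(samples)));
    }
}
```

Here the average comes out near 95 ms even though no individual phase took anywhere close to that, while the 10%-step levels make the two clusters obvious.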
More information about the shenandoah-dev mailing list