RFR(XL): 8203469: Faster safepoints

Aleksey Shipilev shade at redhat.com
Tue Jan 15 20:34:08 UTC 2019


On 1/15/19 12:33 PM, Robbin Ehn wrote:
> On 2019-01-15 11:48, Andrew Haley wrote:
>> On 1/15/19 10:39 AM, Robbin Ehn wrote:
>>> - On a 16 strand machine synchronization and un-synchronization/starting is at
>>>     least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>>>     starting ~400->~100us.
>>
>> Thanks. Could you share that benchmark? It'd be interesting to try it on
>> other architectures.
>>
> 
> Those numbers are calculate from linux perf data, I couldn't remember what I run.
> But apparently it is also specJVM2008 serial with benchmark thread 16.
> The reason for serial that is that Aleksey tested a couple of previews with
> Shenandoah and indicated that serial was a good test.
> (Aleksey, if you have time please test again)

Hey. I indeed remember testing the early version of the patch, discovered some Serial throughput
hit, and Robbin had fixed that then. Unfortunately, SPECjvm does not run cleanly with current
jdk/jdk. We have Serial-like workload driven by JMH. Re-running it with the current jdk/jdk first to
get the bearings on possible throughput loss (i7-7820X, 16 threads, -Xms4g -Xmx4g):

 ZGC, baseline:               19779 ± 79 ops/s
 ZGC, fast safepoints:        19858 ± 89 ops/s ; no loss!
 Shenandoah, baseline:        19929 ± 57 ops/s
 Shenandoah, fast safepoints: 19932 ± 49 ops/s ; no loss!

Excellent, both collectors indicate no throughput loss.

Now, there are two ways to measure the safepointing overhead. First, Shenandoah measures both "net"
pause time (VM operation time, basically), and "gross" pause time (which includes both entering and
leaving the safepoint, as well as VM operation time). With longer Serial run to get more GC cycles
going:

 # Shenandoah, baseline
 [info][gc,stats] Total Pauses (G) = 0.47 s (a = 1139 us) (n = 416)
                                            (lvls, us = 203, 381, 426, 2324, 4705)
 [info][gc,stats] Total Pauses (N) = 0.09 s (a =  216 us) (n = 416)
                                            (lvls, us =  27, 182, 203,  223, 1258)

 # Shenandoah, fast safepoints
 [info][gc,stats] Total Pauses (G) = 0.21 s (a = 500 us) (n = 414)
                                            (lvls, us = 191, 398, 418,  443, 8992)
 [info][gc,stats] Total Pauses (N) = 0.10 s (a = 229 us) (n = 414)
                                            (lvls, us =  26, 184, 217,  244,  2327)

Ta-da! Gross pause times dropped twice. Which means that before we spent ~3/4 ms in safepoint infra,
and now we only spend ~1/4 ms.

The second way is to parse safepoint logs. JMH has -prof safepoints that does it automatically. (It
also prints percentiles and event counts, omitted here for brevity):

Benchmark                               Mode   Cnt       Score    Error  Units

# ZGC, baseline
Serial.test                            thrpt    10   19710.077 ± 96.073  ops/s
Serial.test:·safepoints.pause          thrpt  1260    1310.022              ms
Serial.test:·safepoints.pause.avg      thrpt             1.040              ms
Serial.test:·safepoints.ttsp           thrpt  1260    1053.717              ms
Serial.test:·safepoints.ttsp.avg       thrpt             0.836              ms

# ZGC, fast safepoints
Serial.test                            thrpt    10   19904.687 ± 39.679  ops/s
Serial.test:·safepoints.pause          thrpt  1275     291.520              ms
Serial.test:·safepoints.pause.avg      thrpt             0.229              ms
Serial.test:·safepoints.ttsp           thrpt  1275      92.065              ms
Serial.test:·safepoints.ttsp.avg       thrpt             0.072              ms ; 11.6x lower!

# Shenandoah, baseline
Serial.test                            thrpt   10   19874.508 ± 47.175  ops/s
Serial.test:·safepoints.pause          thrpt  389     425.723              ms
Serial.test:·safepoints.pause.avg      thrpt            1.094              ms
Serial.test:·safepoints.ttsp           thrpt  389     312.413              ms
Serial.test:·safepoints.ttsp.avg       thrpt            0.803              ms

# Shenandoah, fast safepoints
Serial.test                            thrpt   10   19918.094 ± 48.058  ops/s
Serial.test:·safepoints.pause          thrpt  387     159.440              ms
Serial.test:·safepoints.pause.avg      thrpt            0.412              ms
Serial.test:·safepoints.ttsp           thrpt  387      41.941              ms
Serial.test:·safepoints.ttsp.avg       thrpt            0.108              ms ; 7.4x lower!

So, TTSP (time to safepoint) had improved majestically. The rest of "gross" pause time measured by
Shenandoah is quite probably the safepoint leaving latency.

Also did a spot build checks: the patch builds in fastdebug on aarch64, s390x, x86_32.

-Aleksey



More information about the hotspot-dev mailing list