Shenandoah/Parallel and UTF8/serialization
Aleksey Shipilev
shade at redhat.com
Thu Feb 1 15:47:16 UTC 2018
The recently committed -XX:+UseSwitchProfiling was inspired by our investigations into Serial performance:
http://mail.openjdk.java.net/pipermail/shenandoah-dev/2018-February/004865.html
http://mail.openjdk.java.net/pipermail/shenandoah-dev/2017-December/004555.html
It does seem to improve Serial with sh/jdk10, on both Shenandoah and Parallel!
Parallel:
-XX:-UseSwitchProfiling: 22443.827 ± 88.328 ops/s
-XX:+UseSwitchProfiling: 23985.948 ± 147.074 ops/s
improv: +6.9%
Shenandoah:
-XX:-UseSwitchProfiling: 19675.852 ± 181.698 ops/s
-XX:+UseSwitchProfiling: 20539.850 ± 173.441 ops/s
improv: +4.4%
Note that Shenandoah improved a bit less, which means it is now slightly worse relative to the Parallel
baseline: about 14.4% behind, up from about 12.3%. Or, looking at it more optimistically, the generic
optimization improves the baseline a lot!
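For reference, the percentages above follow directly from the raw ops/s scores. A minimal sketch of that arithmetic (the class and method names are made up for illustration, they are not part of any benchmark):

```java
public class SwitchProfilingGain {
    // Relative throughput change in percent, for ops/s scores (higher is better).
    static double gainPercent(double before, double after) {
        return (after - before) / before * 100.0;
    }

    // How far a collector trails the baseline, in percent of the baseline throughput.
    static double gapPercent(double baseline, double other) {
        return (baseline - other) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Scores copied from the Serial runs quoted in this mail.
        double parallelOff = 22443.827, parallelOn = 23985.948;
        double shenandoahOff = 19675.852, shenandoahOn = 20539.850;

        System.out.printf("Parallel gain:   %+.1f%%%n", gainPercent(parallelOff, parallelOn));
        System.out.printf("Shenandoah gain: %+.1f%%%n", gainPercent(shenandoahOff, shenandoahOn));
        System.out.printf("Gap, flag off:   %.1f%%%n", gapPercent(parallelOff, shenandoahOff));
        System.out.printf("Gap, flag on:    %.1f%%%n", gapPercent(parallelOn, shenandoahOn));
    }
}
```

The gap is computed against Parallel in the same flag configuration, so it only says how the two collectors compare to each other, not how much either one gained.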
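Our most targeted benchmark that captures the need for switch profiling seems to be mined out: with the
flag enabled, Shenandoah and Parallel land at nearly the same score (about 485 to 490 ns/op).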
http://icedtea.classpath.org/hg/gc-bench/file/d04b4bbbc39f/src/main/java/org/openjdk/gcbench/wip/WriteBarrierTableSwitch.java
Benchmark                         (size)  Mode  Cnt     Score    Error  Units

# Shenandoah, -UseSwitchProfiling
WriteBarrierTableSwitch.common      1000  avgt    5   908.240 ±  4.614  ns/op
WriteBarrierTableSwitch.separate    1000  avgt    5  1217.513 ± 18.511  ns/op

# Parallel, -UseSwitchProfiling
WriteBarrierTableSwitch.common      1000  avgt    5   856.878 ± 35.638  ns/op
WriteBarrierTableSwitch.separate    1000  avgt    5  1143.032 ± 28.779  ns/op

# Shenandoah, +UseSwitchProfiling
WriteBarrierTableSwitch.common      1000  avgt    5   490.165 ±  0.555  ns/op
WriteBarrierTableSwitch.separate    1000  avgt    5   490.200 ±  1.023  ns/op

# Parallel, +UseSwitchProfiling
WriteBarrierTableSwitch.common      1000  avgt    5   489.912 ± 20.953  ns/op
WriteBarrierTableSwitch.separate    1000  avgt    5   484.872 ±  0.169  ns/op
The UTF8 benchmark improves a lot as well, but Shenandoah could still use some work to match Parallel
performance there (with the flag enabled it remains about 12% slower):
http://icedtea.classpath.org/hg/gc-bench/file/d04b4bbbc39f/src/main/java/org/openjdk/gcbench/wip/WriteBarrierUTF8Scan.java
Shenandoah:
-XX:-UseSwitchProfiling: 3963.357 ± 10.242 ns/op
-XX:+UseSwitchProfiling: 681.130 ± 11.334 ns/op
Parallel:
-XX:-UseSwitchProfiling: 3084.094 ± 0.606 ns/op
-XX:+UseSwitchProfiling: 610.148 ± 3.897 ns/op
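To illustrate why a UTF-8 scan is sensitive to switch profiling, here is a minimal sketch of a scan loop that dispatches on the high nibble of each leading byte. This is a hypothetical example, not the code of WriteBarrierUTF8Scan; the class and method names are invented, and it assumes well-formed standard UTF-8 input. For mostly-ASCII text nearly all iterations take the first group of cases, which is the kind of skew that per-case profile data would presumably let the compiler exploit.

```java
import java.nio.charset.StandardCharsets;

public class Utf8ScanSketch {
    // Counts the code points in a well-formed UTF-8 byte array by switching
    // on the high nibble of each leading byte to find the sequence length.
    static int countCodePoints(byte[] buf) {
        int count = 0;
        int i = 0;
        while (i < buf.length) {
            int c = buf[i] & 0xFF;
            switch (c >> 4) {
                case 0: case 1: case 2: case 3:
                case 4: case 5: case 6: case 7:
                    i += 1; // 0xxxxxxx: single byte (ASCII), the hot case for most text
                    break;
                case 12: case 13:
                    i += 2; // 110xxxxx: two-byte sequence
                    break;
                case 14:
                    i += 3; // 1110xxxx: three-byte sequence
                    break;
                case 15:
                    i += 4; // 11110xxx: four-byte sequence
                    break;
                default:
                    // 10xxxxxx is a continuation byte and cannot start a sequence
                    throw new IllegalArgumentException("malformed input at byte " + i);
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] text = "h\u00e9llo, w\u00f6rld \u20ac".getBytes(StandardCharsets.UTF_8);
        System.out.println(countCodePoints(text) + " code points in " + text.length + " bytes");
    }
}
```

The sketch skips validation of continuation bytes to keep the switch itself in focus; a real decoder would check them as well.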
Thanks,
-Aleksey