Shenandoah WB and tableswitch
Roland Westrelin
rwestrel at redhat.com
Thu Dec 21 12:46:38 UTC 2017
> http://icedtea.classpath.org/hg/gc-bench/file/d04b4bbbc39f/src/main/java/org/openjdk/gcbench/wip/WriteBarrierTableSwitch.java
Both issues seem to boil down to the lack of profiling for tableswitch
in C2:
- the strip mining issue is caused by instructions being scheduled in
the inner loop when they should be in the outer loop. C2 gives all
tableswitch branches the same frequency and so sees exiting the loop
as quite common, which in turn means putting stuff at the end of the
inner loop looks cheaper than putting it in the outer loop.
- the write barrier is not hoisted because it depends on a null check
that is itself not hoisted. The null check is not hoisted because it's
on a branch that C2 sees as not always taken. UseProfiledLoopPredicate
is supposed to help with that by moving predicates on common branches
out of the loop, except that it needs profiling, which it doesn't have
for tableswitch.
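
To make the loop shape concrete, here is a minimal sketch of the kind of
code I mean. It is not the actual WriteBarrierTableSwitch benchmark; the
class, the Holder type and the case layout are all made up for
illustration. The dense case labels 0..3 would typically be compiled by
javac to a tableswitch, one case exits the loop, and the other cases
store to a field of a loop-invariant object, which is where the null
check and the write barrier would sit.

```java
public class TableSwitchLoopSketch {
    // Hypothetical holder object (not from the benchmark). Stores to its
    // field are where a null check and a write barrier would be needed.
    static final class Holder {
        int value;
    }

    // Hypothetical loop: each element of ops selects a switch case.
    // Returns the index at which the loop stopped.
    static int run(int[] ops, Holder holder) {
        int i = 0;
        for (; i < ops.length; i++) {
            switch (ops[i]) {
                case 0:
                    holder.value += 1; // common branch in this sketch
                    break;
                case 1:
                    holder.value += 2;
                    break;
                case 2:
                    holder.value += 3;
                    break;
                case 3:
                    return i;          // rare branch: exits the loop early
                default:
                    break;
            }
        }
        return i;
    }

    public static void main(String[] args) {
        int[] ops = new int[1000];     // all zeros: only case 0 is ever taken
        Holder holder = new Holder();
        int stoppedAt = run(ops, holder);
        System.out.println(stoppedAt + " " + holder.value);
    }
}
```

With the input above only case 0 ever runs, but without per-case
profiling C2 would treat the loop exit in case 3 as being as likely as
the field store in case 0.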
Making C2 use tableswitch profiling leads to:
WriteBarrierTableSwitch.common 1000 avgt 15 1041.808 ± 17.481 ns/op
WriteBarrierTableSwitch.separate 1000 avgt 15 1104.196 ± 0.726 ns/op
with:
-XX:-TieredCompilation -XX:+UseShenandoahGC
-XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=passive
-XX:+ShenandoahWriteBarrier -XX:+UseCountedLoopSafepoints
-XX:LoopStripMiningIter=1000
For comparison, current Shenandoah tip with parallel GC on the same
machine:
WriteBarrierTableSwitch.common 1000 avgt 15 1067.409 ± 1.154 ns/op
WriteBarrierTableSwitch.separate 1000 avgt 15 1595.011 ± 1.448 ns/op
I'll work on cleaning up the patch so you can give it a try.
Roland.