Shenandoah WB and tableswitch

Roland Westrelin rwestrel at redhat.com
Thu Dec 21 12:46:38 UTC 2017


> http://icedtea.classpath.org/hg/gc-bench/file/d04b4bbbc39f/src/main/java/org/openjdk/gcbench/wip/WriteBarrierTableSwitch.java

Both issues seem to boil down to the lack of profiling for tableswitch
in C2:

- the strip mining issue is caused by instructions being scheduled in
  the inner loop when they should be in the outer loop. C2 gives all
  tableswitch branches the same frequency, so it sees exiting the loop
  as quite common, which in turn means that putting stuff at the end of
  the inner loop looks cheaper than putting it in the outer loop.

- the write barrier is not hoisted because it depends on a null check
  that is itself not hoisted. The null check is not hoisted because it's
  on a branch that C2 sees as not always taken. UseProfiledLoopPredicate
  is supposed to help with that by moving predicates on common branches
  out of the loop, except that it needs profiling, which it doesn't have
  for tableswitch.
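
To make the shape of the problem concrete, here is a rough sketch of the
kind of loop involved. It is not the gc-bench source linked above: the
class, field and method names are made up for illustration, and it assumes
javac compiles the dense case labels 0..3 to a tableswitch, which it
normally does for a small contiguous range.

```java
// Hypothetical sketch, NOT the actual WriteBarrierTableSwitch benchmark.
class TableSwitchLoopSketch {
    static class Holder {
        int value; // field stored to inside the loop
    }

    // Dense labels 0..3: assumed to be compiled to a tableswitch bytecode.
    static int run(Holder holder, int[] selectors) {
        int rare = 0;
        for (int i = 0; i < selectors.length; i++) {
            switch (selectors[i]) {
                case 0:
                    // Common branch. The null check on 'holder' (and, with
                    // Shenandoah, the write barrier that depends on it) is
                    // loop invariant, but it sits on a switch branch.
                    holder.value++;
                    break;
                case 1:
                    rare += 1;
                    break;
                case 2:
                    rare += 2;
                    break;
                case 3:
                    // Loop exit. Without per-branch profiling this looks as
                    // likely as any other case.
                    return -1;
                default:
                    rare += 3;
            }
        }
        return rare;
    }

    public static void main(String[] args) {
        Holder holder = new Holder();
        int[] selectors = new int[1000]; // all zeros: only case 0 is taken
        int rare = run(holder, selectors);
        System.out.println(holder.value + " " + rare);
    }
}
```

With an input like the all-zero array in main, only case 0 ever runs, but
a compiler that assigns every tableswitch branch the same frequency has no
way to know that.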

Making C2 use tableswitch profiling leads to:

WriteBarrierTableSwitch.common      1000  avgt   15  1041.808 ± 17.481  ns/op
WriteBarrierTableSwitch.separate    1000  avgt   15  1104.196 ±  0.726  ns/op

with:

-XX:-TieredCompilation -XX:+UseShenandoahGC
 -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=passive
 -XX:+ShenandoahWriteBarrier -XX:+UseCountedLoopSafepoints
 -XX:LoopStripMiningIter=1000

For comparison, the current Shenandoah tip with parallel GC on the same
machine:

WriteBarrierTableSwitch.common      1000  avgt   15  1067.409 ± 1.154  ns/op
WriteBarrierTableSwitch.separate    1000  avgt   15  1595.011 ± 1.448  ns/op

I'll work on cleaning up the patch so you can give it a try.

Roland.

