RFR(M): 8200303: C2 should leverage profiling for lookupswitch/tableswitch

Roland Westrelin rwestrel at redhat.com
Tue Mar 27 14:35:16 UTC 2018


http://cr.openjdk.java.net/~roland/8200303/webrev.00/

Over in the Shenandoah project, Aleksey found that:

1) a counted loop with a switch:

for (...) {
  switch (..) {
    ...
    default:
      throw SomeException();
  }
}

where some cases break out of the loop would not perform as well when
loop strip mining is enabled, even if the cases that exit the loop are
never taken in practice.

Because C2 gives all branches out of a JumpNode the same probability,
exiting the loop has a non null probability and GCM computes (wrongly)
that scheduling the loop strip mining book keeping logic in the loop is
cheaper than out of the loop.

2) Shenandoah write barriers in some of the cases should be hoisted but
are not because C2 can't tell that only a single case of the switch is
ever hit.

http://mail.openjdk.java.net/pipermail/shenandoah-dev/2017-December/004535.html

In the Shenandoah repo, we have a change that makes C2 leverage
profiling for switch. Experiments showed that 1) and 2) above are fixed
and that some common benchmarks run with parallel gc benefit as well
(~+7% on Serial):

http://mail.openjdk.java.net/pipermail/shenandoah-dev/2018-February/004886.html

The patch I'm proposing here is based on the patch we've been using for
a couple months in Shenandoah:

- it fixes profile collection in c1 for lookupswitch/tableswitch

- it sets profiling information on IfNodes and JumpNodes emitted from
  lookupswitch/tableswitch and propagate it after matching so GCM can
  take advantage of it

- it takes advantage of profiling to find never taken cases and trim
  down the cases (or ranges as they're called in the code). A never
  taken range can now cause an uncommon trap.

and also has some improvements:

- if some ranges are a lot more common than others, it might pay off to
  check for them one after the other before going to the binary
  search. The patch has some logic to evaluate the number of steps in
  the binary search and determine whether checking for the most common
  case upfront would pay off (from profile data)

- the binary search doesn't always keep the tree balanced but instead
  picks a mid point that split frequencies in half

Roland.


More information about the hotspot-compiler-dev mailing list