Switch translation, part 2
Brian Goetz
brian.goetz at oracle.com
Mon Dec 11 21:37:57 UTC 2017
# Switch Translation, Part 2 -- type test patterns and guards
#### Maurizio Cimadamore and Brian Goetz
#### December 2017
This document examines possible translation of `switch` constructs
involving `case` labels that include type-test patterns, potentially
with guards. Part 3 will address translation of destructuring patterns,
nested patterns, and OR patterns.
## Type-test patterns
Type-test patterns are notable because their applicability predicate is
purely based on the type system, meaning that the compiler can directly
reason about it both statically (using flow analysis, optimizing away
dynamic type tests) and dynamically (with `instanceof`.) A switch
involving type-tests:
switch (x) {
case String s: ...
case Integer i: ...
case Long l: ...
}
can (among other strategies) be translated into a chain of `if-else`
using `instanceof` and casts:
if (x instanceof String) { String s = (String) x; ... }
else if (x instanceof Integer) { Integer i = (Integer) x; ... }
else if (x instanceof Long) { Long l = (Long) x; ... }
#### Guards
The `if-else` desugaring can also naturally handle guards:
switch (x) {
case String s
where (s.length() > 0): ...
case Integer i
where (i > 0): ...
case Long l
where (l > 0L): ...
}
can be translated to:
if (x instanceof String
&& ((String) x).length() > 0) { String s = (String) x; ... }
else if (x instanceof Integer
&& ((Integer) x) > 0) { Integer i = (Integer) x; ... }
else if (x instanceof Long
&& ((Long) x) > 0L) { Long l = (Long) x; ... }
#### Performance concerns
The translation to `if-else` chains is simple (for switches without
fallthrough), but is harder for the VM to optimize, because we've used a
more general control flow mechanism. If the target is an empty
`String`, which means we'd pass the first `instanceof` but fail the
guard, class-hierarchy analysis could tell us that it can't possibly be
an `Integer` or a `Long`, and so there's no need to perform those tests.
But generating code that takes advantage of this information is more
complex.
In the extreme case, where a switch consists entirely of type test
patterns for final classes, this could be performed as an O(1) operation
by hashing. And this is a common case involving switches over
alternatives in a sum (sealed) type. (We probably shouldn't rely on
finality at compile time, as this can change between compile and run
time, but we would like to take advantage of this at run time if we can.)
Finally, the straightforward static translation may miss opportunities
for optimization. For example:
switch (x) {
case Point p
where p.x > 0 && p.y > 0: A
case Point p
where p.x > 0 && p.y == 0: B
}
Here, not only would we potentially test the target twice to see if it
is a `Point`, but we then further extract the `x` component twice and
perform the `p.x > 0` test twice.
#### Optimization opportunities
The compiler can eliminate some redundant calculations through
straightforward techniques. The previous switch can be transformed to:
switch (x) {
case Point p:
if (((Point) p).x > 0 && ((Point) p).y > 0) { A }
else if (((Point) p).x > 0 && ((Point) p).y > 0) { B }
to eliminate the redundant `instanceof` (and could be further
transformed to eliminate the downstream redundant computations.)
#### Clause reordering
The above example was easy to transform because the two `case Point`
clauses were adjacent. But what if they are not? In some cases, it is
safe to reorder them. For types `T` and `U`, it is safe to reorder
`case T` and `case U` if the two types have no intersection; that there
can be no types that are subtypes of them both. This is true when `T`
and `U` are classes and neither extends the other, or when one is a
final class and the other is an interface that the class does not
implement.
The compiler could then reorder case clauses so that all the ones whose
first test is `case Point` are adjacent, and then coalesce them all into
a single arm of the `if-else` chain.
A possible spoiler here is fallthrough; if case A falls into case B,
then cases A and B have to be moved as a group. (This is another reason
to consider limiting fallthrough.)
#### Summary of if-else translation
While the if-else translation at first looks pretty bad, we are able to
extract a fair amount of redundancy through well-understood compiler
transformations. If an N-way switch has only M distinct types in it, in
most cases we can reduce the cost from _O(N)_ to _O(M)_. Sometimes _M
== N_, so this doesn't help, but sometimes _M << N_ (and sometimes `N`
is small, in which case _O(N)_ is fine.)
Reordering clauses involves some risk; specifically, that the class
hierarchy will change between compile and run time. It seems eminently
safe to reorder `String` and `Integer`, but more questionable to reorder
an arbitrary class `Foo` with `Runnable`, even if `Foo` doesn't
implement `Runnable` now, because it might easily be changed to do so
later. Ideally we'd like to perform class-hierarchy optimizations using
the runtime hierarchy, not the compile-time hierarchy.
## Type classifiers
The technique outlined in _Part 1_, where we lower the complex switch to
a dense `int` switch, and use an indy-based classifier to select an
index, is applicable here as well. First let's consider a switch
consisting only of unguarded type-test patterns (and optionally a
default clause.)
We'll start with an `indy` bootstrap whose static argument are `Class`
constants corresponding to each arm of the switch, whose dynamic
argument is the switch target, and whose return value is a case number
(or distinguished sentinels for "no match" and `null`.) We can easily
implement such a bootstrap with a linear search, but can also do better;
if some subset of the classes are `final`, we can choose between these
more quickly (such as via binary search on `hashCode()`, hash function,
or hash table), and we need perform only a single operation to test all
of those at once. Dynamic techniques (such as a building a hash map of
previously seen target types), which `indy` is well-suited to, can
asymptotically approach _O(1)_ even when the classes involved are not
final.
So we can lower:
switch (x) {
case T t: A
case U u: B
case V v: C
}
to
int y = indy[bootstrap=typeSwitch(T.class, U.class, V.class)](x)
switch (y) {
case 0: A
case 1: B
case 2: C
}
This has the advantages that the generated code is very similar to the
source code, we can (in some cases) get _O(1)_ dispatch performance, and
we can handle fallthrough with no additional complexity.
#### Guards
There are two approaches we could take to add support for guards into
the process; we could try to teach the bootstrap about guards (and would
have to pass locals that appear in guard expressions as additional
arguments to the classifier), or we could leave guards to the generated
bytecode. The latter seems far more attractive, but requires some
tweaks to the bootstrap arguments and to the shape of the generated code.
If the classifier says "you have matched case #3", but then we fail the
guard for #3, we want to go back into the classifier and start again at
#4. Additionally, we'd like for the classifier to use this information
("start over at #4") to optimize away unnecessary tests.
We add a second argument (where to start) to the classifier invocation
signature, and wrap the switch in a loop, lowering:
switch (x) {
case T t where (e1): A
case T t where (e2): B
case U u where (e3): C
}
into
int y = -1; // start at the top
while (true) {
y = indy[...](x, y)
switch (y) {
case 0: if (!e1) continue; A
case 1: if (!e2) continue; B
case 2: if (!e3) continue; C
}
break;
}
For cases where the same type test is repeated in consecutive positions
(at N and N+1), we can have the static compiler coalesce them as above,
or we could have the bootstrap maintain a table so that if you re-enter
the bootstrap where the previous answer was N, then it can immediately
return N+1. Similarly, if N and N+1 are known to be mutually exclusive
types (like `String` and `Integer`), on reentering the classifier with
N, we can skip right to N+2 since if we matched `String`, we cannot
match `Integer`. Lookup tables for such optimizations can be built at
link time.
#### Mixing constants and type tests
This approach also extends to tests that are a mix of constant patterns
and type-test patterns, such as:
switch (x) {
case "Foo": ...
case 0L: ...
case Integer i:
}
We can extend the bootstrap protocol to accept constants as well as
types, and it is a straightforward optimization to combine both type
matching and constant matching in a single pass.
More information about the amber-spec-observers
mailing list