Knocking off two more vestiges of legacy switch
Brian Goetz
brian.goetz at oracle.com
Mon Sep 12 19:36:09 UTC 2022
The work on primitive patterns continues to yield fruit; it points us to
a principled way of dealing with _constant patterns_, both as nested
patterns, and to redefining constant case labels as simple patterns. It
also points us to a way to bring the missing three types into the realm
of switch (since now switch is usable at every type _but_ these): float,
double, and boolean. While I'm not in a hurry to prioritize this
immediately, I wanted to connect the dots to how primitive type patterns
lay the foundation for these two vestiges of legacy switch. (The
remaining vestige, not yet dealt with, is that legacy statement switches
are not exhaustive. We'd like a path back to uniformity there as well,
but this is likely a longer road.)
**Constant patterns.** In early explorations (e.g., "Pattern Matching
Semantics"), we struggled with the meaning of constant patterns,
specifically with conversions in the absence of a sharp type for the
match target. The exploration of that document treated boxing
conversions but not other conversions, which would have created a
gratuitously new conversion context. This was one of several reasons we
deferred constant patterns.
The current status is that constant case labels (e.g., `case 3`) are
permitted (a) only in the presence of a compatible operand type and (b)
are not patterns. This has led to some accidental complexity in
specifying switch, since we can have a mix of pattern and non-pattern
labels, and it means we can't use constants as nested patterns. (We've
also not yet integrated enum cases into the exhaustiveness analysis in
the presence of a sealed type that permits an enum type.) Retconning
all case labels as patterns seems attractive if we can make the
semantics clear, as not only does it bring more uniformity, but it means
we can use them as nested patterns, not just at the top level of the
switch. More composition.
The recent work on `instanceof` involving primitives offers a clear and
principled meaning to `0` as a pattern; given a constant `c` of type
`C`, treat
    x matches c
as meaning
    x matches C alpha && alpha eq c
where `eq` is a suitable comparison predicate for the type C (== for
integral types and enums, .equals() for String, and something irritating
for floating point). This gives us a solid basis for interpreting
something like `case 3L`; we match if the target would match `long
alpha` and `alpha == 3L`. No new rules; all conversions are handled
through the type pattern for the static type of the constant in
question. Not coincidentally, the rules for primitive type patterns
support the implicit conversions allowed in today's switches on `short`,
`byte`, and `char`, which are allowed to use `int` labels, preserving
the meaning of existing code while we generalize what switch means.
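To make the desugaring concrete, here is a hand-written emulation in today's Java (16 or later, for `instanceof` patterns). It is only a sketch: the class and method names are hypothetical, the reference-typed cases go through the box type because primitive type patterns are not available here, and the two float predicates merely show why choosing `eq` for floating point is irritating; which one the feature would use is left open above.

```java
// Emulates "x matches c" as "x matches C alpha && alpha eq c".
// All names are hypothetical; this is not how javac implements anything.
final class ConstantPatternSketch {

    // `case 3L` against a reference-typed target: test for the constant's
    // static type (long, reached here via its box Long), then compare with ==.
    static boolean matchesLong(Object x, long c) {
        return x instanceof Long alpha && alpha == c;
    }

    // `case "hello"`: String type test, then .equals().
    static boolean matchesString(Object x, String c) {
        return x instanceof String alpha && alpha.equals(c);
    }

    // An int constant against a byte target: byte-to-int widening always
    // succeeds, so only the equality test remains. This mirrors today's
    // switches on byte with int labels.
    static boolean matchesInt(byte x, int c) {
        int alpha = x;
        return alpha == c;
    }

    // Two candidate `eq` predicates for float. They disagree on
    // 0.0f vs -0.0f and on NaN vs NaN.
    static boolean floatEqByOperator(float a, float b) {
        return a == b;
    }

    static boolean floatEqByCompare(float a, float b) {
        return Float.compare(a, b) == 0;
    }
}
```

Note that an `Integer` target does not match the `long` constant in this emulation, because the type test for `Long` fails before the comparison is ever attempted.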
The other attributes of patterns -- applicability, exhaustiveness, and
dominance -- are also easy:
- a constant pattern for `c : C` is applicable to S if a type pattern
for `C` is applicable to S.
- a type pattern for T dominates a constant pattern for `c : C` if the
type pattern for T dominates a type pattern for C.
- constant patterns are never exhaustive.
No new rules; just appeal to type patterns.
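As a small illustration of the ordering these rules imply, the following compiles on Java 21 or later, where a constant label can already share a switch with type patterns on a boxed selector. The class and method names are hypothetical. The constant label has to come before the type patterns that would otherwise dominate it; under the reading above, that requirement would simply be type-pattern dominance.

```java
// Hypothetical example; requires Java 21 or later.
final class DominanceSketch {
    static String describe(Integer x) {
        return switch (x) {
            // Constant label first: it is applicable because an Integer
            // type pattern is applicable to the selector.
            case 3 -> "three";
            // Guarded type pattern: matches only some Integers.
            case Integer i when i < 0 -> "negative";
            // Unguarded type pattern: exhaustive for Integer, so it goes last.
            case Integer i -> "other " + i;
        };
    }
}
```

The constant label alone would not make the switch exhaustive; the final `case Integer i` does that.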
**Switch on float, double, and boolean.** Switches on floating point
were left out for the obvious reason -- it just isn't that useful, and
it would have introduced new complexity into the specification of
switch. Similarly, boolean was left out because we have "if"
statements. In the original world, where you could switch on only five
types, this was a sensible compromise. We later added in String and
enum types, which were sensible additions. But now we move into a
world where we can switch on every type _except_ float, double, and
boolean -- and this no longer seems sensible. It still may not be
something people will use often, but a key driver of the redesign of
switch has been refactorability, and we currently don't have a story for
refactoring
    record R(float f) { }
    switch (r) {
        case R(0f): ...
        case R(1f): ...
    }
to
    switch (r) {
        case R rr:
            switch (rr.f()) {
                case 0f: ...
                case 1f: ...
            }
    }
because we don't have switches on float. By retconning constant case
labels as patterns, we don't have to define new semantics for switching
on these types or for constant labels of these types, we only have to
remove the restrictions about what types you can switch on.
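For comparison, here is roughly how close Java 21 gets to both forms today, as a sketch with hypothetical names. The flat form has to spell the constant out as a guard, and the refactored form has to fall back to `if`, since there is no inner switch on `float`. Using `==` as the comparison is an assumption made for this example only; the comparison predicate for floating point is left open above.

```java
// Hypothetical example; requires Java 21 or later (record patterns in switch).
final class FloatSwitchSketch {
    record R(float f) { }

    // Flat form: each guard stands in for a nested constant such as R(0f).
    static String flat(R r) {
        return switch (r) {
            case R(float f) when f == 0f -> "zero";
            case R(float f) when f == 1f -> "one";
            case R(float f) -> "other";
        };
    }

    // Refactored form: match the record first, then test the component.
    // The inner test is an if-chain because float cannot be switched on.
    static String nested(R r) {
        return switch (r) {
            case R rr -> {
                float f = rr.f();
                if (f == 0f) {
                    yield "zero";
                } else if (f == 1f) {
                    yield "one";
                } else {
                    yield "other";
                }
            }
        };
    }
}
```

The two methods agree on every input, which is the property the refactoring is supposed to preserve.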
**Denoting constant patterns.** One of the remaining questions is how
we denote constant patterns. This is a bit of a bikeshed, which we can
come back to when we're ready to move forward. For purposes of
exposition we'll use the constant literal here.
**Closing a compositional asymmetry.** In the "Patterns in the Java
Object Model" document, we called attention to a glaring problem in API
design, where it becomes nearly impossible to use the same sort of
composition for taking apart objects that we use for putting them
together. As an example, suppose we compose an `Optional<Shape>` as
follows:
    Optional<Shape> os = Optional.of(Shape.redBall(1));
Here, we have static factories for both Optional and Shape; they don't
know about each other, but we can compose them just fine. Today, if we
want to reverse that -- ask whether an `Optional<Shape>` contains a red
ball of size 1 -- we have to do something awful and error-prone:
    Shape s = os.orElse(null);
    boolean isRedUnitBall = s != null
        && s.isBall()
        && (s.color() == RED)
        && s.size() == 1;
    if (isRedUnitBall) { ... }
These code snippets look nothing alike, making reversal harder and more
error-prone, and it gets worse the deeper you compose. With
destructuring patterns, this gets much better and more like the creation
expression:
    if (os instanceof Optional.of(Shape.redBall(var size))
        && size == 1) { ... }
but that `&& size == 1` is a pesky asymmetry. With constant patterns
(modulo syntax), we can complete the transformation:
    if (os instanceof Optional.of(Shape.redBall(1))) { ... }
and destructuring looks just like the aggregation.
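For a sense of how far this is from what runs today, here is a sketch of the same query in Java 21, using record patterns. Everything about the `Shape` model here (the `Ball` and `Square` records, the `Color` enum, the `redBall` factory) is an assumption made up for the example. Neither `Optional.of(...)` nor `Shape.redBall(...)` is usable as a pattern, so we still unwrap the Optional by hand and test the color and size as trailing conditions.

```java
import java.util.Optional;

// Hypothetical model; requires Java 21 or later (record patterns).
final class RedBallSketch {
    enum Color { RED, BLUE }

    sealed interface Shape permits Ball, Square {
        // Static factory used on the construction side.
        static Shape redBall(int size) {
            return new Ball(Color.RED, size);
        }
    }

    record Ball(Color color, int size) implements Shape { }

    record Square(int side) implements Shape { }

    // The destructuring side: unwrap, match the record, then compare the
    // components to constants with ordinary boolean conditions.
    static boolean isRedUnitBall(Optional<Shape> os) {
        return os.orElse(null) instanceof Ball(Color color, int size)
                && color == Color.RED
                && size == 1;
    }
}
```

The two trailing `&&` conditions are exactly the part that constant patterns would fold back into the pattern itself.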
**Bonus round: the last (?) vestige.** Currently, we allow statement
switches on legacy switch types (integers, their boxes, strings, and
enums) with all constant labels to be partial, and require all other
switches to be total. Patching this hole is harder, since there is lots
of legacy code today that depends on this partiality. There are a few
things we can do to pave the way forward here:
- Allow `default -> ;` in addition to `default -> { }`, since people
seem to have a hard time discovering the latter.
- Issue a warning when a legacy switch construct is not exhaustive.
This can start as a lint warning, move up to a regular warning over
time, then a mandatory (unsuppressable) warning. Maybe in a decade it
can become an error, but we can start paving the way sooner.
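For reference, this is the partiality in question, sketched with hypothetical names (Java 14 or later for arrow labels). The first switch is a statement switch on `int` with only constant labels, so it compiles without a `default` and silently does nothing for unmatched values. The second makes the same intent explicit with the `default -> { }` form that is already permitted.

```java
// Hypothetical example; requires Java 14 or later.
final class PartialSwitchSketch {

    // Legacy-style partial switch: no default, and no compile-time complaint.
    static String describe(int code) {
        StringBuilder out = new StringBuilder();
        switch (code) {
            case 1 -> out.append("one");
            case 2 -> out.append("two");
        }
        return out.toString();
    }

    // Same behavior, with the "do nothing" case written down.
    static String describeExplicit(int code) {
        StringBuilder out = new StringBuilder();
        switch (code) {
            case 1 -> out.append("one");
            case 2 -> out.append("two");
            default -> { } // explicit no-op for every other value
        }
        return out.toString();
    }
}
```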