Knocking off two more vestiges of legacy switch

Mon Sep 12 22:29:03 UTC 2022

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Monday, September 12, 2022 9:36:09 PM
> Subject: Knocking off two more vestiges of legacy switch

> The work on primitive patterns continues to yield fruit; it points us to a
> principled way of dealing with _constant patterns_, both as nested patterns,
> and to redefining constant case labels as simple patterns. It also points us to
> a way to bring the missing three types into the realm of switch (since now
> switch is usable at every type _but_ these): float, double, and boolean. While
> I'm not in a hurry to prioritize this immediately, I wanted to connect the dots
> to how primitive type patterns lay the foundation for these two vestiges of
> legacy switch. (The remaining vestige, not yet dealt with, is that legacy
> statement switches are not exhaustive. We'd like a path back to uniformity
> there as well, but this is likely a longer road.)

> **Constant patterns.** In early explorations (e.g., "Pattern Matching
> Semantics"), we struggled with the meaning of constant patterns, specifically
> with conversions in the absence of a sharp type for the match target. The
> exploration of that document treated boxing conversions but not other
> conversions, which would have created a gratuitously new conversion context.
> This was one of several reasons we deferred constant patterns.

> The current status is that constant case labels (e.g., `case 3`) are permitted
> (a) only in the presence of a compatible operand type and (b) are not patterns.
> This has led to some accidental complexity in specifying switch, since we can
> have a mix of pattern and non-pattern labels, and it means we can't use
> constants as nested patterns. (We've also not yet integrated enum cases into
> the exhaustiveness analysis in the presence of a sealed type that permits an
> enum type.) Ret-conning all case labels as patterns seems attractive if we can
> make the semantics clear, as not only does it bring more uniformity, but it
> means we can use them as nested patterns, not just at the top level of the
> switch. More composition.

> The recent work on `instanceof` involving primitives offers a clear and
> principled meaning to `0` as a pattern; given a constant `c` of type `C`, treat

> x matches c

> as meaning

> x matches C alpha && alpha eq c

> where `eq` is a suitable comparison predicate for the type C (== for integral
> types and enums, .equals() for String, and something irritating for floating
> point.) This gives us a solid basis for interpreting something like `case 3L`;
> we match if the target would match `long alpha` and `alpha == 3L`. No new
> rules; all conversions are handled through the type pattern for the static type
> of the constant in question. Not coincidentally, the rules for primitive type
> patterns support the implicit conversions allowed in today's switches on
> `short`, `byte`, and `char`, which are allowed to use `int` labels, preserving
> the meaning of existing code while we generalize what switch means.
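
As a rough illustration of the desugaring quoted above, the sketch below spells out `x matches c` by hand as a type test followed by an `eq` check. It uses only today's `instanceof` type patterns on reference types, so the boxed `Long` test merely approximates the proposed primitive pattern `long alpha`; the class and method names are hypothetical.

```java
// Hypothetical sketch: constant patterns written out by hand as
// "type pattern, then eq", using instanceof patterns on reference types.
final class ConstantPatternSketch {

    // Approximates `x matches 3L`: x must be a Long, and its value must == 3L.
    static boolean matches3L(Object x) {
        return x instanceof Long alpha && alpha == 3L;
    }

    // Approximates `x matches "foo"`: for String, eq is .equals().
    static boolean matchesFoo(Object x) {
        return x instanceof String alpha && alpha.equals("foo");
    }

    public static void main(String[] args) {
        System.out.println(matches3L(3L));      // true
        System.out.println(matches3L(3));       // false: an Integer is not a Long
        System.out.println(matchesFoo("foo"));  // true
    }
}
```

Note that no conversion rule appears in the `eq` step itself: whether the target can match at all is decided entirely by the type test.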

> The other attributes of patterns -- applicability, exhaustiveness, and dominance
> -- are also easy:

> - a constant pattern for `c : C` is applicable to S if a type pattern for `C` is
> applicable to S.
> - a type pattern for T dominates a constant pattern for `c : C` if the type
> pattern for T dominates a type pattern for C.
> - constant patterns are never exhaustive.

> No new rules; just appeal to type patterns.
It shows that the semantics you propose for the primitive type pattern is not the right one. 

Currently, code like this does not compile:

  byte b = ...
  switch(b) {
    case 200 -> ....
  }

because 200 is not a byte, which is great because otherwise the case would never be reached at runtime.

But if we apply the rules above plus your definition of the primitive pattern, the code above will happily compile, because it is equivalent to

  byte b = ...
  switch(b) {
    case short s when s == 200 -> ....
  }
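
To make the "never reached" point concrete, here is a small sketch (class and method names are hypothetical) that checks every possible `byte` value against an `int` constant. It only demonstrates the range argument and says nothing about how a compiler should report it.

```java
// Hypothetical sketch: why a label such as `case 200` can never match a byte.
final class ByteRangeSketch {

    // Counts how many of the 256 byte values compare equal to the int constant.
    static int countMatches(int constant) {
        int matches = 0;
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            byte b = (byte) i;
            if (b == constant) {   // b is widened to int before the comparison
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(countMatches(100));  // 1
        System.out.println(countMatches(200));  // 0: no byte value equals 200
        System.out.println((byte) 200);         // -56: the narrowing cast wraps around
    }
}
```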

Moreover, I think R(true) and R(false) together should be exhaustive. It's not a big deal, because you can rewrite it as R(true) and R (or R(_)), but I think that R(true) and R(false) is more readable.
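
For comparison, here is a minimal sketch of how the rewrite looks with record patterns as they exist today, assuming Java 21 or later. Constant patterns are not available, so the `true` case is emulated with a guard and the catch-all `R(boolean b)` case plays the role of `R` (or `R(_)`); all names are hypothetical.

```java
// Hypothetical sketch, assuming Java 21+: a guarded case stands in for R(true),
// and the unguarded record pattern makes the switch exhaustive.
final class BooleanExhaustivenessSketch {

    record R(boolean b) {}

    static String describe(R r) {
        return switch (r) {
            case R(boolean b) when b -> "R(true)";
            case R(boolean b) -> "R(false)";   // catch-all; only false remains here
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new R(true)));   // R(true)
        System.out.println(describe(new R(false)));  // R(false)
    }
}
```

The second label reads as "anything else" even though only `false` can reach it, which is the readability cost mentioned above.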

> **Switch on float, double, and boolean.** Switches on floating point were left
> out for the obvious reason -- it just isn't that useful, and it would have
> introduced new complexity into the specification of switch. Similarly, boolean
> was left out because we have "if" statements. In the original world, where you
> could switch on only five types, this was a sensible compromise. We later added
> in String and enum types, which were sensible additions. But now we move into a
> world where we can switch on every type _except_ float, double, and boolean --
> and this no longer seems sensible. It still may not be something people will use
> often, but a key driver of the redesign of switch has been refactorability, and
> we currently don't have a story for refactoring

> record R(float f) { }

> switch (r) {
>     case R(0f): ...
>     case R(1f): ...
> }

> to

> switch (r) {
>     case R rr:
>         switch (rr.f()) {
>             case 0f: ...
>             case 1f: ...
>         }
> }

> because we don't have switches on float. By retconning constant case labels as
> patterns, we don't have to define new semantics for switching on these types or
> for constant labels of these types, we only have to remove the restrictions
> about what types you can switch on.
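
The "something irritating" about floating point shows up as soon as one tries to pick the `eq` predicate for a label such as `case 0f`. The sketch below contrasts two candidates, primitive `==` and the representation-based comparison used by `Float.compare`. Which one a float switch would actually use is not settled in this thread, so both are shown purely as assumptions, under hypothetical names.

```java
// Hypothetical sketch: two candidate "eq" predicates for float constants.
final class FloatEqSketch {

    // Candidate 1: primitive ==. Treats 0.0f and -0.0f as equal; NaN equals nothing.
    static boolean eqPrimitive(float a, float b) {
        return a == b;
    }

    // Candidate 2: the ordering used by Float.compare (and Float.equals).
    // Distinguishes 0.0f from -0.0f; NaN is equal to itself.
    static boolean eqRepresentation(float a, float b) {
        return Float.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        System.out.println(eqPrimitive(0.0f, -0.0f));               // true
        System.out.println(eqRepresentation(0.0f, -0.0f));          // false
        System.out.println(eqPrimitive(Float.NaN, Float.NaN));      // false
        System.out.println(eqRepresentation(Float.NaN, Float.NaN)); // true
    }
}
```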

> **Denoting constant patterns.** One of the remaining questions is how we denote
> constant patterns. This is a bit of a bikeshed, which we can come back to when
> we're ready to move forward. For purposes of exposition we'll use the constant
> literal here.
This is what Haskell does and what Caml does not; at some point we will have to pick a side.

> **Closing a compositional asymmetry.** In the "Patterns in the Java Object
> Model" document, we called attention to a glaring problem in API design, where
> it becomes nearly impossible to use the same sort of composition for taking
> apart objects that we use for putting them together. As an example, suppose we
> compose an `Optional<Shape>` as follows:

> Optional<Shape> os = Optional.of(Shape.redBall(1));

> Here, we have static factories for both Optional and Shape, they don't know
> about each other, but we can compose them just fine. Today, if we want to
> reverse that -- ask whether an `Optional<Shape>` contains a red ball of size 1,
> we have to do something awful and error prone:

> Shape s = os.orElse(null);
> boolean isRedUnitBall = s != null
>     && s.isBall()
>     && (s.color() == RED)
>     && s.size() == 1;
> if (isRedUnitBall) { ... }

> These code snippets look nothing alike, making reversal harder and more
> error-prone, and it gets worse the deeper you compose. With destructuring
> patterns, this gets much better and more like the creation expression:

> if (os instanceof Optional.of(Shape.redBall(var size))
>     && size == 1) { ... }

> but that `&& size == 1` was a pesky asymmetry. With constant patterns (modulo
> syntax), we can complete the transformation:

> if (os instanceof Optional.of(Shape.redBall(1))) { ... }

> and destructuring looks just like the aggregation.
I agree; it's quite sad that we have to support float and double, but as you said, composition is more important.
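
To see how far the composition gets with what ships today, here is a sketch assuming Java 21 or later. `Optional` has no deconstruction pattern and constant patterns do not exist, so both the unwrapping and the constant checks remain as residual code. The `Shape`, `Ball`, `Box` and `Color` types are hypothetical stand-ins for the API in the quoted example.

```java
import java.util.Optional;

// Hypothetical sketch, assuming Java 21+: record patterns recover the components,
// but the constants RED and 1 still need explicit && checks.
final class CompositionSketch {

    enum Color { RED, BLUE }

    sealed interface Shape permits Ball, Box {}
    record Ball(Color color, int size) implements Shape {}
    record Box(int side) implements Shape {}

    // Stand-in for the Shape.redBall factory used in the quoted text.
    static Shape redBall(int size) {
        return new Ball(Color.RED, size);
    }

    static boolean isRedUnitBall(Optional<Shape> os) {
        // instanceof is false for null, so an empty Optional is handled too.
        return os.orElse(null) instanceof Ball(Color color, int size)
                && color == Color.RED
                && size == 1;
    }

    public static void main(String[] args) {
        System.out.println(isRedUnitBall(Optional.of(redBall(1))));  // true
        System.out.println(isRedUnitBall(Optional.of(redBall(2))));  // false
        System.out.println(isRedUnitBall(Optional.empty()));         // false
    }
}
```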

> **Bonus round: the last (?) vestige.** Currently, we allow statement switches on
> legacy switch types (integers, their boxes, strings, and enums) with all
> constant labels to be partial, and require all other switches to be total.
> Patching this hole is harder, since there is lots of legacy code today that
> depends on this partiality. There are a few things we can do to pave the way
> forward here:

> - Allow `default -> ;` in addition to `default -> { }`, since people seem to
> have a hard time discovering the latter.
We should also fix that for lambdas. The lambda syntax and the case arrow syntax are currently not aligned: `() -> throw ...` is not legal while `case ... -> throw ...` is, which troubles a lot of my students (I also introduce the switch syntax before lambdas, so lambdas seem less powerful??).
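
The mismatch can be seen side by side in the following sketch (Java 14 or later; all names are hypothetical). A switch arm may throw directly after the arrow, a lambda needs a block body to do the same, and `default -> { }` is the currently legal spelling of an empty default.

```java
import java.util.function.Supplier;

// Hypothetical sketch of the arrow-syntax differences discussed above.
final class ArrowSyntaxSketch {

    enum Light { RED, AMBER, GREEN }

    // A case arrow may be followed directly by a throw statement.
    static int binaryDigit(char c) {
        return switch (c) {
            case '0' -> 0;
            case '1' -> 1;
            default -> throw new IllegalArgumentException("not a binary digit: " + c);
        };
    }

    // `() -> throw new IllegalStateException(...)` does not compile;
    // the lambda needs a block body instead.
    static final Supplier<Integer> ALWAYS_FAILS = () -> {
        throw new IllegalStateException("no value");
    };

    // A partial statement switch made explicit with an empty default.
    static String describe(Light light) {
        String result = "go";
        switch (light) {
            case RED -> result = "stop";
            default -> { }   // `default -> ;` is not accepted today
        }
        return result;
    }
}
```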

> - Issue a warning when a legacy switch construct is not exhaustive. This can
> start as a lint warning, move up to a regular warning over time, then a
> mandatory (unsuppressable) warning. Maybe in a decade it can become an error,
> but we can start paving the way sooner.

I agree with a switch warning, provided all the IDEs stop fixing the warning by adding a `default` when the type switched upon is sealed.

Rémi 