Knocking off two more vestiges of legacy switch

Alan Malloy amalloy at google.com
Mon Sep 12 20:20:05 UTC 2022


It's nice to see this: I think it helps with the previous discussion
on the list about "why do we want instanceof for primitives?" The point
isn't that we expect anyone to use instanceof for primitives often, but
that conversions for primitives in patterns are an important part of fixing
up the asymmetries switch is still stuck with.

On Mon, Sep 12, 2022 at 12:36 PM Brian Goetz <brian.goetz at oracle.com> wrote:

> The work on primitive patterns continues to yield fruit; it points us to a
> principled way of dealing with _constant patterns_, both as nested
> patterns, and to redefining constant case labels as simple patterns.  It
> also points us to a way to bring the missing three types into the realm of
> switch (since now switch is usable at every type _but_ these): float,
> double, and boolean.  While I'm not in a hurry to prioritize this
> immediately, I wanted to connect the dots to how primitive type patterns
> lay the foundation for these two vestiges of legacy switch.  (The remaining
> vestige, not yet dealt with, is that legacy statement switches are not
> exhaustive.  We'd like a path back to uniformity there as well, but this is
> likely a longer road.)
>
> **Constant patterns.**  In early explorations (e.g., "Pattern Matching
> Semantics"), we struggled with the meaning of constant patterns,
> specifically with conversions in the absence of a sharp type for the match
> target.  The exploration of that document treated boxing conversions but
> not other conversions, which would have created a gratuitously new
> conversion context.  This was one of several reasons we deferred constant
> patterns.
>
> The current status is that constant case labels (e.g., `case 3`) (a) are
> permitted only in the presence of a compatible operand type and (b) are
> not patterns.  This has led to some accidental complexity in specifying
> switch, since we can have a mix of pattern and non-pattern labels, and it
> means we can't use constants as nested patterns.  (We've also not yet
> integrated enum cases into the exhaustiveness analysis in the presence of a
> sealed type that permits an enum type.)  Ret-conning all case labels as
> patterns seems attractive if we can make the semantics clear, as not only
> does it bring more uniformity, but it means we can use them as nested
> patterns, not just at the top level of the switch.  More composition.
>
> The recent work on `instanceof` involving primitives offers a clear and
> principled meaning to `0` as a pattern; given a constant `c` of type `C`,
> treat
>
>     x matches c
>
> as meaning
>
>     x matches C alpha && alpha eq c
>
> where `eq` is a suitable comparison predicate for the type C (`==` for
> integral types and enums, `.equals()` for String, and something irritating
> for floating point).  This gives us a solid basis for interpreting
> something like `case 3L`; we match if the target would match `long alpha`
> and `alpha == 3L`.  No new rules; all conversions are handled through the
> type pattern for the static type of the constant in question.  Not
> coincidentally, the rules for primitive type patterns support the implicit
> conversions allowed in today's switches on `short`, `byte`, and `char`,
> which are allowed to use `int` labels, preserving the meaning of existing
> code while we generalize what switch means.
>
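To make the desugaring concrete, here is a minimal sketch in released Java (16 or later, no preview features). The class and method names are hypothetical. The target is typed `Object` so that the existing reference type pattern `Long alpha` can stand in for the proposed `long alpha`; that stand-in is an assumption and does not model the primitive conversions described above. The second half only illustrates why `==` is an awkward `eq` for floating point; the message does not say which predicate would be chosen.

```java
class ConstantPatternSketch {
    // Stand-in for `x matches 3L`: a type pattern first, then an equality test.
    // Assumption: the boxed pattern `Long alpha` substitutes for `long alpha`.
    static boolean matchesThreeL(Object x) {
        return x instanceof Long alpha && alpha == 3L;
    }

    public static void main(String[] args) {
        System.out.println(matchesThreeL(3L));  // true
        System.out.println(matchesThreeL(4L));  // false
        System.out.println(matchesThreeL(3));   // false: an Integer is not a Long

        // Why a floating-point `eq` is irritating: == and equals() disagree.
        System.out.println(Float.NaN == Float.NaN);                      // false
        System.out.println(Float.valueOf(Float.NaN).equals(Float.NaN));  // true
        System.out.println(0.0f == -0.0f);                               // true
        System.out.println(Float.valueOf(0.0f).equals(-0.0f));           // false
    }
}
```

Note that the `Integer` case is rejected only because of the boxed stand-in; it says nothing about how an `int` target would behave under the primitive type pattern rules.
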
> The other attributes of patterns -- applicability, exhaustiveness, and
> dominance -- are also easy:
>
>  - a constant pattern for `c : C` is applicable to S if a type pattern for
> `C` is applicable to S.
>  - a type pattern for T dominates a constant pattern for `c : C` if the
> type pattern for T dominates a type pattern for C.
>  - constant patterns are never exhaustive.
>
> No new rules; just appeal to type patterns.
>
> **Switch on float, double, and boolean.**  Switches on floating point were
> left out for the obvious reason -- it just isn't that useful, and it would
> have introduced new complexity into the specification of switch.
> Similarly, boolean was left out because we have "if" statements.  In the
> original world, where you could switch on only five types, this was a
> sensible compromise.  We later added in String and enum types, which were
> sensible additions.   But now we move into a world where we can switch on
> every type _except_ float, double, and boolean -- and this no longer seems
> sensible.  It still may not be something people will use often, but a key
> driver of the redesign of switch has been refactorability, and we currently
> don't have a story for refactoring
>
>     record R(float f) { }
>
>     switch (r) {
>         case R(0f): ...
>         case R(1f): ...
>     }
>
> to
>
>     switch (r) {
>         case R rr:
>             switch (rr.f()) {
>                 case 0f: ...
>                 case 1f: ...
>             }
>     }
>
> because we don't have switches on float.  By retconning constant case
> labels as patterns, we don't have to define new semantics for switching on
> these types or for constant labels of these types, we only have to remove
> the restrictions about what types you can switch on.
>
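For comparison, a sketch of what the refactored inner switch has to become in released Java, where `float` is not a permitted selector type: an if-chain. The names `FloatSwitchSketch` and `describe` are hypothetical, and using `Float.compare` as the comparison is an assumption (it distinguishes `0f` from `-0f`), not something the proposal specifies.

```java
class FloatSwitchSketch {
    record R(float f) { }

    // Hand-written replacement for `switch (r.f()) { case 0f: ... case 1f: ... }`.
    static String describe(R r) {
        float f = r.f();
        if (Float.compare(f, 0f) == 0) {
            return "zero";
        } else if (Float.compare(f, 1f) == 0) {
            return "one";
        } else {
            return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new R(0f)));   // zero
        System.out.println(describe(new R(1f)));   // one
        System.out.println(describe(new R(-0f)));  // other: Float.compare separates -0f from 0f
    }
}
```
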
> **Denoting constant patterns.**  One of the remaining questions is how we
> denote constant patterns.  This is a bit of a bikeshed, which we can come
> back to when we're ready to move forward.  For purposes of exposition we'll
> use the constant literal here.
>
> **Closing a compositional asymmetry.**  In the "Patterns in the Java
> Object Model" document, we called attention to a glaring problem in API
> design, where it becomes nearly impossible to use the same sort of
> composition for taking apart objects that we use for putting them
> together.  As an example, suppose we compose an `Optional<Shape>` as
> follows:
>
>     Optional<Shape> os = Optional.of(Shape.redBall(1));
>
> Here, we have static factories for both Optional and Shape, they don't
> know about each other, but we can compose them just fine.  Today, if we
> want to reverse that -- ask whether an `Optional<Shape>` contains a red
> ball of size 1, we have to do something awful and error prone:
>
>     Shape s = os.orElse(null);
>     boolean isRedUnitBall = s != null
>                            && s.isBall()
>                            && (s.color() == RED)
>                            && s.size() == 1;
>     if (isRedUnitBall) { ... }
>
> These code snippets look nothing alike, making reversal harder and more
> error-prone, and it gets worse the deeper you compose.  With destructuring
> patterns, this gets much better and more like the creation expression:
>
>     if (os instanceof Optional.of(Shape.redBall(var size))
>         && size == 1) { ... }
>
> but that `&& size == 1` was a pesky asymmetry.  With constant patterns
> (modulo syntax), we can complete the transformation:
>
>     if (os instanceof Optional.of(Shape.redBall(1))) { ... }
>
> and destructuring looks just like the aggregation.
>
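For reference, a sketch of how close released Java (21 or later, with record patterns) gets to this. `Optional` and static factories cannot be deconstructed today, so the sketch substitutes hypothetical records `Some` and `Ball` for `Optional.of` and `Shape.redBall`; every name here is an assumption. The residual `&&` tests are the part that constant patterns would absorb.

```java
class NestedPatternSketch {
    enum Color { RED, BLUE }

    sealed interface Shape permits Ball, Box { }
    record Ball(Color color, int size) implements Shape { }
    record Box(int side) implements Shape { }

    // Hypothetical stand-in for Optional, built from records so it can be destructured.
    sealed interface Opt<T> permits Some, None { }
    record Some<T>(T value) implements Opt<T> { }
    record None<T>() implements Opt<T> { }

    static boolean isRedUnitBall(Opt<Shape> os) {
        // Nested record patterns do the destructuring; constants still need && tests.
        return os instanceof Some<Shape>(Ball(Color color, int size))
                && color == Color.RED
                && size == 1;
    }

    public static void main(String[] args) {
        Opt<Shape> redUnit = new Some<>(new Ball(Color.RED, 1));
        Opt<Shape> blueUnit = new Some<>(new Ball(Color.BLUE, 1));
        Opt<Shape> empty = new None<>();
        System.out.println(isRedUnitBall(redUnit));   // true
        System.out.println(isRedUnitBall(blueUnit));  // false
        System.out.println(isRedUnitBall(empty));     // false
    }
}
```
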
> **Bonus round: the last (?) vestige.**  Currently, we allow statement
> switches on legacy switch types (integers, their boxes, strings, and enums)
> with all constant labels to be partial, and require all other switches to
> be total.  Patching this hole is harder, since there is lots of legacy code
> today that depends on this partiality.  There are a few things we can do to
> pave the way forward here:
>
>  - Allow `default -> ;` in addition to `default -> { }`, since people seem
> to have a hard time discovering the latter.
>  - Issue a warning when a legacy switch construct is not exhaustive.  This
> can start as a lint warning, move up to a regular warning over time, then a
> mandatory (unsuppressable) warning.  Maybe in a decade it can become an
> error, but we can start paving the way sooner.
>
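A small sketch of the partiality in question, runnable on Java 14 or later. The enum and method names are hypothetical. The first method uses a legacy statement switch over an enum with only constant labels, so it compiles even though `BLUE` is unhandled; the second states the same intent explicitly with `default -> { }`.

```java
class PartialSwitchSketch {
    enum Color { RED, GREEN, BLUE }

    static String legacy(Color c) {
        StringBuilder sb = new StringBuilder();
        switch (c) {                  // partial: no BLUE, no default, still compiles
            case RED -> sb.append("red");
            case GREEN -> sb.append("green");
        }
        return sb.toString();
    }

    static String explicit(Color c) {
        StringBuilder sb = new StringBuilder();
        switch (c) {
            case RED -> sb.append("red");
            case GREEN -> sb.append("green");
            default -> { }            // deliberately ignore everything else
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(legacy(Color.RED));              // red
        System.out.println(legacy(Color.BLUE).isEmpty());   // true: silently falls out
        System.out.println(explicit(Color.BLUE).isEmpty()); // true: same result, stated intent
    }
}
```
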
>
>