Rehabilitating switch -- a scorecard

Wed May 19 12:21:55 UTC 2021

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, May 19, 2021 13:12:43
> Subject: Re: Rehabilitating switch -- a scorecard

> So, here's another aspect of switch rehabilitation, this time in terms of
> syntactic rewrites. By way of analogy with lambdas, there's a sequence of

> x -> e // parens elided in unary lambda

> is-shorthand-for

> (x) -> e // types elided

> is-shorthand-for

> (var x) -> e // explicit request for inference

> is-shorthand-for

> (<actual type> x) -> e // explicit types

> That is, there is a canonical (lowest) form, and the various shorthands form a
> chain of embeddings. The chain shape reduces cognitive load on the user,
> because instead of thinking "there are seven forms of lambda", they can instead
> think there is a single canonical form, with progressive options for leaving
> things out / mushing things together.

> We get more of a funnel with the syntax of switch:

> case L, J, K -> X;

> is-shorthand-for

> case L, J, K: yield X; // expression switch, X is an expression
> case L, J, K: X; // expression switch, X is a block
> case L, J, K: X; break; // statement switch

> and

> case L, J, K: X;

> is-shorthand-for

> case L:
> case J:
> case K:
> X;

We also have the inverse problem: rehabilitating the lambda syntax so that it is aligned with the switch syntax.
The only discrepancy I'm aware of is
case Foo -> throw ...
being allowed while
(Foo foo) -> throw ....
is not.
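
To make the discrepancy concrete, here is a minimal sketch, assuming Java 14 or later. The class name, the enum and its constants are hypothetical and only serve the illustration. A switch arm accepts a bare throw, while a lambda needs a block around it.

```java
import java.util.function.Function;

public class ThrowArmSketch {
    // Hypothetical example type, not taken from the thread.
    enum Color { RED, GREEN }

    // A switch arm may be a bare throw statement.
    static String describe(Color c) {
        return switch (c) {
            case RED -> "red";
            case GREEN -> throw new IllegalStateException("green is not supported");
        };
    }

    // A lambda body must be an expression or a block, so this does not compile:
    //   Function<Color, String> bad = c -> throw new IllegalStateException();
    // The throw has to be wrapped in braces instead.
    static final Function<Color, String> ALWAYS_THROWS = c -> {
        throw new IllegalStateException("unsupported: " + c);
    };
}
```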

BTW, this year I presented the switch expression before the lambda, so a student asked me why Java does not allow a colon in a lambda,
like this
(a, b) : 
X 
break; 

instead of 
(a, b) -> { 
X 
} 

I answered that a lambda is a function, not a block of instructions, but I still feel a diffuse guilt about the reuse of -> inside the switch.

Rémi 

> On 5/17/2021 5:36 PM, Brian Goetz wrote:

>> This is a good time to look at the progress we've made with switch. When we
>> started looking at extending switch to support pattern matching (four years
>> ago!) we identified a lot of challenges deriving from switch's C legacy, some
>> of which are summarized here:

>> http://cr.openjdk.java.net/~briangoetz/amber/switch-rehab.html

>> We had two primary driving goals for improving switch: switches as expressions,
>> and switches with patterns as labels. In turn, these pushed on a number of
>> other uncomfortable aspects of switch: fall through, totality, scoping, and
>> null handling.

>> Initially, we were unsure we would be able to rehabilitate switch to support
>> these new requirements without being forever bogged down by the mistakes of the
>> past. Bit by bit, we have chipped away at the negative aspects of switch, while
>> respecting the existing code that depends on those aspects. I think where we've
>> landed is, in many ways, better than we could have initially hoped for.

>> Throughout this exercise, there were periodic calls for "just toss it and invent
>> something new" (which we sometimes called "snitch", shorthand for "new
>> switch"*), and no shortage of people's attempts to design their ideal switch
>> construct. We resisted this line of attack, because we believed having two
>> similar-but-different constructs living side by side would be more annoying
>> (and confusing) to users than a rehabilitated, albeit more complex, construct.

>> The first round of improvements came with expression switches. This was the easy
>> batch, because it didn't materially change the set of questions we could ask
>> with switch, just the form in which we asked the question. This brought the
>> following improvements:

>> - Switches as expressions. Many existing switch statements are in reality
>> modeling expressions, in a more roundabout and less safe way. Expressing it
>> directly is simpler and less error-prone.
>> - Checked totality. The compiler enforces that a switch expression is exhaustive
>> (because expressions must be total). In the case of enum switches, a switch
>> that covers all the cases needs no default clause, and the compiler inserts an
>> extra case to catch novel values and throw (ICCE) on them. (Eventually the same
>> will be true for switches on sealed classes as well.)
>> - A fallthrough-free option. Switches now give us a choice between two styles of
>> _switch blocks_, the old willy-nilly style, and the new single-consequent
>> (arrow) style. Switches that choose arrow-style need not reason about
>> fallthrough.
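
The three improvements listed above can be sketched together, assuming Java 14 or later. The class name, the enum and the returned values are hypothetical. Both methods are switch expressions over an enum that cover every constant, so neither needs a default clause; the first uses the arrow style, the second the colon style with yield.

```java
public class SwitchFormsSketch {
    // Hypothetical example type, not taken from the thread.
    enum Day { SAT, SUN, MON }

    // Arrow style: one consequent per case, no fallthrough to reason about.
    static int hoursArrow(Day d) {
        return switch (d) {
            case SAT, SUN -> 0;
            case MON -> 8;
        };
    }

    // Colon style: the same switch expression, with yield supplying the value.
    static int hoursColon(Day d) {
        return switch (d) {
            case SAT, SUN:
                yield 0;
            case MON:
                yield 8;
        };
    }
}
```

Removing one of the cases from either method should make the compiler reject the switch as not exhaustive, which is the checked totality the list refers to.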

>> Unfortunately, it also brought a new asymmetry; switch expressions must be total
>> (and you get enhanced type checking for this), but switch statements cannot be.
>> This is a shame, since the improved type checking for totality is one of the
>> best things about the improvements in switch, as a switch that is total by
>> virtue of actually covering all the cases acts as a tripwire against new enum
>> constants / permitted subtypes being added later, rather than papering it over
>> with a catch-all. We explored several ways to explicitly add back totality
>> checking, but this always felt like a hack, and requires the programmer to
>> remember to ask for this checking.

>> Our resolution here offers a path to true healing with minimal user impact, by
>> (temporarily) carving out the semantic space of old statement switches. A
>> "legacy switch" is a statement switch on a numeric primitive or its box, enum,
>> or string, and which contains no pattern labels (i.e., a statement switch that
>> is valid today.) Like expression switches, we will require non-legacy statement
>> switches to be exhaustive, and warn on non-exhaustive legacy switches. (To make
>> the warning go away, just insert a "default: " or "default: break" at the
>> bottom of the switch; not painful.) After some time, we should be able to make
>> this warning an error, which again is easy to mitigate with a single line. In
>> the end, all switch constructs will be total and type-checked for
>> exhaustiveness, and once done, the notion of "legacy switch" can be
>> garbage-collected.
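
As an illustration of the one-line mitigation mentioned above, here is a sketch of a "legacy" statement switch on an enum that deliberately leaves one constant unhandled. The class name, the enum and the strings are hypothetical; the only point is the trailing "default: break".

```java
public class LegacySwitchSketch {
    // Hypothetical example type, not taken from the thread.
    enum Light { RED, AMBER, GREEN }

    static String advice(Light light) {
        String advice = "no advice";
        switch (light) {
            case RED:
                advice = "stop";
                break;
            case GREEN:
                advice = "go";
                break;
            default:
                // AMBER is deliberately not handled; this clause makes the
                // statement switch exhaustive and states that intent.
                break;
        }
        return advice;
    }
}
```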

>> Looking ahead to patterns in switch, we have several legacy considerations to
>> navigate:

>> - Fallthrough and bindings. While fallthrough is not inherently problematic
>> (though the choice of fallthrough-by-default was unfortunate), if a case label
>> introduces a pattern variable, then fallthrough to another case (at least one
>> that doesn't introduce the same pattern variable with the same type) makes
>> little sense, and such fallthrough has been outlawed.
>> - Scoping. The block of a switch is one big scope, rather than each case label
>> group being its own scope. (Again, one might call this a historical error,
>> since there's little good that comes from this.) With case labels introducing
>> variable declarations, this could have been a big problem, if one case polluted
>> later cases (forcing users to pick unique names for each binding in a switch
>> statement), but flow scoping solves that one.
>> - Nulls. In Java 1.0, switching over reference types was not permitted, so we
>> didn't have to worry about this. In Java 5, autoboxing and enums meant we could
>> switch over some reference types, but for all of these, null was a "silly"
>> value so we didn't care about NPEing on null. In Java 7, when we added string
>> switch, we could have conceivably allowed `case null`, but instead chose to
>> follow the precedent set by Java 5. But once we introduce switches over any
>> type, with richer patterns, eagerly NPEing on null becomes much more
>> problematic. We've navigated this by saying that switches can NPE on null if they
>> have no nullable cases; nullable cases are those that explicitly say "null",
>> and total patterns (which always come last since they dominate all others.) The
>> old rule of "switches throw on null" becomes "switches throw on null, except
>> when they say 'case null' or the bottom case is total." Default continues to
>> mean what it always did -- "anything not already matched, except null."
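
A sketch of the scoping and null points above, assuming Java 21 or later, where patterns in switch are a final feature. The class and method names are hypothetical. It only relies on an explicit "case null" and does not rely on a total pattern in the last position accepting null, since that detail may differ between this proposal and released Java versions.

```java
public class NullCaseSketch {
    // Pattern switch: the explicit "case null" opts in to handling null.
    static String describe(Object o) {
        return switch (o) {
            case null -> "null";
            case String s -> "String of length " + s.length();
            // The binding name s is reused: each binding is scoped to its own case.
            case Integer s -> "Integer " + s;
            // default matches anything not already matched, except null.
            default -> "something else";
        };
    }

    // Old-style string switch with no nullable case: it throws
    // NullPointerException when the argument is null.
    static String legacy(String s) {
        switch (s) {
            case "a":
                return "A";
            default:
                return "other";
        }
    }
}
```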

>> The new treatment of null actually would have fallen out of the decisions on
>> totality, had we not gotten there already via another path. Our notion of
>> totality accounts for "remainder", which includes things like novel subclasses
>> of sealed types that did not exist at compile time, which it would not be
>> reasonable to ask users to write code to deal with, and null fits into this
>> treatment as well. We type check that a switch is sufficiently total, and then
>> insert extra code to catch "silly" values that are not otherwise handled,
>> including null, and throw. (This also enables DA analysis to truly trust switch
>> totality.)

>> Where we land is a single unified switch construct that can be either a
>> statement or an expression; that can use either old-style flow (colon) or the
>> more constrained flow style (arrow); whose case labels can be constant,
>> patterns (including guarded patterns), or a mix of the two; which can accept
>> the legacy null-hostility behavior, or can override it by explicitly using
>> nullable case labels; and which is almost always type checked for totality
>> (with some temporary, legacy exceptions.) Fallthrough is basically unchanged;
>> you can get fallthrough when using the old-style flow, but it becomes less
>> important as fallthrough is (mostly) nonsensical in the presence of pattern
>> cases with bindings, and the compiler prevents this misuse. The distinction
>> between "legacy" switches and pattern switches is temporary, with a path to
>> getting to "all switches are total" over time.

>> I think we've done a remarkable job at rehabilitating this monster.

>> *Someone actually suggested using the syntax "new switch", on the basis that new
>> was already a keyword. Would not have aged well.