Rehabilitating switch -- a scorecard
brian.goetz at oracle.com
Wed May 19 11:12:43 UTC 2021
So, here's another aspect of switch's rehabilitation, this time in terms
of syntactic rewrites. By way of analogy with lambdas, there's a chain of
forms, each a shorthand for the one below it:
x -> e // parens elided in unary lambda
(x) -> e // types elided
(var x) -> e // explicit request for inference
(<actual type> x) -> e // explicit types
That is, there is a canonical (lowest) form, and the various shorthands
form a chain of embeddings. The chain shape reduces cognitive load on
the user, because instead of thinking "there are seven forms of lambda",
they can instead think there is a single canonical form, with progressive
options for leaving things out / mushing things together.
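To make the chain concrete, here is a minimal sketch of the four lambda forms side by side. The class and field names are made up for illustration, and the `(var x)` form assumes Java 11 or later. All four denote the same function; only the amount of elided syntax differs.

```java
import java.util.function.Function;

// Hypothetical holder class; the names are illustrative only.
class LambdaForms {
    // Parens elided in a unary lambda.
    static final Function<Integer, Integer> ELIDED_PARENS = x -> x + 1;
    // Parens present, types elided.
    static final Function<Integer, Integer> ELIDED_TYPES = (x) -> x + 1;
    // Explicit request for inference (Java 11+).
    static final Function<Integer, Integer> VAR_TYPE = (var x) -> x + 1;
    // Explicit types: the canonical (lowest) form.
    static final Function<Integer, Integer> EXPLICIT = (Integer x) -> x + 1;

    public static void main(String[] args) {
        // Each form is applied to the same argument and yields the same result.
        System.out.println(ELIDED_PARENS.apply(41));
        System.out.println(ELIDED_TYPES.apply(41));
        System.out.println(VAR_TYPE.apply(41));
        System.out.println(EXPLICIT.apply(41));
    }
}
```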
We get more of a funnel with the syntax of switch:
case L, J, K -> X;
case L, J, K: yield X; // expression switch, X is an expression
case L, J, K: X; // expression switch, X is a block
case L, J, K: X; break; // statement switch
case L, J, K: X;
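As a rough illustration of how these forms relate, the sketch below writes the same three-label case in arrow and colon styles, once as an expression switch and once as a statement switch. It assumes Java 14 or later; the class and method names are hypothetical, and the pairing of forms is one reading of the list above, not a definitive one.

```java
// Hypothetical example class; names are illustrative only.
class SwitchForms {
    // Expression switch, arrow form: the right-hand side is the value.
    static String exprArrow(int k) {
        return switch (k) {
            case 1, 2, 3 -> "small";
            default -> "other";
        };
    }

    // Expression switch, colon form: the value is produced with yield.
    static String exprColon(int k) {
        return switch (k) {
            case 1, 2, 3: yield "small";
            default: yield "other";
        };
    }

    // Statement switch, arrow form: no break needed, no fallthrough possible.
    static String stmtArrow(int k) {
        String r;
        switch (k) {
            case 1, 2, 3 -> r = "small";
            default -> r = "other";
        }
        return r;
    }

    // Statement switch, colon form: break is needed to stop fallthrough.
    static String stmtColon(int k) {
        String r;
        switch (k) {
            case 1, 2, 3: r = "small"; break;
            default: r = "other";
        }
        return r;
    }

    public static void main(String[] args) {
        for (int k : new int[] {2, 7}) {
            System.out.println(exprArrow(k) + " " + exprColon(k) + " "
                    + stmtArrow(k) + " " + stmtColon(k));
        }
    }
}
```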
On 5/17/2021 5:36 PM, Brian Goetz wrote:
> This is a good time to look at the progress we've made with switch.
> When we started looking at extending switch to support pattern
> matching (four years ago!) we identified a lot of challenges deriving
> from switch's C legacy, some of which is summarized here:
> We had two primary driving goals for improving switch: switches as
> expressions, and switches with patterns as labels. In turn, these
> pushed on a number of other uncomfortable aspects of switch: fall
> through, totality, scoping, and null handling.
> Initially, we were unsure we would be able to rehabilitate switch to
> support these new requirements without being forever bogged down by
> the mistakes of the past. Bit by bit, we have chipped away at the
> negative aspects of switch, while respecting the existing code that
> depends on those aspects. I think where we've landed is, in many
> ways, better than we could have initially hoped for.
> Throughout this exercise, there were periodic calls for "just toss it
> and invent something new" (which we sometimes called "snitch",
> shorthand for "new switch"*), and no shortage of people's attempts to
> design their ideal switch construct. We resisted this line of attack,
> because we believed having two similar-but-different constructs living
> side by side would be more annoying (and confusing) to users than a
> rehabilitated, albeit more complex, construct.
> The first round of improvements came with expression switches. This
> was the easy batch, because it didn't materially change the set of
> questions we could ask with switch, just the form in which we asked
> the question. This brought the following improvements:
> - Switches as expressions. Many existing switch statements are in
> reality modeling expressions, in a more roundabout and less safe way.
> Expressing it directly is simpler and less error-prone.
> - Checked totality. The compiler enforces that a switch expression
> is exhaustive (because expressions must be total). In the case of
> enum switches, a switch that covers all the cases needs no default
> clause, and the compiler inserts an extra case to catch novel values
> and throw (ICCE) on them. (Eventually the same will be true for
> switches on sealed classes as well.)
> - A fallthrough-free option. Switches now give us a choice between
> two styles of _switch blocks_, the old willy-nilly style, and the new
> single-consequent (arrow) style. Switches that choose arrow-style
> need not reason about fallthrough.
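A small sketch of the three improvements just listed, assuming Java 14 or later. The enum and method names are invented for the example. The switch is an expression, it uses arrow style, and it covers every constant, so it compiles without a default clause; adding a new constant to the enum would turn it into a compile-time error at this switch.

```java
// Hypothetical example class; names are illustrative only.
class TotalEnumSwitch {
    enum Color { RED, GREEN, BLUE }

    // An exhaustive expression switch over an enum: no default clause needed.
    static String temperature(Color c) {
        return switch (c) {
            case RED -> "warm";
            case GREEN, BLUE -> "cool";
        };
    }

    public static void main(String[] args) {
        for (Color c : Color.values()) {
            System.out.println(c + " is " + temperature(c));
        }
    }
}
```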
> Unfortunately, it also brought a new asymmetry; switch expressions
> must be total (and you get enhanced type checking for this), but
> switch statements cannot be. This is a shame, since the improved type
> checking for totality is one of the best things about the improvements
> in switch, as a switch that is total by virtue of actually covering
> all the cases acts as a tripwire against new enum constants /
> permitted subtypes being added later, rather than papering it over
> with a catch-all. We explored several ways to explicitly add back
> totality checking, but these always felt like hacks, and required the
> programmer to remember to ask for this checking.
> Our resolution here offers a path to true healing with minimal user
> impact, by (temporarily) carving out the semantic space of old
> statement switches. A "legacy switch" is a statement switch on a
> numeric primitive or its box, enum, or string, and which contains no
> pattern labels (i.e., a statement switch that is valid today.) Like
> expression switches, we will require non-legacy statement switches to
> be exhaustive, and warn on non-exhaustive legacy switches. (To make
> the warning go away, just insert a "default: " or "default: break" at
> the bottom of the switch; not painful.) After some time, we should be
> able to make this warning an error, which again is easy to mitigate
> with a single line. In the end, all switch constructs will be total
> and type-checked for exhaustiveness, and once done, the notion of
> "legacy switch" can be garbage-collected.
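For reference, this is roughly what the one-line mitigation looks like on a "legacy" statement switch over an enum. The names are hypothetical, and the sketch makes no claim about whether any particular compiler version actually emits the warning discussed above; it only shows the shape of the fix.

```java
// Hypothetical example class; names are illustrative only.
class LegacySwitchDefault {
    enum Day { SAT, SUN, MON }

    // Counts weekend days using an old-style (colon) statement switch.
    static int weekendCount(Day... days) {
        int n = 0;
        for (Day d : days) {
            switch (d) {
                case SAT:
                case SUN:
                    n++;
                    break;
                default:
                    // The single added line: states explicitly that all
                    // other constants are intentionally ignored.
                    break;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(weekendCount(Day.SAT, Day.MON, Day.SUN));
    }
}
```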
> Looking ahead to patterns in switch, we have several legacy
> considerations to navigate:
> - Fallthrough and bindings. While fallthrough is not inherently
> problematic (though the choice of fallthrough-by-default was
> unfortunate), if a case label introduces a pattern variable, then
> fallthrough to another case (at least one that doesn't introduce the
> same pattern variable with the same type) makes little sense, and such
> fallthrough has been outlawed.
> - Scoping. The block of a switch is one big scope, rather than each
> case label group being its own scope. (Again, one might call this a
> historical error, since there's little good that comes from this.)
> With case labels introducing variable declarations, this could have
> been a big problem, if one case polluted later cases (forcing users to
> pick unique names for each binding in a switch statement), but flow
> scoping solves that one.
> - Nulls. In Java 1.0, switching over reference types was not
> permitted, so we didn't have to worry about this. In Java 5,
> autoboxing and enums meant we could switch over some reference types,
> but for all of these, null was a "silly" value so we didn't care about
> NPEing on null. In Java 7, when we added string switch, we could have
> conceivably allowed `case null`, but instead chose to follow the
> precedent set by Java 5. But once we introduce switches over any
> type, with richer patterns, eagerly NPEing on null becomes much more
> problematic. We've navigated this by saying that switches can NPE on
> null if they have no nullable cases; nullable cases are those that
> explicitly say "null", and total patterns (which always come last
> since they dominate all others.) The old rule of "switches throw on
> null" becomes "switches throw on null, except when they say 'case
> null' or the bottom case is total." Default continues to mean what it
> always did -- "anything not already matched, except null."
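The scoping and null points can be sketched as follows. This assumes Java 21 or later, where pattern switch is a final feature, and it uses only the explicit `case null` form of opting in; the class and method names are invented. Note that two cases reuse the binding name `v` without conflict, and that `default` does not match null.

```java
// Hypothetical example class; names are illustrative only.
class NullAwareSwitch {
    // An explicit "case null" makes this switch accept null.
    static String describe(Object o) {
        return switch (o) {
            case null -> "null";
            // Each binding is scoped to its own case, so the name can repeat.
            case String v -> "String of length " + v.length();
            case Integer v -> "Integer " + v;
            default -> "something else";
        };
    }

    // Without "case null" the switch throws NPE on null; default does not
    // catch it. Returns true if the switch threw NullPointerException.
    static boolean throwsOnNull(Object o) {
        try {
            String ignored = switch (o) {
                case String v -> v;
                default -> "other";
            };
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(null));
        System.out.println(describe("abc"));
        System.out.println(describe(42));
        System.out.println(throwsOnNull(null));
    }
}
```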
> The new treatment of null actually would have fallen out of the
> decisions on totality, had we not gotten there already via another
> path. Our notion of totality accounts for "remainder", which includes
> things like novel subclasses of sealed types that did not exist at
> compile time, which it would not be reasonable to ask users to write
> code to deal with, and null fits into this treatment as well. We type
> check that a switch is sufficiently total, and then insert extra code
> to catch "silly" values that are not otherwise handled, including
> null, and throw. (This also enables DA analysis to truly trust switch
> Where we land is a single unified switch construct that can be either
> a statement or an expression; that can use either old-style flow
> (colon) or the more constrained flow style (arrow); whose case labels
> can be constant, patterns (including guarded patterns), or a mix of
> the two; which can accept the legacy null-hostility behavior, or can
> override it by explicitly using nullable case labels; and which is
> almost always type checked for totality (with some temporary, legacy
> exceptions.) Fallthrough is basically unchanged; you can get
> fallthrough when using the old-style flow, but it becomes less important
> as fallthrough is (mostly) nonsensical in the presence of pattern
> cases with bindings, and the compiler prevents this misuse. The
> distinction between "legacy" switches and pattern switches is
> temporary, with a path to getting to "all switches are total" over time.
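A sketch of where this lands for pattern switches, assuming Java 21 or later: a switch over a sealed hierarchy, with a guarded pattern, type checked for totality without a default. The guard is spelled `when` here, which is the Java 21 syntax and may differ from the spelling under discussion at the time of this thread. The type and method names are made up for the example.

```java
// Hypothetical example class; names are illustrative only.
class SealedShapes {
    sealed interface Shape permits Circle, Square {}
    record Circle(double radius) implements Shape {}
    record Square(double side) implements Shape {}

    // Exhaustive over the permitted subtypes, so no default is required.
    static String classify(Shape s) {
        return switch (s) {
            // Guarded pattern: must come before the unguarded Circle case,
            // which would otherwise dominate it.
            case Circle c when c.radius() == 0 -> "point";
            case Circle c -> "circle";
            case Square q -> "square";
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(new Circle(0)));
        System.out.println(classify(new Circle(2.5)));
        System.out.println(classify(new Square(1)));
    }
}
```

If a third permitted subtype were added to `Shape`, `classify` would stop compiling until a case for it was written, which is the tripwire behavior described earlier.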
> I think we've done a remarkable job at rehabilitating this monster.
> *Someone actually suggested using the syntax "new switch", on the
> basis that new was already a keyword. Would not have aged well.