Rehabilitating switch -- a scorecard

Remi Forax forax at univ-mlv.fr
Mon May 17 22:08:09 UTC 2021


> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Monday, 17 May 2021 23:36:30
> Subject: Rehabilitating switch -- a scorecard

> This is a good time to look at the progress we've made with switch. When we
> started looking at extending switch to support pattern matching (four years
> ago!) we identified a lot of challenges deriving from switch's C legacy, some
> of which are summarized here:

> http://cr.openjdk.java.net/~briangoetz/amber/switch-rehab.html

> We had two primary driving goals for improving switch: switches as expressions,
> and switches with patterns as labels. In turn, these pushed on a number of
> other uncomfortable aspects of switch: fall through, totality, scoping, and
> null handling.

> Initially, we were unsure we would be able to rehabilitate switch to support
> these new requirements without being forever bogged down by the mistakes of the
> past. Bit by bit, we have chipped away at the negative aspects of switch, while
> respecting the existing code that depends on those aspects. I think where we've
> landed is, in many ways, better than we could have initially hoped for.

> Throughout this exercise, there were periodic calls for "just toss it and invent
> something new" (which we sometimes called "snitch", shorthand for "new
> switch"*), and no shortage of people's attempts to design their ideal switch
> construct. We resisted this line of attack, because we believed having two
> similar-but-different constructs living side by side would be more annoying
> (and confusing) to users than a rehabilitated, albeit more complex, construct.

> The first round of improvements came with expression switches. This was the easy
> batch, because it didn't materially change the set of questions we could ask
> with switch, just the form in which we asked the question. This brought the
> following improvements:

> - Switches as expressions. Many existing switch statements are in reality
> modeling expressions, in a more roundabout and less safe way. Expressing it
> directly is simpler and less error-prone.
> - Checked totality. The compiler enforces that a switch expression is exhaustive
> (because expressions must be total). In the case of enum switches, a switch
> that covers all the cases needs no default clause, and the compiler inserts an
> extra case to catch novel values and throw (ICCE) on them. (Eventually the same
> will be true for switches on sealed classes as well.)
> - A fallthrough-free option. Switches now give us a choice between two styles of
> _switch blocks_, the old willy-nilly style, and the new single-consequent
> (arrow) style. Switches that choose arrow-style need not reason about
> fallthrough.
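A minimal sketch of the three points above, assuming a JDK with switch expressions (final in Java 14). The names Color and describe are hypothetical, chosen only for illustration:

```java
// Hypothetical example: Color and describe are illustrative names only.
class SwitchExpressionSketch {
    enum Color { RED, GREEN, BLUE }

    // Arrow-style switch expression: each case has a single consequent,
    // so there is no fallthrough to reason about. No default clause is
    // needed because every enum constant is covered; deleting one of the
    // cases below turns this into a compile-time error.
    static String describe(Color c) {
        return switch (c) {
            case RED -> "warm";
            case GREEN, BLUE -> "cool";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(Color.RED));
        System.out.println(describe(Color.BLUE));
    }
}
```

If a constant were added to Color later, describe would stop compiling until the new constant is handled, which is the tripwire effect that a catch-all default would hide.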

> Unfortunately, it also brought a new asymmetry; switch expressions must be total
> (and you get enhanced type checking for this), but switch statements cannot be.
> This is a shame, since the improved type checking for totality is one of the
> best things about the improvements in switch, as a switch that is total by
> virtue of actually covering all the cases acts as a tripwire against new enum
> constants / permitted subtypes being added later, rather than papering it over
> with a catch-all. We explored several ways to explicitly add back totality
> checking, but this always felt like a hack, and required the programmer to
> remember to ask for this checking.

> Our resolution here offers a path to true healing with minimal user impact, by
> (temporarily) carving out the semantic space of old statement switches. A
> "legacy switch" is a statement switch on a numeric primitive or its box, enum,
> or string, and which contains no pattern labels (i.e., a statement switch that
> is valid today.) Like expression switches, we will require non-legacy statement
> switches to be exhaustive, and warn on non-exhaustive legacy switches. (To make
> the warning go away, just insert a "default: " or "default: break" at the
> bottom of the switch; not painful.) After some time, we should be able to make
> this warning an error, which again is easy to mitigate with a single line. In
> the end, all switch constructs will be total and type-checked for
> exhaustiveness, and once done, the notion of "legacy switch" can be
> garbage-collected.
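As an illustration of the mitigation described above, here is a sketch of a "legacy" statement switch over an enum that deliberately handles only some constants. Day and weekendCount are hypothetical names, and whether a particular compiler actually reports a warning without the default clause is an assumption that depends on the compiler and its version:

```java
// Hypothetical example: Day and weekendCount are illustrative names only.
class LegacySwitchSketch {
    enum Day { SATURDAY, SUNDAY, MONDAY }

    static int weekendCount(Day[] days) {
        int count = 0;
        for (Day d : days) {
            // Old-style (colon) statement switch that does not cover MONDAY.
            switch (d) {
                case SATURDAY:
                case SUNDAY:   // SATURDAY falls through into this group
                    count++;
                    break;
                default:       // the one-line mitigation: states that all
                    break;     // remaining constants are intentionally ignored
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Day[] days = { Day.SATURDAY, Day.MONDAY, Day.SUNDAY };
        System.out.println(weekendCount(days));
    }
}
```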

> Looking ahead to patterns in switch, we have several legacy considerations to
> navigate:

> - Fallthrough and bindings. While fallthrough is not inherently problematic
> (though the choice of fallthrough-by-default was unfortunate), if a case label
> introduces a pattern variable, then fallthrough to another case (at least one
> that doesn't introduce the same pattern variable with the same type) makes
> little sense, and such fallthrough has been outlawed.
> - Scoping. The block of a switch is one big scope, rather than each case label
> group being its own scope. (Again, one might call this a historical error,
> since there's little good that comes from this.) With case labels introducing
> variable declarations, this could have been a big problem, if one case polluted
> later cases (forcing users to pick unique names for each binding in a switch
> statement), but flow scoping solves that one.
> - Nulls. In Java 1.0, switching over reference types was not permitted, so we
> didn't have to worry about this. In Java 5, autoboxing and enums meant we could
> switch over some reference types, but for all of these, null was a "silly"
> value so we didn't care about NPEing on null. In Java 7, when we added string
> switch, we could have conceivably allowed `case null`, but instead chose to
> follow the precedent set by Java 5. But once we introduce switches over any
> type, with richer patterns, eagerly NPEing on null becomes much more
> problematic. We've navigated this by saying that switches can NPE on null if they
> have no nullable cases; nullable cases are those that explicitly say "null",
> and total patterns (which always come last since they dominate all others.) The
> old rule of "switches throw on null" becomes "switches throw on null, except
> when they say 'case null' or the bottom case is total." Default continues to
> mean what it always did -- "anything not already matched, except null."
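The scoping and null points above can be sketched as follows, assuming a JDK where pattern matching for switch is available (final in Java 21). The class and method names are hypothetical. The sketch uses an explicit `case null` rather than relying on a total pattern to accept null, so it does not depend on that part of the rule:

```java
// Hypothetical example: PatternSwitchSketch and describe are illustrative names.
class PatternSwitchSketch {
    static String describe(Object o) {
        return switch (o) {
            // Explicit nullable case: without it, a null selector throws NPE.
            case null -> "null";
            // The binding s is in scope only for this case...
            case String s -> "String of length " + s.length();
            // ...so the same name can be reused with a different type here.
            case Integer s -> "Integer " + s;
            // default: anything not already matched, except null.
            default -> "something else";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe("abc"));
        System.out.println(describe(42));
        System.out.println(describe(null));
        System.out.println(describe(1.5));
    }
}
```

Removing the `case null` line leaves the other cases unchanged but makes describe(null) throw NullPointerException, which is the legacy null-hostile behavior.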

> The new treatment of null actually would have fallen out of the decisions on
> totality, had we not gotten there already via another path. Our notion of
> totality accounts for "remainder", which includes things like novel subclasses
> of sealed types that did not exist at compile time, which it would not be
> reasonable to ask users to write code to deal with, and null fits into this
> treatment as well. We type check that a switch is sufficiently total, and then
> insert extra code to catch "silly" values that are not otherwise handled,
> including null, and throw. (This also enables DA analysis to truly trust switch
> totality.)

> Where we land is a single unified switch construct that can be either a
> statement or an expression; that can use either old-style flow (colon) or the
> more constrained flow style (arrow); whose case labels can be constants,
> patterns (including guarded patterns), or a mix of the two; which can accept
> the legacy null-hostility behavior, or can override it by explicitly using
> nullable case labels; and which is almost always type checked for totality
> (with some temporary, legacy exceptions.) Fallthrough is basically unchanged;
> you can get fallthrough when using the old-style flow, but it becomes less
> important as fallthrough is (mostly) nonsensical in the presence of pattern
> cases with bindings, and the compiler prevents this misuse. The distinction
> between "legacy" switches and pattern switches is temporary, with a path to
> getting to "all switches are total" over time.

> I think we've done a remarkable job at rehabilitating this monster.
I believe the only pending issue on that matter is the position of default inside the switch.
With a legacy switch, default can be in the middle; with a switch on types, default has to be the last case.

I think we should try to emit a warning if "default" is not in last position; both Eclipse and IntelliJ already have that warning.
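For illustration, a legacy switch with default in the middle looks like the sketch below (DefaultPositionSketch and name are hypothetical names). It compiles, and the position of default does not change which label is selected, which is why the layout is easy to misread:

```java
// Hypothetical example: DefaultPositionSketch and name are illustrative names.
class DefaultPositionSketch {
    static String name(int n) {
        switch (n) {
            case 1:
                return "one";
            default:            // legal here, even though a case follows
                return "other";
            case 2:             // still selected when n == 2
                return "two";
        }
    }

    public static void main(String[] args) {
        System.out.println(name(1));
        System.out.println(name(2));
        System.out.println(name(3));
    }
}
```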

Rémi 

> *Someone actually suggested using the syntax "new switch", on the basis that new
> was already a keyword. Would not have aged well.
If we add a "new" prefix to switch for each LTS release (e.g. new new switch for the 6 years after 2018), it would help future historians, because radiocarbon dating does not work well on source code.

