Switch expressions -- gathering the threads
Brian Goetz
brian.goetz at oracle.com
Mon Apr 9 19:14:47 UTC 2018
There's been some active discussion on "Is this the switch expression
construct we're looking for" over on amber-dev. It's a good time to take
stock of where we are, and to identify any loose ends.
## Approach
Our approach is driven not merely by the desire to have an expression
form of switch, but to make switch more generally useful as a multi-way
conditional construct. The biggest driver here of course is making it
work well with pattern matching. Pattern matching is a driver for better
handling of nulls and primitives (though these are also useful on their
own); additionally, the more useful we make switch, the more obvious the
cumbersomeness of its statement-orientation becomes. Pattern matching
also pushes hard on the somewhat unfortunate scoping behavior; a
straightforward interpretation of existing scoping of locals in switch
would not be very good for pattern bindings.
At first, given all the constraints of existing switches, we thought it
unlikely that we'd be able to get away with teaching switch some new
tricks, and would have to create a new construct (say, "match"). Bit by
bit, though, we were able to chip away at the accidental complexity of
the { constants, patterns } x { statement, expression } space, to the
point where it seemed practical to unify the construct.
Having a single construct has pros and cons. On the one hand, entities
should not be multiplied without necessity; on the other, a
one-size-fits-all construct might exhibit schizoid behavior. And the
switch statement probably has more unusual (some would say
objectionable) behaviors than any other Java construct, putting us in
tension between compatibility and perceived complexity.
## Current proposal
The current proposal starts with existing statement switch, extending
`break` to support a value, and requiring that the value-ness of the
break match the value-ness of the switch (just as return must with
methods or lambdas). We also slightly adjust the rules regarding
nonlocal control flow _through_ a switch. Because expression
switches are expressions, they must be total. For expression switches
over enums and sealed types, we have the option to infer a throwing
default when all sealed members are provided.
We then offer a shorthand form for case labels in expression switches, that:
    case P -> e;
is shorthand for
    case P: break e;
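As a rough sketch of this equivalence, the two methods below compute the same result, once with arrow labels and once longhand with colon labels. One caveat: the `break e` spelling proposed here does not compile on released JDKs; the feature as it eventually shipped (standard from Java 14) spells the value-bearing exit `yield e`, so the sketch uses that. The class and method names are hypothetical, chosen only for illustration.

```java
public class ArrowShorthand {
    // Arrow form: each case produces its value directly.
    static int arrowForm(String s) {
        return switch (s) {
            case "Foo" -> 1;
            case "Bar" -> 2;
            default -> 0;
        };
    }

    // Colon form: the same switch written longhand. Released Java writes
    // `yield e` where the proposal in this message writes `break e`.
    static int colonForm(String s) {
        return switch (s) {
            case "Foo": yield 1;
            case "Bar": yield 2;
            default: yield 0;
        };
    }

    public static void main(String[] args) {
        // Both forms should agree for every input.
        System.out.println(arrowForm("Foo") + " " + colonForm("Foo"));
    }
}
```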
This leaves the following differences between expression switches and
statement switches:
- Expression switches are required to be exhaustive; statement
switches cannot be required to be exhaustive.
- Expression switches permit the `->` shorthand form.
- Expression switches may restrict fallthrough in some way, or may
not, TBD.
- You can `return` and `continue` out of a statement switch, but not
out of an expression switch (like lambdas.)
- You cannot `break` or `continue` _through_ an expression switch
(like lambdas and conditionals.)
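The exhaustiveness difference can be sketched as follows, assuming a Java version with switch expressions (14 or later). The `Color` enum and the method names are made up for the example. The expression switch covers every constant and so needs no explicit default, while the statement switch is free to handle only some of them.

```java
public class ExhaustiveEnumSwitch {
    enum Color { RED, GREEN, BLUE }

    // Expression switch: all three constants are covered, so no default is
    // written. The compiler supplies a throwing default, which matters only
    // if a constant is added to the enum without recompiling this code.
    static String describe(Color c) {
        return switch (c) {
            case RED -> "warm";
            case GREEN -> "natural";
            case BLUE -> "cool";
        };
    }

    // Statement switch: nothing forces every constant to be handled, so
    // GREEN and BLUE silently do nothing here.
    static String describeOnlyRed(Color c) {
        String result = "unhandled";
        switch (c) {
            case RED:
                result = "warm";
                break;
        }
        return result;
    }

    public static void main(String[] args) {
        for (Color c : Color.values()) {
            System.out.println(c + ": " + describe(c) + " / " + describeOnlyRed(c));
        }
    }
}
```

Removing one of the three arms from `describe` should turn into a compile-time error, whereas `describeOnlyRed` compiles either way.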
And leaves some open issues for discussion:
- We have some options as to whether to restrict fallthrough in
expression switches, and also whether to restrict fallthrough into
patterns.
- We have the option to try and give the `->` form some meaning in
statement switches.
## Commentary
The concerns raised so far mostly revolve around potential confusion.
Because the two forms are mostly alike, but have subtle differences, the
fear is this will lead to confusion. Various schemes have been suggested
to make them look more different, or to make them behave more differently,
to make it clearer where the lines are.
For example, the following have been cited:
- Saying `break expression` is ugly, or confusable for a labeled break;
- Concerns that fallthrough-by-default is an even worse default for
expression switches than for statements (and, if we restrict fallthrough
in switch expression, the gap between the forms grows);
- The asymmetry of the implicit throwing default in
apparently-exhaustive enum switches will be a sharp edge;
- That a user might not be able to tell, by looking at the middle of a
large switch, whether it's an expression or statement switch;
- The possibility people will write code with mixed label forms (colon
and arrow) seems to scare the heck out of people;
- The arrows might confuse people with similarity to lambdas.
My reaction to most of these is "meh". I think the arrow-form is going
to be so preferable that the risk of fallthrough will be low (because
there are few statements in the first place), and can be lowered further
with restrictions; similarly, I think unrestricted mixing of arrow and
colon forms will be quite rare (except for the case where there is one
catch-all case, often a default, which will take statement form, which
seems mostly harmless), and strongly discouraged. And that means that
the confusion between expression and statement will be nonexistent --
because the expression ones will have arrows and the statement ones will
not.
There are also a number of calls for "If X is rare, just disallow X",
where X could be a statement-plus-expression form in expression switches
or mixed label forms in one switch. The problem is that they are
usually not rare _enough_ that their lack would not cause a different
kind of backlash.
#### Some alternatives that have been suggested
**Separate keyword.** Having a separate keyword ("choose") for
expression switch seems like it should dispel all the "but people will
be confused" issues, but I'm not sure it actually will. Because the two
constructs will still be so similar, the differences will likely still
be surprises to people. It is also not a magic wand; we still have to
figure out how to deal with statement+expression compounds, and it doesn't
automatically rule out the "mixed colons and arrows" problem.
**Block expression**. For the "mixed colons and arrows" problem,
several have suggested some sort of ad-hoc, switch-specific block
expression, but from a language evolution perspective, I think this is a
cure that is worse than the disease. Having an ad-hoc form just for switch
is terrible, and adding a general block expression form to the language
is not where we want to go -- and doing it to avoid the perception of
rampant mixed colons-and-arrows would be killing a dust mite with a
napalm blast.
**No colons in expression switch.** Without a block expression, this is
a non-starter; there are way too many legitimate uses for compound
expressions in expression switches.
**No mixed colons and arrows**. This will be intensely irritating to
users; if you add one compound expression in a 50-way switch, you have
to change 49 others from the nice form to the nasty one.
## Open issues
The main issue we need to address is whether we want to restrict
fallthrough in expression switches (or in the extreme case, prohibit it
entirely.)
One argument why fallthrough might be desirable is that some existing
statement switches that make use of fallthrough (such as string or
packet parsers) could become expression switches; these frequently have
a "main result" they want to return (such as the index of the next
character), while at the same time recording some side state about the
context. Refactoring these to expression switches could be beneficial
just as it is for many other statement switches. On the other hand, it
would also be reasonable to say we should leave these cases in
statement-world where they are now.
A form of fallthrough that I think may be more common in expression
switches is when something wants to fall _into_ the default:
    int x = switch (y) {
        case "Foo" -> 1;
        case "Bar" -> 2;
        case null:
        default:
            // handle exceptional case here
    }
Because `default` is not a pattern, we can't say:
    case null, default:
here. (Well, we could make it one.) Though we could carve out an
exception for such "trivial" fallthrough.
I think a reasonable restriction that might preserve flexibility while
avoiding most accidental uses is to make it illegal to fall _into_ an
arrow-labeled case; if you want fallthrough, stay in colon-world. (It's
impossible to fall _out of_ an arrow case.) Given that most users would
rather live in arrow-world, this means that for practical purposes,
there's no fallthrough in expression switches at all, but advanced users
have a fallback that works just like the switch and fallthrough they've
always known.
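To make "stay in colon-world" concrete, here is a minimal sketch of an expression switch that uses colon labels and deliberately falls through from one case into the next. It assumes a Java version with switch expressions (14 or later), where the value-bearing exit is spelled `yield` rather than the `break e` used in this message. The names `score` and `bonus` are hypothetical.

```java
public class ColonFallthrough {
    // Colon labels inside an expression switch: the "Foo" arm sets some side
    // state and then falls through into the "Bar" arm, which yields the value.
    static int score(String y) {
        int bonus = 0;
        return switch (y) {
            case "Foo":
                bonus = 10;       // no yield here, so control falls through
            case "Bar":
                yield bonus + 2;  // reached directly for "Bar", or via "Foo"
            default:
                yield -1;
        };
    }

    public static void main(String[] args) {
        System.out.println(score("Foo") + " " + score("Bar") + " " + score("Baz"));
    }
}
```

Rewriting either arm with `->` would remove the fallthrough, which is the practical effect of the restriction described above.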
While it is not specific to expression vs statement switch, we should
also ask whether we want to restrict fallthrough into certain kinds of
pattern labels (i.e., those without binding variables), even in
statement switch. (I don't really see the point, though; I don't see a
path to getting rid of the breaks, which would be the real payoff.)
Further, because of the intersection rules about OR patterns, it's more
likely an accidental fallthrough from one pattern label to another would
result in a compile error anyway.
#### -> in statement switch
Finally, people have asked about whether we should consider allowing
`->` for statement switches too (perhaps on the theory that they're kind
of like void-valued expression switches.) I see the attraction here --
when the majority of actions are single-line, this would be a winner,
and you could drop the breaks. However, because the distribution of
statement count in switch arms is all over the map, this would
dramatically increase the prevalence of mixed colon-and-arrow
switches, and probably further expose people to the risk of accidental
fallthrough, as now break is needed sometimes and not others _in the
same statement switch_.
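For a sense of what the single-line case looks like, the sketch below writes the same statement switch twice, once with colon labels and explicit breaks and once with arrow labels. It relies on the language as it was later released (Java 14 and up), which permits arrow labels in statement switches but not colon and arrow labels mixed within one switch; the method names and the log-level scenario are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class ArrowStatementSwitch {
    // Colon form: every arm needs its own break to avoid falling through.
    static List<String> colonLog(int level) {
        List<String> out = new ArrayList<>();
        switch (level) {
            case 0:
                out.add("debug");
                break;
            case 1:
                out.add("info");
                break;
            default:
                out.add("other");
                break;
        }
        return out;
    }

    // Arrow form: single-statement arms, no break and no fallthrough.
    static List<String> arrowLog(int level) {
        List<String> out = new ArrayList<>();
        switch (level) {
            case 0 -> out.add("debug");
            case 1 -> out.add("info");
            default -> out.add("other");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(colonLog(1) + " " + arrowLog(1));
    }
}
```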
More information about the amber-spec-experts
mailing list