Switch expressions -- gathering the threads

Mon Apr 9 19:14:47 UTC 2018

There's been some active discussion on "Is this the switch expression 
construct we're looking for" over on amber-dev.  It's a good time to take 
stock of where we are, and identify any loose ends.

## Approach

Our approach is driven not merely by the desire to have an expression 
form of switch, but to make switch more generally useful as a multi-way 
conditional construct.  The biggest driver here of course is making it 
work well with pattern matching. Pattern matching is a driver for better 
handling of nulls and primitives (though these are also useful on their 
own); additionally, the more useful we make switch, the more obvious the 
cumbersomeness of its statement-orientation becomes.  Pattern matching 
also pushes hard on the somewhat unfortunate scoping behavior; a 
straightforward interpretation of existing scoping of locals in switch 
would not be very good for pattern bindings.

At first, given all the constraints of existing switches, we thought it 
unlikely that we'd be able to get away with teaching switch some new 
tricks, and would have to create a new construct (say, "match").  Bit by 
bit, though, we were able to chip away at the accidental complexity of 
the { constants, patterns } x { statement, expression } space, to the 
point where it seemed practical to unify the construct.

Having a single construct has pros and cons. On the one hand, entities 
should not be multiplied without necessity; on the other, a 
one-size-fits-all construct might exhibit schizoid behavior.  And the 
switch statement probably has more unusual (some would say 
objectionable) behaviors than any other Java construct, putting us in 
tension between compatibility and perceived complexity.

## Current proposal

The current proposal starts with existing statement switch, extending 
`break` to support a value, and requiring that the value-ness of the 
break match the value-ness of the switch (just as return must with 
methods or lambdas).  We also slightly adjust the rules regarding 
nonlocal control flow _through_ an expression switch.  Because expression 
switches are expressions, they must be total.  For expression switches 
over enums and sealed types, we have the option to infer a throwing 
default when all sealed members are provided.

We then offer a shorthand form for case labels in expression switches, that:

     case P -> e;

is shorthand for

     case P: break e;
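
As a rough illustration of that equivalence, here is a minimal sketch that assumes Java 14 or later, where the released language spells the value-bearing `break e` described here as `yield e`.  The class and method names are hypothetical; the two methods are meant to behave identically.

```java
// Sketch only: requires Java 14 or later. The released language spells the
// proposal's value-bearing "break e" as "yield e". The class and method
// names are hypothetical and exist only for illustration.
final class ArrowShorthand {

    // Arrow form: each arm is a single expression.
    static int arrowForm(String s) {
        return switch (s) {
            case "Foo" -> 1;
            case "Bar" -> 2;
            default -> 0;
        };
    }

    // Colon form: the same arms, with the result handed back explicitly.
    static int colonForm(String s) {
        return switch (s) {
            case "Foo": yield 1;
            case "Bar": yield 2;
            default: yield 0;
        };
    }

    public static void main(String[] args) {
        for (String s : new String[] {"Foo", "Bar", "Baz"}) {
            System.out.println(s + ": " + arrowForm(s) + " / " + colonForm(s));
        }
    }
}
```
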

This leaves the following differences between expression switches and 
statement switches:
  - Expression switches are required to be exhaustive; statement 
switches cannot be required to be exhaustive.
  - Expression switches permit the `->` shorthand form.
  - Expression switches may restrict fallthrough in some way, or may 
not, TBD.
  - You can `return` and `continue` out of a statement switch, but not 
out of an expression switch (like lambdas.)
  - You cannot `break` or `continue` _through_ an expression switch 
(like lambdas and conditionals.)

And leaves some open issues for discussion:

  - We have some options as to whether to restrict fallthrough in 
expression switches, and also whether to restrict fallthrough into 
patterns.
  - We have the option to try and give the `->` form some meaning in 
statement switches.
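
To make the exhaustiveness difference concrete, here is a small sketch, assuming Java 14 or later and a hypothetical `Suit` enum.  The expression switch names every constant and so needs no `default` (the compiler supplies a throwing one); the statement switch is free to cover only some of them.

```java
// Sketch only: requires Java 14 or later. Suit and both methods are
// hypothetical names chosen for illustration.
final class ExhaustiveEnumSwitch {

    enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }

    // Expression switch: must be total. All four constants are listed, so
    // no explicit default is needed; removing an arm is a compile error.
    static boolean isRed(Suit suit) {
        return switch (suit) {
            case CLUBS -> false;
            case DIAMONDS -> true;
            case HEARTS -> true;
            case SPADES -> false;
        };
    }

    // Statement switch: not required to be exhaustive. Unlisted constants
    // simply leave the initial value untouched.
    static String color(Suit suit) {
        String result = "black";
        switch (suit) {
            case DIAMONDS:
            case HEARTS:
                result = "red";
                break;
        }
        return result;
    }

    public static void main(String[] args) {
        for (Suit suit : Suit.values()) {
            System.out.println(suit + ": " + isRed(suit) + " " + color(suit));
        }
    }
}
```
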

## Commentary

The concerns raised so far mostly revolve around potential confusion.  
Because the two forms are mostly alike, but have subtle differences, the 
fear is this will lead to confusion. Various schemes have been suggested 
to make them look more different, or to make them behave more differently, 
to make it more clear where the lines are.

For example, the following have been cited:

  - Saying `break expression` is ugly, or confusable for a labeled break;
  - Concerns that fallthrough-by-default is an even worse default for 
expression switches than for statements (and, if we restrict fallthrough 
in switch expression, the gap between the forms grows);
  - The asymmetry of the implicit throwing default in 
apparently-exhaustive enum switches will be a sharp edge;
  - That a user might not be able to tell, by looking at the middle of a 
large switch, whether it's an expression or statement switch;
  - The possibility people will write code with mixed label forms (colon 
and arrow) seems to scare the heck out of people;
  - The arrows might confuse people with similarity to lambdas.

My reaction to most of these is "meh".  I think the arrow-form is going 
to be so preferable that the risk of fallthrough will be low (because 
there are few statements in the first place), and can be lowered further 
with restrictions; similarly, I think unrestricted mixing of arrow and 
colon forms will be quite rare (except for the case where there is one 
catch-all case, often a default, which will take statement form, which 
seems mostly harmless), and strongly discouraged.  And that means that 
the confusion between expression and statement will be nonexistent -- 
because the expression ones will have arrows and the statement ones will 
not.

There are also a number of calls for "If X is rare, just disallow X", 
where X could be a statement-plus-expression form in expression switches 
or mixed label forms in one switch.  The problem is that they are 
usually not rare _enough_ that their lack would not cause a different 
kind of backlash.

#### Some alternatives that have been suggested

**Separate keyword.**  Having a separate keyword ("choose") for 
expression switch seems like it should dispel all the "but people will 
be confused" issues, but I'm not sure it actually will. Because the two 
constructs will still be so similar, the differences will likely still 
be surprises to people. It is also not a magic wand; we still have to 
figure out how to deal with statement+expression compounds, and it doesn't 
automatically rule out the "mixed colons and arrows" problem.

**Block expression**.  For the "mixed colons and arrows" problem, 
several have suggested some sort of ad-hoc, switch-specific block 
expression, but from a language evolution perspective, I think this is a 
cure that is worse than the disease.  Having an ad-hoc form just for switch 
is terrible, and adding a general block expression form to the language 
is not where we want to go -- and doing it to avoid the perception of 
rampant mixed colons-and-arrows would be killing a dust mite with a 
napalm blast.

**No colons in expression switch.**  Without a block expression, this is 
a non-starter; there are way too many legitimate uses for compound 
expressions in expression switches.

**No mixed colons and arrows**.  This will be intensely irritating to 
users; if you add one compound expression in a 50-way switch, you have 
to change 49 others from the nice form to the nasty one.

## Open issues

The main issue we need to address is whether we want to restrict 
fallthrough in expression switches (or in the extreme case, prohibit it 
entirely.)

One argument why fallthrough might be desirable is that some existing 
statement switches that make use of fallthrough (such as string or 
packet parsers) could become expression switches; these frequently have 
a "main result" they want to return (such as the index of the next 
character), while at the same time recording some side state about the 
context.  Refactoring these to expression switches could be beneficial 
just as it is for many other statement switches.  On the other hand, it 
would also be reasonable to say we should leave these cases in 
statement-world where they are now.
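
For a sense of the kind of code in question, here is a minimal sketch of a statement switch that uses fallthrough while producing a "main result" (the index of the next character) and recording side state.  It runs on any Java version with switch; the `SignScanner` class and its fields are hypothetical, not taken from any real parser.

```java
// Sketch only: a hypothetical scanner showing deliberate fallthrough in a
// classic statement switch.
final class SignScanner {

    boolean negative;   // side state: a '-' was seen
    boolean sawSign;    // side state: any sign was seen

    // Main result: the index of the first character after an optional sign.
    int skipSign(String text, int pos) {
        int next = pos;
        if (pos >= text.length()) {
            return next;
        }
        switch (text.charAt(pos)) {
            case '-':
                negative = true;
                // deliberate fallthrough: a minus is also a sign
            case '+':
                sawSign = true;
                next = pos + 1;
                break;
            default:
                break;
        }
        return next;
    }

    public static void main(String[] args) {
        SignScanner scanner = new SignScanner();
        int next = scanner.skipSign("-12", 0);
        System.out.println(next + " " + scanner.negative + " " + scanner.sawSign);
    }
}
```
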

A form of fallthrough that I think may be more common in expression 
switches is when something wants to fall _into_ the default:

     int x = switch (y) {
         case "Foo" -> 1;
         case "Bar" -> 2;

         case null:
         default:
             // handle exceptional case here
     }

Because `default` is not a pattern, we can't say:

     case null, default:

here.  (Well, we could make it one.)  We could, though, carve out an 
exception for such "trivial" fallthrough.

I think a reasonable restriction that might preserve flexibility while 
avoiding most accidental uses is to make it illegal to fall _into_ an 
arrow-labeled case; if you want fallthrough, stay in colon-world.  (It's 
impossible to fall _out of_ an arrow case.) Given that most users would 
rather live in arrow-world, this means that for practical purposes, 
there's no fallthrough in expression switches at all, but advanced users 
have a fallback that works just like the switch and fallthrough they've 
always known.
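
A rough sketch of that colon-world fallback, assuming Java 14 or later (where the value-bearing `break` is spelled `yield`): a colon-form expression switch falls through exactly as a statement switch would.  The role names and the `privileges` method are made up for illustration.

```java
// Sketch only: requires Java 14 or later. The roles and the method are
// hypothetical; the point is that colon-form arms in an expression switch
// fall through in the familiar way.
final class ColonFallthrough {

    // Each role inherits the privileges of the roles listed after it.
    static int privileges(String role) {
        int count = 0;
        return switch (role) {
            case "admin":
                count++;        // falls through into "editor"
            case "editor":
                count++;        // falls through into "viewer"
            case "viewer":
                count++;
                yield count;    // leaves the switch with the accumulated value
            default:
                yield 0;
        };
    }

    public static void main(String[] args) {
        for (String role : new String[] {"admin", "editor", "viewer", "guest"}) {
            System.out.println(role + ": " + privileges(role));
        }
    }
}
```
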

While it is not specific to expression vs statement switch, we should 
also ask whether we want to restrict fallthrough into certain kinds of 
pattern labels (i.e., those without binding variables), even in 
statement switch.  (I don't really see the point, though; I don't see a 
path to getting rid of the breaks, which would be the real payoff.)  
Further, because of the intersection rules about OR patterns, it's more 
likely an accidental fallthrough from one pattern label to another would 
result in a compile error anyway.

#### -> in statement switch

Finally, people have asked about whether we should consider allowing 
`->` for statement switches too (perhaps on the theory that they're kind 
of like void-valued expression switches.)  I see the attraction here -- 
when the majority of actions are single-line, this would be a winner, 
and you could drop the breaks.  However, because the distribution of 
statement count in switch arms is all over the map, this would 
dramatically increase the prevalence of mixed colon-and-arrow 
switches, and probably further expose people to the risk of accidental 
fallthrough, as now break is needed sometimes and not others _in the 
same statement switch_.