Opting into totality

Mon Aug 24 20:12:26 UTC 2020

> On Aug 24, 2020, at 1:30 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
> 
> The previous mail, on optimistic totality, only applied to switches that were already total.  Currently that includes only expression switches; we want a not-too-invasive way to mark a statement switch as total as well.  . . .

I am going to argue here that, just as fear of letting nulls flow stemmed from an early design that conflated multiple design issues as a result of extrapolating from too few data points (enums and strings), we have been boxed into another corner because we conflated expression-ness and the need for totality.  In this essay I will first tease these two issues apart, and then suggest how we might go forward using what we have learned from discussions of the last few weeks.

Going back to the dawn of time, a switch statement does not have to be total.  Why is this possible?  Because there is an obvious default behavior: do nothing.  If we were to view it in terms of delivering a value of some type, we would say that type is “void”.

Then why did we not allow a switch expression to be _exactly_ analogous?  In fact, we could have, by relying on existing precedent in the language: if no switch label matches and there is no default, or if execution of the statements of the switch block completes normally, we could simply decree that a switch expression has the default behavior “do nothing” and delivers a _default value_—exactly as we do for initialization of fields and array components.  So for

	enum Color { RED, GREEN, BLUE }
	Color x = …
	int n = switch (x) { case RED -> 1; case GREEN -> 2; };

then if x is BLUE, n will get the value 0.
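
The snippet above is hypothetical: a current compiler rejects a switch expression that is not exhaustive. As a rough illustration of the behavior being described, here is a minimal sketch that gets the same result from an ordinary statement switch. The class and method names (DefaultValueSwitch, score) are assumptions made up for this example.

```java
public class DefaultValueSwitch {
    enum Color { RED, GREEN, BLUE }

    // Emulates the hypothetical non-total switch expression: start from the
    // default value for int (0), as for a field or array component, and let
    // the statement switch "do nothing" when no label matches.
    static int score(Color x) {
        int n = 0;
        switch (x) {
            case RED:   n = 1; break;
            case GREEN: n = 2; break;
            // No BLUE case and no default: control falls out with n still 0.
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(score(Color.RED));   // prints 1
        System.out.println(score(Color.BLUE));  // prints 0
    }
}
```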

But I am guessing that we worried about programming errors and demanded totality for switch expressions, so we enforced it by fiat because we had no other mechanism to request totality.

So, standing where we are today, first imagine that we relax the totality requirement of switch expressions and allow them to produce default values (zero or null) in exactly the same situation that a statement switch would “do nothing”.

Next, let us introduce pattern matching in switch labels, as we have discussed at length.

Then we introduce two mechanisms that we have discussed more recently, and say that each of these mechanisms may be used in either a switch statement or a switch expression.

The first is a switch label of the form “default <pattern>”, which behaves just like a switch label “case <pattern>” except that it is a static error if the <pattern> is not total on the type of the selector expression.  This mechanism is good for extensible type hierarchies, where we expect to call out a number of special cases and then have a catch-all case, and we want the compiler to confirm to us on every compilation that the catch-all case actually does catch everything.
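
To make the “default <pattern>” idea concrete, here is a small sketch. The proposed form appears only in a comment, since it is not valid Java, and its exact spelling is my reading of the description above. The executable part is an instanceof chain that behaves the same way at run time. It cannot reproduce the compile-time check that the catch-all is total. The names DefaultPatternSketch and describe are assumptions for illustration.

```java
public class DefaultPatternSketch {
    static String describe(Object o) {
        // Proposed form (hypothetical syntax, not valid Java):
        //
        //   return switch (o) {
        //       case String s    -> "string of length " + s.length();
        //       case Integer i   -> "integer " + i;
        //       default Object x -> "something else";  // static error unless
        //                                              // Object x is total on
        //                                              // the selector type
        //   };
        //
        // Run-time emulation: special cases first, then a catch-all branch.
        if (o instanceof String) {
            return "string of length " + ((String) o).length();
        } else if (o instanceof Integer) {
            return "integer " + o;
        } else {
            // Plays the role of the total pattern; nothing checks totality here.
            return "something else";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("abc"));
        System.out.println(describe(42));
        System.out.println(describe(3.5));
    }
}
```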

The second is the possibility of writing “switch case” rather than “switch”, which introduces these extra constraints on the switch block: It is a static error if any SwitchLabel of the switch statement begins with “default”.  It is a static error if the set of case patterns is not at least optimistically total on the type of the selector expression.  It is a static error if the last BlockStatement in _any_ SwitchBlockStatementGroup, or the Block in any SwitchRule, can complete normally.  It is a static error if any SwitchLabel of the switch statement is not part of a SwitchBlockStatementGroup or SwitchRule.  In addition, the compiler automatically inserts SwitchBlockStatementGroups or SwitchRules to cover the residue, so as to throw an appropriate error at run time if the value produced by the selector expression belongs to the residue.  This mechanism is good for enums and sealed types, that is, situations where we expect to enumerate all the special cases explicitly and want to be notified by the compiler (or failing that, at run time) if we have failed to do so.
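
A sketch of what a “switch case” over an enum might amount to after the compiler has inserted its residue handling. The proposed source form is shown only in a comment (it is not valid Java), and the hand-written default group stands in for the compiler-inserted code. The choice of IncompatibleClassChangeError for the residue is an assumption borrowed from the existing rule for switch expressions over enums; the text above says only “an appropriate error”. The names SwitchCaseSketch and code are likewise made up for this example.

```java
public class SwitchCaseSketch {
    enum Color { RED, GREEN, BLUE }

    static int code(Color x) {
        // Proposed form (hypothetical syntax, not valid Java):
        //
        //   switch case (x) {
        //       case RED   -> { return 1; }
        //       case GREEN -> { return 2; }
        //       case BLUE  -> { return 3; }
        //   }
        //
        // Every group ends in a return, so none can complete normally.
        switch (x) {
            case RED:   return 1;
            case GREEN: return 2;
            case BLUE:  return 3;
            default:
                // Stand-in for the compiler-inserted residue handler: reached
                // only if Color gained a constant after this class was compiled.
                throw new IncompatibleClassChangeError("unexpected Color: " + x);
        }
    }

    public static void main(String[] args) {
        System.out.println(code(Color.BLUE));  // prints 3
    }
}
```

Deleting the BLUE group from the proposed form would be a static error under the rules above, because the remaining case patterns would no longer be optimistically total on Color. The emulation would instead compile and fail only at run time.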

In this way two distinct methods are provided for requesting totality checking (and note that they are mutually exclusive), and either may be used with either a switch statement or a switch expression.

At this stage, we have six possibilities, generated by an _orthogonal_ choice of (1) statement versus expression, and (2) use of “default <pattern>”, “switch case”, or neither.

But we are still justly worried that _one_ of these _six_ cases is error-prone: the possibility of switch expressions generating default values.  So we can rule that out again, but in a more principled way that still retains both orthogonality of choice and backward compatibility. We replace this line of the JLS:

	•  If the type of the selector expression is not an enum type, then there is exactly one default label associated with the switch block.

with this:

	•  If the type of the selector expression is not an enum type, then either the “switch case” form is used or there is exactly one default label associated with the switch block.

Furthermore, we retain the existing sentence in the description of the run-time evaluation of switch expressions that says “If no switch label matches, then an IncompatibleClassChangeError is thrown and the entire switch expression completes abruptly for that reason.”

In this way we have six orthogonally generated choices (instead of two non-orthogonally-generated possibilities), of which we then, for the sake of backward compatibility, allow the most dangerous one to be used only for enums, and add back the previously existing ICCE guardrail for that situation, so that switch expressions never generate default values after all.