Totality at switch statements

Sun Jun 19 13:31:20 UTC 2022

> Surely, the totality is necessary at switch expressions, but forcing 
> it at a statement is questionable.

I understand why you might feel this way; indeed, when we made this 
decision, I knew that we would get mails like this one.  There were a 
number of forces that pushed us in this direction; I will try to explain 
some of them, but I don't expect the explanation to be compelling to all 
readers.

One of the things that makes upgrading `switch` difficult is that 
developers have an existing mental model of what switch "is" or "is 
for", and attempted upgrades often challenge these models.  But the goal 
here is not simply to make switch "better" in a number of ways (better 
exhaustiveness checking, better null handling, patterns, smoother 
syntax).  Of course we like these improvements, but the goal is bigger.  
It is easier, though, to be aware of the role switch plays today than 
the role we expect it to play tomorrow, so these improvements might feel 
more like "forced improvements for the sake of some abstract ivory tower 
purity."

> It can be helpful for the programmers to check the totality in those 
> cases when the intent is that. But it is quite common to create a 
> switch statement that doesn't handle all of the possibilities.

This raises two questions:

  - Why is it so common?
  - Is it good that it is common?

One of the reasons it is common is that the switch statement we had is 
so weak!  The set of types you can switch over is limited.  The set of 
things you can put in case labels is limited.  The set of things you can 
do is limited (until recently, you were forced to do everything with 
side-effects.)  The switch statement we have in Java was copied almost 
literally from C, which was designed for writing things like lexers that 
look at characters and accumulate them conditionally into buffers.  
Partiality is common in these weak use cases, but in the use cases we 
want to support, partiality is more often a bug than a feature.  So 
saying "it is common" is really saying "this is what we've managed to do 
with the broken, limited switch statement we have today."  Great that 
we've been able to do something with it, but we shouldn't limit our 
imagination to what we've been able to achieve with such a limited tool.

To my other question, try this thought experiment: if switch had been 
total from day 1, requiring a `default: ;` case to terminate the switch 
(which really isn't that burdensome), when you were learning it back 
then, would you even have thought to complain?  Would you even have 
*noticed*?  If there were a budget for complaining about switch in that 
hypothetical world, I would think 99% of it would have been spent on 
fallthrough, rather than "forced default".

> Why would it be different using patterns? Why is it beneficial to 
> force totality? 

Because patterns don't exist in a vacuum.  There's a reason we did 
records, sealed classes, and patterns together: they work together.  
Records let us easily model aggregates, and sealed types let us easily 
model exhaustive choices (records + sealed classes = algebraic data 
types); record patterns make it easy to recover the aggregated state, 
and exhaustive switches make it easy to recover the original choice.  
We expect that the things people are going to switch over with patterns 
will often have been modeled with sealed classes.
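
As a minimal sketch of how these features combine (assuming Java 21 or 
later, where record patterns and pattern switch are final; `ShapeDemo`, 
`Shape`, `Circle`, and `Rectangle` are hypothetical names, not from the 
original mail):

```java
// Hypothetical example types; assumes Java 21 or later.
final class ShapeDemo {
    // The sealed interface models an exhaustive choice ...
    sealed interface Shape permits Circle, Rectangle {}

    // ... and each record models an aggregate.
    record Circle(double radius) implements Shape {}
    record Rectangle(double width, double height) implements Shape {}

    static double area(Shape shape) {
        // Record patterns recover the aggregated state; the switch recovers
        // the choice. No default is needed because Shape is sealed and both
        // permitted subtypes are covered.
        return switch (shape) {
            case Circle(double radius) -> Math.PI * radius * radius;
            case Rectangle(double width, double height) -> width * height;
        };
    }
}
```

If a third permitted subtype were added to `Shape`, this switch would 
stop compiling until a matching case (or a default) was written.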

Java has succeeded despite having gotten many of the defaults wrong.  
We've all had to learn "make things private unless they need to be 
public."  "Make fields final unless they need to be mutable." It would 
have been nice if the language gave us more of a nudge, but we had to 
learn the hard way.  Switch partiality is indeed another of those wrong 
defaults; you don't notice it until it is pointed out to you, but then 
when you think about it for enough time, you realize what a mistake it was.

In the current world, partiality is the default, and even if a switch is 
total, it may not be obvious (unless you explicitly say "default").  
Flipping this around, so that totality is the default, a switch with 
default becomes a sign that says "hey, I'm partial."  Partiality is an 
error-prone condition that is worth calling attention to, so flipping 
this default is valuable -- and we have an opportunity to do so without 
breaking compatibility.

The value of totality checking is under-appreciated, in part because 
until recently there were so few sources of exhaustiveness information 
in the language (basically, enums).  But many non-exhaustive switches 
are an error waiting to happen; the user thinks they have covered all 
the cases (either in fact, or in practice).  But something may happen 
elsewhere that undermines this assumption (e.g., a new enum constant or 
subtype was added.) With totality, we are made aware of this 
immediately, rather than having to debug the runtime case where a 
surprising value showed up.
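
The failure mode can be sketched with a legacy enum switch statement, 
which is not checked for exhaustiveness.  The names `LegacyEnumSwitch`, 
`Signal`, and `action` are hypothetical, and the scenario (a constant 
added after the switch was written) is an assumption for illustration:

```java
// Hypothetical example; compiles on any Java version with enums.
final class LegacyEnumSwitch {
    // Suppose AMBER was added after action() was written.
    enum Signal { RED, GREEN, AMBER }

    static String action(Signal signal) {
        String result = "unhandled";
        // Old-style statement switch over an enum: the compiler does not
        // require every constant to be covered, so AMBER silently skips
        // both cases.
        switch (signal) {
            case RED:
                result = "stop";
                break;
            case GREEN:
                result = "go";
                break;
        }
        return result;
    }
}
```

Here `action(Signal.AMBER)` returns the initial value "unhandled", and 
nothing at compile time points at the missing case.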

Worse, now the language has switch expressions, which *must* be total.  
Having one kind of switch be total and another not is cognitive load 
that users (and students) have to carry.  (Yes, there are still 
asymmetries in switch that have this effect; that's not a reason to load 
up with more.)  But it gets even worse, because refactoring from a 
switch expression to a switch statement means you lose some safety that 
you were depending on when you wrote the original code, and you may not 
be aware of it.
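
A small sketch of that refactoring, assuming Java 21 or later and the 
hypothetical types `Event`, `Login`, and `Logout`.  Because the 
statement form uses patterns, it is checked for exhaustiveness just as 
the expression form is, so the safety carries over:

```java
// Hypothetical example types; assumes Java 21 or later.
final class RefactorDemo {
    sealed interface Event permits Login, Logout {}
    record Login(String user) implements Event {}
    record Logout(String user) implements Event {}

    // Expression form: must be exhaustive.
    static String describeExpr(Event event) {
        return switch (event) {
            case Login login -> login.user() + " logged in";
            case Logout logout -> logout.user() + " logged out";
        };
    }

    // Statement form after refactoring: it uses type patterns, so the
    // compiler still requires every permitted subtype to be covered.
    static String describeStmt(Event event) {
        StringBuilder text = new StringBuilder();
        switch (event) {
            case Login login -> text.append(login.user()).append(" logged in");
            case Logout logout -> text.append(logout.user()).append(" logged out");
        }
        return text.toString();
    }
}
```

Deleting either case from `describeStmt` is a compile-time error rather 
than a silent no-op at run time.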

If you've not programmed with sealed types or equivalent, it is easy to 
underestimate how powerful this is.  I'd like us to be able to get to a 
world where we almost never use "default" in switch, unless we are 
deliberately opting into partiality -- in which case the "default" is a 
reminder to future maintainers that this is a deliberately partial switch.
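
One way such a deliberately partial switch might look, as a sketch under 
the same Java 21 assumption; `PartialDemo`, `Message`, `Ping`, `Data`, 
`Close`, and `countData` are hypothetical names:

```java
import java.util.List;

// Hypothetical example types; assumes Java 21 or later.
final class PartialDemo {
    sealed interface Message permits Ping, Data, Close {}
    record Ping() implements Message {}
    record Data(String payload) implements Message {}
    record Close() implements Message {}

    // Sums payload lengths and ignores every other kind of message.
    static int countData(List<Message> messages) {
        int total = 0;
        for (Message message : messages) {
            switch (message) {
                case Data data -> total += data.payload().length();
                // Deliberately partial: the default tells maintainers that
                // Ping, Close, and any future subtype are ignored on purpose.
                default -> { }
            }
        }
        return total;
    }
}
```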

> This check can be an IDE feature.

Yes, that was one of the choices.  And we considered that.  And it is 
reasonable that you wish we'd made another choice.  But be aware you 
are really arguing "make the language less safe and more error-prone, 
please" -- and ask yourself why you think it's a good idea to make the 
language less uniform and less safe.  I think you'll find that the 
reason is mostly "someone moved my cheese." 
(https://en.wikipedia.org/wiki/Who_Moved_My_Cheese%3F).

> Honestly I feel that the rule, when the totality is forced, is 
> dictated simply by the necessity of backward compatibility. What will 
> happen if a new type (for example double) will be allowed for the 
> selector expression? The backward compatibility wouldn't be an issue, 
> but it would be weird behaving differently with an int compared to a 
> double, so I guess the totality won't be forced. What would happen if 
> the finality requirement was removed, and the equality could be 
> checked for all types? What about the totality check in this imagined 
> future?

I don't really understand what you're getting at in this paragraph, but 
I'll just point out that it really underscores the value of a uniform 
switch construct.  You're saying "but I don't see how you'll get there 
from legacy int switches, so therefore it's inconsistent" (and 
implicitly, one unit of inconsistency is as bad as a million.) But you 
don't seem to be equally bothered by the much more impactful 
inconsistency we'd have if expression switches were total and statement 
switches were not.

You are correct that there are legacy considerations that will make it 
harder to get to a fully uniform construct (but there are still things 
we can do there.)  But that's not an excuse to not design towards the 
language we want to have, when we can do so at such minor inconvenience.

But the main thing I want you to consider is: right now, the switch we 
have is very, very limited, and so we've convinced ourselves it is "for" 
the few things we've been able to do with it.  By making it more 
powerful (and combining it with complementary features such as pattern 
matching, and sealing), these few cases -- which right now feel like the 
whole world of switch -- will eventually recede into being the quirky 
odd cases.

> Additionally, there are issues with the "empty" default clause. In the 
> JEP the "default: break;" was recommended, but interestingly it 
> doesn't work with the arrow syntax. ("default -> break;" is a compile 
> time error, only the "default: {break;}" is possible.) We can use both 
> the "default: {}" and "default -> {}", which is fine. But while the 
> "default:" is possible (without body), the "default ->" is an error. I 
> don't know what is the reason behind it. Allowing an empty body with 
> the arrow syntax would make the actual solution a little bit cleaner.

This is a fair observation; this should probably be cleaned up.
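
For reference, the forms mentioned in the quoted text can be sketched 
side by side.  This assumes Java 14 or later for the arrow syntax; 
`EmptyDefaultForms`, `colonForm`, and `arrowForm` are hypothetical 
names:

```java
// Hypothetical example; assumes Java 14 or later for arrow-form labels.
final class EmptyDefaultForms {
    // Colon form: "default: break;" is an accepted empty default.
    static String colonForm(int code) {
        StringBuilder out = new StringBuilder();
        switch (code) {
            case 1:
                out.append("one");
                break;
            default:
                break;
        }
        return out.toString();
    }

    // Arrow form: the right-hand side must be an expression, a block, or a
    // throw statement, so an empty block is the way to write "do nothing".
    static String arrowForm(int code) {
        StringBuilder out = new StringBuilder();
        switch (code) {
            case 1 -> out.append("one");
            default -> { }
            // default -> break;   // compile-time error, as the quoted mail notes
        }
        return out.toString();
    }
}
```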

> It would be possible to allow the programmer to mark the intended 
> totality. Maybe a new keyword would be too much for this purpose.

Yes, we considered this, and came to the conclusion that the problem is 
the wrong default.  Adding a new keyword for "total switch" is bad in 
three ways: it is, as you say, "too much"; it doesn't fix the underlying 
problem; and in the cases where it is most needed, people will probably 
forget to use it.
