Totality at switch statements
Brian Goetz
brian.goetz at oracle.com
Sun Jun 19 13:31:20 UTC 2022
> Surely, the totality is necessary at switch expressions, but forcing
> it at a statement is questionable.
I understand why you might feel this way; indeed, when we made this
decision, I knew that we would get mails like this one. There were a
number of forces that pushed us in this direction; I will try to explain
some of them, but I don't expect the explanation to be compelling to all
readers.
One of the things that makes upgrading `switch` difficult is that
developers have an existing mental model of what switch "is" or "is
for", and attempted upgrades often challenge these models. But the goal
here is not simply to make switch "better" in a number of ways (better
exhaustiveness checking, better null handling, patterns, smoother
syntax). Of course we like these improvements, but the goal is bigger.
But it's easier to be aware of the role switch plays today than the
role we expect it to play tomorrow, so these improvements might feel
more like "forced improvements for the sake of some abstract ivory tower
purity."
> It can be helpful for the programmers to check the totality in those
> cases when the intent is that. But it is quite common to create a
> switch statement that doesn't handle all of the possibilities.
This raises two questions:
- Why is it so common?
- Is it good that it is common?
One of the reasons it is common is that the switch statement we had is
so weak! The set of types you can switch over is limited. The set of
things you can put in case labels is limited. The set of things you can
do is limited (until recently, you were forced to do everything with
side-effects.) The switch statement we have in Java was copied almost
literally from C, which was designed for writing things like lexers that
look at characters and accumulate them conditionally into buffers.
Partiality is common in these weak use cases, but in the use cases we
want to support, partiality is more often a bug than a feature. So
saying "it is common" is really saying "this is what we've managed to do
with the broken, limited switch statement we have today." Great that
we've been able to do something with it, but we shouldn't limit our
imagination to what we've been able to achieve with such a limited tool.
To my other question, try this thought experiment: if switch had been total
from day 1, requiring a `default: ;` case to terminate the switch (which
really isn't that burdensome), when you were learning it back then,
would you even have thought to complain? Would you even have
*noticed*? If there was a budget for complaining about switch in that
hypothetical world, I would think 99% of it would have been spent on
fallthrough, rather than "forced default".
> Why would it be different using patterns? Why is it beneficial to
> force totality?
Because patterns don't exist in a vacuum. There's a reason we did
records, sealed classes, and patterns together; because they work
together. Records let us easily model aggregates, and sealed types let
us easily model exhaustive choices (records + sealed classes = algebraic
data types); record patterns make it easy to recover the aggregated
state, and exhaustive switches make it easy to recover the original
choice. We expect that the things people are going to switch over with
patterns will often have been modeled with sealed classes.
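To make the "records + sealed classes" point concrete, here is a minimal sketch, assuming Java 21 or later (where record patterns in switch are final). The names `ShapeDemo`, `Shape`, `Circle`, `Rect`, and `area` are hypothetical and chosen only for illustration.

```java
// Illustrative sketch only; all type and method names here are hypothetical.
class ShapeDemo {
    // The sealed interface models an exhaustive choice between two alternatives.
    sealed interface Shape permits Circle, Rect {}

    // Records model the aggregated state carried by each alternative.
    record Circle(double radius) implements Shape {}
    record Rect(double width, double height) implements Shape {}

    // No default is needed: the compiler can see that Circle and Rect
    // are the only permitted subtypes, so the switch is exhaustive.
    static double area(Shape s) {
        return switch (s) {
            case Circle(double r) -> Math.PI * r * r;       // record pattern recovers the state
            case Rect(double w, double h) -> w * h;
        };
    }

    public static void main(String[] args) {
        System.out.println(area(new Rect(2.0, 3.0)));
        System.out.println(area(new Circle(1.0)));
    }
}
```

If a third record were later added to the `permits` clause, `area` would stop compiling until a matching case was written.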
Java has succeeded despite having gotten many of the defaults wrong.
We've all had to learn "make things private unless they need to be
public." "Make fields final unless they need to be mutable." It would
have been nice if the language gave us more of a nudge, but we had to
learn the hard way. Switch partiality is indeed another of those wrong
defaults; you don't notice it until it is pointed out to you, but then
when you think about it for enough time, you realize what a mistake it was.
In the current world, partiality is the default, and even if a switch is
total, it may not be obvious (unless you explicitly say "default");
flipping this around, a switch with default is a sign that says "hey,
I'm partial." Partiality is an error-prone condition that is worth
calling attention to, so flipping this default is valuable -- and we
have an opportunity to do so without breaking compatibility.
The value of totality checking is under-appreciated, in part because
until recently there were so few sources of exhaustiveness information
in the language (basically, enums). But many non-exhaustive switches
are an error waiting to happen; the user thinks they have covered all
the cases (either in fact, or in practicality). But something may
happen elsewhere that undermines this assumption (e.g., a new enum
constant or subtype was added.) With totality, we are made aware of this
immediately, rather than having to debug the runtime case where a
surprising value showed up.
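A small sketch of the "new enum constant" scenario, assuming Java 14 or later. The enum `Color` and both methods are hypothetical; imagine `BLUE` was added some time after the switches were first written.

```java
// Illustrative sketch only; Color and the describe* methods are hypothetical.
class EnumDriftDemo {
    enum Color { RED, GREEN, BLUE } // imagine BLUE was added later

    // Old-style statement switch over an enum: it is partial, and it
    // compiles without complaint even though BLUE is not handled.
    static String describeLegacy(Color c) {
        String result = "unhandled";
        switch (c) {
            case RED:
                result = "red";
                break;
            case GREEN:
                result = "green";
                break;
        }
        return result;
    }

    // Switch expression with no default: the compiler requires every
    // constant to be covered, so deleting the BLUE case is a compile-time error.
    static String describeTotal(Color c) {
        return switch (c) {
            case RED -> "red";
            case GREEN -> "green";
            case BLUE -> "blue";
        };
    }

    public static void main(String[] args) {
        System.out.println(describeLegacy(Color.BLUE));
        System.out.println(describeTotal(Color.BLUE));
    }
}
```

In the first method the surprising value is only discovered at run time; in the second, the compiler reports the missing case as soon as the constant is added.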
Worse, now the language has switch expressions, which *must* be total.
Having one kind of switch be total and another not is cognitive load
that users (and students) have to carry. (Yes, there are still
asymmetries in switch that have this effect; that's not a reason to load
up with more.) But it gets even worse, because refactoring from a
switch expression to a switch statement means you lose some safety that
you were depending on when you wrote the original code, and you may not be
aware of this.
If you've not programmed with sealed types or equivalent, it is easy to
underestimate how powerful this is. I'd like us to be able to get to a
world where we almost never use "default" in switch, unless we are
deliberately opting into partiality -- in which case the "default" is a
reminder to future maintainers that this is a deliberately partial switch.
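As a sketch of "default as a sign of deliberate partiality", assuming Java 21 or later: a pattern switch statement over a sealed type must be exhaustive, so the empty `default` below is what opts in to handling only some of the cases. The names `PartialSwitchDemo`, `Event`, `Login`, `Logout`, `Heartbeat`, and `audit` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; all type and method names here are hypothetical.
class PartialSwitchDemo {
    sealed interface Event permits Login, Logout, Heartbeat {}
    record Login(String user) implements Event {}
    record Logout(String user) implements Event {}
    record Heartbeat() implements Event {}

    static List<String> audit(List<Event> events) {
        List<String> log = new ArrayList<>();
        for (Event e : events) {
            // Without the default, this statement would not compile,
            // because Heartbeat is not covered.
            switch (e) {
                case Login(String user) -> log.add("login:" + user);
                case Logout(String user) -> log.add("logout:" + user);
                default -> { } // deliberately partial: heartbeats are not audited
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(audit(List.of(new Login("a"), new Heartbeat(), new Logout("a"))));
    }
}
```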
> This check can be an IDE feature.
Yes, that was one of the choices. And we considered that. And it is
reasonable that you wish we'd made another choice. But, be aware you
are really arguing "make the language less safe and more error-prone,
please" -- and ask yourself why you think it's a good idea to make the
language less uniform and less safe. I think you'll find that the
reason is mostly "someone moved my cheese."
(https://en.wikipedia.org/wiki/Who_Moved_My_Cheese%3F).
> Honestly I feel that the rule, when the totality is forced, is
> dictated simply by the necessity of backward compatibility. What will
> happen if a new type (for example double) will be allowed for the
> selector expression? The backward compatibility wouldn't be an issue,
> but it would be weird behaving differently with an int compared to a
> double, so I guess the totality won't be forced. What would happen if
> the finality requirement was removed, and the equality could be
> checked for all types? What about the totality check in this imagined
> future?
I don't really understand what you're getting at in this paragraph, but
I'll just point out that it really underscores the value of a uniform
switch construct. You're saying "but I don't see how you'll get there
from legacy int switches, so therefore it's inconsistent" (and
implicitly, one unit of inconsistency is as bad as a million.) But you
don't seem to be equally bothered by the much more impactful
inconsistency we'd have if expression switches were total and statement
switches were not.
You are correct that there are legacy considerations that will make it
harder to get to a fully uniform construct (but there are still things we
can do there). But that's not an excuse to not design towards the
language we want to have, when we can do so at such minor inconvenience.
But the main thing I want you to consider is: right now, the switch we
have is very, very limited, and so we've convinced ourselves it is "for"
the few things we've been able to do with it. By making it more
powerful (and combining it with complementary features such as pattern
matching, and sealing), these few cases -- which right now feel like the
whole world of switch -- will eventually recede into being the quirky
odd cases.
> Additionally, there are issues with the "empty" default clause. In the
> JEP the "default: break;" was recommended, but interestingly it
> doesn't work with the arrow syntax. ("default -> break;" is a compile
> time error, only the "default: {break;}" is possible.) We can use both
> the "default: {}" and "default -> {}", which is fine. But while the
> "default:" is possible (without body), the "default ->" is an error. I
> don't know what is the reason behind it. Allowing an empty body with
> the arrow syntax would make the actual solution a little bit cleaner.
This is a fair observation; this should probably be cleaned up.
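For reference, a sketch of the empty-default forms that do compile, in both the colon and arrow styles, assuming Java 14 or later. The class and method names are hypothetical.

```java
// Illustrative sketch only; class and method names are hypothetical.
class EmptyDefaultDemo {
    // Colon form: `default: break;` is accepted.
    static int countVowelsColon(String s) {
        int n = 0;
        for (char ch : s.toCharArray()) {
            switch (ch) {
                case 'a', 'e', 'i', 'o', 'u':
                    n++;
                    break;
                default:
                    break;
            }
        }
        return n;
    }

    // Arrow form: the right-hand side must be an expression, a block, or a
    // throw, so `default -> break;` and a bare `default ->` do not compile;
    // an empty block does.
    static int countVowelsArrow(String s) {
        int n = 0;
        for (char ch : s.toCharArray()) {
            switch (ch) {
                case 'a', 'e', 'i', 'o', 'u' -> n++;
                default -> { }
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countVowelsColon("hello"));
        System.out.println(countVowelsArrow("hello"));
    }
}
```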
> It would be possible to allow the programmer to mark the intended
> totality. Maybe a new keyword would be too much for this purpose.
Yes, we considered this, and came to the conclusion that the problem is
the wrong default. Adding a new keyword for "total switch" is bad in
three ways: it is, as you say, "too much"; it doesn't fix the underlying
problem; and in the cases where it most needs to be used, people will
probably forget to use it.