Totality at switch statements

Remi Forax forax at univ-mlv.fr
Sun Jun 19 12:32:07 UTC 2022


> From: "Hunor Szegi" <hunor.szegi at gmail.com>
> To: amber-dev at openjdk.org
> Sent: Sunday, June 19, 2022 12:42:12 AM
> Subject: Totality at switch statements
Hi Hunor, 
thanks for your feedback. 

Let's start to understand why we want pattern matching, until now, if you want to describe an algorithm on a recursive data structure, you put the data and the parts of the algorithm in the same place ; you declare an interface with an abstract method corresponding to the algorithm and you implement the interface with different classes/records. 
In that case, the code of the algorithm is as important as the data, they are intermingled. 

Now, we want to introduce pattern matching, the idea of the pattern matching is to separate the data from the algorithm, the data are still defined using an interface and records, but the algorithm is not defined as an abstract method of the interface but as a method outside the data definition. You may ask why doing that, it's because the definition of the data is more important than the algorithm itself. Unlike with an abstract method which tied the data and the algorithm, here the data structure more important than the code itself. 

We have decided to express the pattern matching using switch, this means retrofitting the switch this new purpose. 
We have done that in several moves, first fix the semantics inherited from the C by adding the arrow syntax, then extends the semantics to allow switch expression and add patterns on types (type pattern and record patterns). All these moves as to be backward compatible so switch statement on int, String and enum must be still work as before. 

Totality is essential for pattern matching, it's the way to detect discrepancies between the definition of the data and the code of the pattern matching itself. Unlike with an abstract method, with pattern matching the data and the algorithm are defined in two different places so the data can be updated without the switch being updated. We want the compiler to fail in that case. 
So all the new switch constructs (switch on null, switch with guards, switch on patterns, switch expression) fails if the switch is not total, detecting when the data and the switch are not aligned anymore. 

This means that using "default" is now a blunder because while it makes the switch total it disable the compiler to warn the user when the data change. 
In the details, it's not fully true because we can switch on an infinity of values (or on a range big enough that it's like infinity in practice) so we can not enumerate all possibles combinations. 
Switches on ints and strings are in this category, so for them, there is no other solution than having a default. In the future a warning may be a good idea to ask for a default at the end of the switch. 
For the switch on enum, a user can enumerate all the cases but because of backward compatibility, we need a backward compatible way to opt-in to totality, my preferred way to do that is to add a "case null -> throw null", which transforms the old switch on an enum to a new switch on an enum which asks for totality. 

Now, to answer your questions, yes we want totality even in a switch statement to detect when the data change, default is kind of an anti-pattern so i don't see the need to have a special syntax for default -> {}, i prefer to keep the lambda syntax and the switch syntax aligned. Forcing totality on arrow syntax is something we have debated and we believe it's better to separate the two concerns: fix the semantics inherited from the C and totality, among other things it allows automatic conversion by tools between the colon syntax to the arrow syntax. Anyway, given that the arrow syntax is not a preview anymore, this option is not on the table anymore. About @total, you can use "case null -> throw null" instead. We may come with a true syntax later but not using an annotation because we have this rule that an annotation can not changed the semantics of the language. 

I hope this helps to understand why we wan totality, 
regards, 
Rémi 

> Hi All,

> I really like the progress of adding pattern matching to Java (and a lot of
> other features). I can not wait to use it in productive code (non-preview). But
> I would like to give you feedback about a tiny detail, about the totality of
> switch statements.

> Surely, the totality is necessary at switch expressions, but forcing it at a
> statement is questionable.

> It can be helpful for the programmers to check the totality in those cases when
> the intent is that. But it is quite common to create a switch statement that
> doesn't handle all of the possibilities. Why would it be different using
> patterns? Why is it beneficial to force totality? This check can be an IDE
> feature. (Or the programmer can try adding a default clause, and the duplicated
> totality will be shown as an error. But I know it isn't convenient.)

> Honestly I feel that the rule, when the totality is forced, is dictated simply
> by the necessity of backward compatibility. What will happen if a new type (for
> example double) will be allowed for the selector expression? The backward
> compatibility wouldn't be an issue, but it would be weird behaving differently
> with an int compared to a double, so I guess the totality won't be forced. What
> would happen if the finality requirement was removed, and the equality could be
> checked for all types? What about the totality check in this imagined future?

> Of course the best would be to allow the totality check somehow if the intent is
> that. A warning could be a solution, but that can be annoying. It will force
> the programmers to suppress it, or to add a default clause (what is the current
> solution). Adding a default clause could be treated as marking the intent: we
> don't want totality. But the actual syntax does not represent it clearly.

> Additionally, there are issues with the "empty" default clause. In the JEP the
> "default: break;" was recommended, but interestingly it doesn't work with the
> arrow syntax. ("default -> break;" is a compile time error, only the "default:
> {break;}" is possible.) We can use both the "default: {}" and "default -> {}",
> which is fine. But while the "default:" is possible (without body), the
> "default ->" is an error. I don't know what is the reason behind it. Allowing
> an empty body with the arrow syntax would make the actual solution a little bit
> cleaner.

> I think forcing the totality at statements is partly an intent for uniformity.
> The best would be if the same rules were applied to all cases, regardless if it
> is a statement or expression, old or arrow syntax. But backward compatibility
> can't allow full uniformity, it moves the separation into the middle of a box,
> instead of a clear distinction between a statement and expression. It creates
> more cases.

> It would be possible, forcing the totality only at the arrow syntax. But that
> would move some of the programmers using the old syntax, which may be less
> fortunate.

> It would be possible to allow the programmer to mark the intended totality.
> Maybe a new keyword would be too much for this purpose. Alternatively I can
> imagine allowing annotations at a switch statement/expression in the future.
> For example:

> @total
> switch (x) {
> ...
> }

> If this syntax isn't possible, the annotation could be added right before the
> opening curly bracket. Alternatively a warning suppress annotation could be
> used specifically for the switch.
> But this is only a brainstorming. My main point is that I don't think the
> totality should be forced at switch statements. Or it should cause only a
> warning.

> I hope my feedback isn't pointless. I'd be very interested to hear other
> people's opinions.

> Thank you in advance,

> Regards,
> Hunor
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/amber-dev/attachments/20220619/71f32334/attachment-0001.htm>


More information about the amber-dev mailing list