switch: using an expicit type as total is dangerous
forax at univ-mlv.fr
forax at univ-mlv.fr
Sun Aug 30 22:48:25 UTC 2020
----- Mail original -----
> De: "Brian Goetz" <brian.goetz at oracle.com>
> À: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Envoyé: Dimanche 30 Août 2020 17:57:15
> Objet: Re: switch: using an expicit type as total is dangerous
> One of the things that I think is being missed here is that switch is turning
> into a much more powerful construct, and as a result, our intuition for what it
> means needs to change with it. All of the sky-is-falling hyperbole about
> action-at-a-distance comes, in no small part, from having not fully upgraded
> our mental models to understand what switch means. Of course, we want to
> strike a balance between preserving suitable existing intuitions and extending
> he power of the construct, but let’s not fool ourselves that there isn’t
> something new to learn here. But given tradeoffs where we damage the
> expressiveness or consistency or clarity of the new construct to prop up old,
> obsolete intuitions, there’s really no decision here.
>
> Legacy switches have only constants as their cases. This means that at most one
> case could match, and in turn this means that order is independent. They
> implement the equivalent of a dynamically-computed (possibly partial) map.
>
> New switches are like constrained if-else chains. The constraints are useful to
> the compiler because they are more easily scrutable; compilers can optimize
> away redundant tests, and turn some O(n) switches into O(log n) or O(1). And
> they are useful to the user because they are asking a simpler question which
> can have a less error-prone expression. But at root, some things have changed,
> such as non-overlap of cases, and they can have far more interesting structure.
> And these structures are not single-level, they are recursive (we will see
> chains of D(P) … D(Q) just as we will see chains of P .. Q), and our model
> should embrace that natural structure.
>
> While it may feel unfamiliar today, we should expect to routinely recognize the
> shape of:
>
> case Foo(Bar x):
> case Foo(Baz x):
> case Foo f: // or case Foo(Object o):
> case Object o: // or default
>
> where the cases form a tree, because these shapes will be ubiquitous. And
> developers will develop intuitions about these kinds of shapes too, but right
> now, they look a little more foreign, and so we’re engaging in “bargaining” to
> try to make their behaviors less strange.
It does not feel strange or new to me
- an 'or' pattern on a sealed type is equivalent to a legacy switch where the order is independent
- an 'or' pattern on random types is equivalent to a series of catches where the order matters
How to interact with null may be a little strange but one you peek that a total case (or default) is like "else", it makes sense.
The less obvious part is just how to know if a pattern is total or not (apart from case Object where it's obvious).
>
> So, rather than trying to constrain the model to be more consistent with what
> we’re familiar with, we should be trying to get familiar with the shapes that
> will naturally arise (which is not hard, because we can look in code from other
> languages that have similar features) before we start to make claims about what
> is “natural” or “confusing.” Otherwise, we’re just starting the race with
> bricks tied to our feet.
Rémi
>
>
>> On Aug 30, 2020, at 11:07 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
>>
>> Sorry, but I don’t find this example that compelling either. And I certainly
>> don’t think it comes remotely close to “bad enough that we have to throw the
>> design in the trash.”
>>
>> As I said from the beginning, yes, you can construct puzzlers. But the
>> existence of a puzzler is not necessarily evidence that the language design is
>> broken, and you are lobbying (implicitly) for an extreme solution: to take the
>> entire design that we’ve worked on for years and toss it in the trash, and
>> replace it with something that is not even yet a proposal, I think the bar is
>> much higher than this. (I don’t think that’s an exaggeration; totality is what
>> makes the entire design stick together without being an ad-hoc bag of “but on
>> tuesday, do it differently”.)
>>
>> This example feels to me in the same category as combining var with diamond.
>> There’s nothing wrong with that, but by leaving so many things implicit in
>> your program, you may get an inference that is not what you expected. That
>> doesn’t mean that var is bad, or diamond is bad, or even that we should outlaw
>> their interaction (which some suggested at the time.) It just means that when
>> you combine features, especially features that involve implicitness, your
>> program becomes more brittle.
>>
>> This example combines a lot of implicitness to get the same kind of brittleness,
>> including switching on a complex expression whose type isn’t obvious, and, more
>> importantly, making an incompatible change to a method. I don’t think you get
>> to lay the blame on the language inferring totality here, unless you’re
>> advocating that we should never infer anything! Making a change like this
>> could easily change inferred types, which could silently affect overload
>> decisions, and, when we get to Valhalla, even runtime layouts. That’s just
>> part of the trade we make when we allow users to leave something unspecified,
>> whether it be a manifest type (var, diamond, generic method witnesses),
>> finality (lambda capture), totality (as here), accessibility (such as when
>> migrating a class to an interface), etc.
>>
>> So, it’s a good example, to call our attention to the consequences of leaving
>> totality implicit. (We’re having a separate discussion about whether to let
>> the user opt into making totality explicit, and that’s another tool that could
>> be used to make this example less brittle, just as manifest types would make it
>> less brittle than switching on an expression.)
>>
>> Really, though, I think you’re attacking totality because of .01% imperfections
>> without really appreciating how much worse the alternatives are, and how much
>> more often their pain points would come up. (Refactoring switches to
>> instanceof should be expected to happen 1000x more often than making an
>> incompatible change to a method signature and hoping nothing changes.) It’s
>> good to identify the warts, but I’d prefer a little less jumping from “wart,
>> ergo mistake” — it took us three years to converge on this answer precisely
>> because there are no perfect answers.
>>
>>
>>> On Aug 30, 2020, at 7:37 AM, forax at univ-mlv.fr wrote:
>>>
>>> ----- Mail original -----
>>>> De: "Brian Goetz" <brian.goetz at oracle.com>
>>>> À: "Remi Forax" <forax at univ-mlv.fr>, "amber-spec-experts"
>>>> <amber-spec-experts at openjdk.java.net>
>>>> Envoyé: Lundi 24 Août 2020 20:57:03
>>>> Objet: Re: switch: using an expicit type as total is dangerous
>>>
>>>>> 2/ using an explicit type for a total type is a footgun because the semantics
>>>>> will change if the hierarchy or the return type of a method switched upon
>>>>> change.
>>>>
>>>> Sorry, I think this argument is a pure red herring. I get why this is
>>>> one of those "scary the first time you see it" issues, but I think the
>>>> fear has been overblown to near-panic proportions. We've spent a lot of
>>>> time talking about it and, the more we talk, the less worried I am.
>>>
>>> good for you,
>>> the more i talk about it, the more i'm worried because you don't seem to
>>> understand that having the semantics that change underneath you is bad.
>>>
>>>>
>>>> The conditions that have to combine for this to happen are already
>>>> individually rare:
>>>> - a hierarchy change, combined with
>>>> - enough use-site type inference that is not obvious what the type
>>>> dependencies are, combined with
>>>> - null actually being a member of the domain, combined with
>>>> - users not realizing null is a member of the domain.
>>>
>>>
>>> nope, you don't need a hierarchy change, changing the return type (as noticed by
>>> Tagir) and null being part of the domain is enough.
>>>
>>>>
>>>> Then, for it to actually be a problem, not only do all of the above have
>>>> to happen, but an unhandled null has to actually show up.
>>>>
>>>> Even then, the severity of this case is low -- most likely, the NPE gets
>>>> moved from one place to another.
>>>
>>> nope see below
>>>
>>>>
>>>> Even then, the remediation cost is trivial.
>>>
>>> for having remediation, as a user you have to first see the change of semantics,
>>> but you don't.
>>>
>>>
>>> Ok, let's take an example, i've written a method getLiteral()
>>> Number getLiteral(String token) {
>>> if (token.equals("null")) {
>>> return null; // null is part of the domain
>>> }
>>> try {
>>> return Integer.parseInt(token);
>>> } catch(NumberFormatException e) {
>>> return Double.parseDouble(token);
>>> }
>>> }
>>>
>>> and a statement switch in another package/module
>>> switch(getLiteral(token)) {
>>> case Integer -> System.out.println("Integer");
>>> case Double -> System.out.println("Double");
>>> case Number -> System.out.println("null");
>>> }
>>>
>>> but now i change getLiteral() to add string literal
>>> Object getLiteral(String token) {
>>> if (token.equals("null")) {
>>> return null; // null is part of the domain
>>> }
>>> if (token.startsWith("\"") {
>>> return token.substring(1, token.length() - 1);
>>> }
>>> try {
>>> return Integer.parseInt(token);
>>> } catch(NumberFormatException e) {
>>> return Double.parseDouble(token);
>>> }
>>> }
>>>
>>> If i only recompile getLiteral(), and run the code containing the switch, i get
>>> a ICCE at runtime because the signature of getLiteral() has changed, which is
>>> good,
>>> but if i now recompile the switch, the code compiles without any error but with
>>> a different semantics, duh ?
>>>
>>> Using "case var _" as the last case at least keep the same semantics, using
>>> "default Number" does not compile.
>>>
>>> [...]
>>>
>>> Rémi
More information about the amber-spec-experts
mailing list