Final variable initialization problem with exhaustive switch

Tue Nov 30 16:01:34 UTC 2021

----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Dimitris Paltatzidis" <dcrystalmails at gmail.com>
> Cc: "John Rose" <john.r.rose at oracle.com>, "amber-dev" <amber-dev at openjdk.java.net>
> Sent: Mardi 30 Novembre 2021 16:39:05
> Subject: Re: Final variable initialization problem with exhaustive switch

> This is a sensible-seeming and well-intentioned argument, but it is
> unfortunately not correct.
> 
> The problem is one of separate compilation.  At compile time, there are only two
> enum constants, EVEN and ODD, but at run time, there may be more (someone could
> edit Parity to add INDETERMINATE, and not recompile the client code.)  If such
> a novel constant shows up at runtime, the program semantics are subverted.
> 
> Combine this with the fact that historically, switch statements are inherently
> partial; the switch you wrote currently *has a meaning*, which is if a novel
> value shows up, do neither of the actions.  This might be a “dumb” meaning, but
> it is the meaning existing switches have, and we can’t arbitrarily change it.
> 
> What you are arguing is that we made bad decisions in 1995 by allowing partial
> switches at all; while I might agree with that, in hindsight, it was “obvious”
> in 1995 that partiality was an inherent aspect of switch statements (because,
> C.)  When we added switch *expressions*, totality was a forced move (and a good
> one); when we extended switches to support patterns, we made the bold move
> (which will annoy some) to require that non-legacy switch STATEMENTS also be
> total.  But we can’t rewrite history.
> 
> We toyed with the idea of having a modifier on switch (e.g. “total-switch”)
> which would engage the totality checking, but the return here seemed weak; it
> would only work if people changed their code.   We could over a decade possibly
> nudge pattern switches towards totality, with a gradually increasing set of
> warnings which eventually became errors, and we might do that, but let’s not
> kid ourselves that this is a very long road.

It's a long road but any stations along that road is a better place than the one we are currently.

Dimitris send me the code privately before and ask me if it was a bug or not.
I answer that it was a bug, I was wrong. I'm pretty sure Dimitris and me are not the only ones to fall into that trap.

Adding a warning requiring a default might be a good step forward, even if we never do more than that.

> 
> We are largely in a no-win situation; we can either have newer constructs
> inherit the mistakes of the past, or live with some seam of “old” switches
> working slightly differently.  We can try to move that seam around, but its
> hard to have it all.  Your claim of “inconsistency” illustrates that fact, but
> if you had consistency, you wouldn’t like it, because we’d have to be
> consistent with the bad old behavior.
> 
> Of course, whether this is “too” safe is a matter of opinion.  It is a popular
> viewpoint to say “but its wrong, just go ahead and break other people’s code,
> its good for them.”  But people won’t thank you for educating them; not doing
> so (or doing so only very, very carefully) is how Java stays popular.  So while
> I might call it unfortunate, it’s not “too” safe; its just safe.

Fortunately, a warning does not break the code.

Rémi

> 
>> On Nov 30, 2021, at 7:49 AM, Dimitris Paltatzidis <dcrystalmails at gmail.com>
>> wrote:
>> 
>>> 
>>> seems like
>>> it should be a bug, right?  But it’s not.
>>> 
>> 
>> Agree, it is just playing it too safe.
>> 
>> We can translate the above switch statement into an if, without losing the
>> semantics:
>> 
>> if (p == Parity.ODD) {
>>    a = 1;
>> } else if (p == Parity.EVEN) {
>>    a = 0;
>> }
>> int b = a + 1; //Compile time error: variable a might not have been
>> initialized
>> 
>> The above if is exhaustive, yet no one expects last statement to compile,
>> and that is totally fine.
>> So, reducing the switch to the above if, makes it clear why the compiler
>> can't guarantee about the switch too.
>> But, maybe it should. You see, to let the above if compile is an overkill,
>> but with the switch, we have the notion
>> of its context, and especially with enums and sealed types, we can take it
>> a step further and make assumptions
>> about its exhaustiveness.
>> 
>> The switch expression guarantees initialization of final variables, based
>> on whether it compiles or not, which by
>> itself depends on its exhaustiveness.
>> The switch statement does not have to be exhaustive to compile and
>> therefore there isn't here that compilation
>> barrier that guarantees the initialization. But, that doesn't have to be
>> the case. Our switch statement in question
>> is *Effectively *exhaustive. So, the compiler doesn't need to prove that a
>> switch statement is exhaustive, it is
>> sufficient to just prove that it is effectively exhaustive. It is similar
>> to the effectively final variables we have with
>> lambdas.
>> 
>> Of Course, at the end of the day, it's not about exhaustiveness, it's about
>> having a branch that is guaranteed
>> to perform the initialization. That is what we are effectively trying to
>> prove.
>> 
>> Lately, the switch has been greatly enhanced, especially in the
>> exhaustiveness department. Getting the
>> switch statement on track in that aspect, seems more consistent.
>> 
>> Στις Τρί, 30 Νοε 2021 στις 9:20 π.μ., ο/η John Rose <john.r.rose at oracle.com>
>> έγραψε:
>> 
>>> On Nov 29, 2021, at 12:04 PM, Dimitris Paltatzidis <
>>> dcrystalmails at gmail.com> wrote:
>>> 
>>> 
>>> How can the compiler prove that the final variable will be initialized only
>>> in the second case and not in the first too?
>>> 
>>> 
>>> That differing treatment of definite assignment seems like
>>> it should be a bug, right?  But it’s not.  In fact, your switch
>>> expression can (under certain circumstances) execute
>>> correctly without executing either of its case branches.
>>> 
>>> Switch expressions are inherently exhaustive across their cases.
>>> Switch statements are not.  This is a key difference.
>>> 
>>> You can see this if you take your code and comment out
>>> one of the cases, say for Parity.EVEN.  The switch expression
>>> errors out at compile time, but the switch statement is a
>>> happy camper.
>>> 
>>> Or, if someone comes along and adds a third member to Parity
>>> (say, NEITHER) then your switch expression will either fail to
>>> compile statically or (if not recompiled) will fail to terminate
>>> normally at runtime (throwing something).
>>> 
>>> Meanwhile, the corresponding switch statement will happily
>>> recompile, and (whether recompiled or not) will fall through
>>> when Parity.NEITHER is presented.  In that case, the variable
>>> in question will not have been initialized.
>>> 
>>> This is a problem with legacy switch statements.  I’m not sure
>>> what the solution will be, although I know various alternatives
>>> have been discussed at some point.  The problem is that the
>>> language cannot read the user’s mind, when it finds that
>>> (a) an enum is the subject of a switch statement but (b) not
>>> all members of the enum are mentioned in the cases.  Is it
>>> an error in a switch that was intended to be exhaustive?
>>> Or is it just that the user (that lazy user!) didn’t want to
>>> mention cases which needed no-op actions?
>>> 
>>> — John