Final variable initialization problem with exhaustive switch

Brian Goetz brian.goetz at oracle.com
Tue Nov 30 15:39:05 UTC 2021


This is a sensible-seeming and well-intentioned argument, but it is unfortunately not correct.  

The problem is one of separate compilation.  At compile time, there are only two enum constants, EVEN and ODD, but at run time there may be more: someone could edit Parity to add INDETERMINATE and not recompile the client code.  If such a novel constant shows up at runtime, the program semantics are subverted.  

Combine this with the fact that historically, switch statements are inherently partial; the switch you wrote currently *has a meaning*, which is if a novel value shows up, do neither of the actions.  This might be a “dumb” meaning, but it is the meaning existing switches have, and we can’t arbitrarily change it.  
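
To make the "do neither of the actions" meaning concrete, here is a minimal single-file sketch. It is an illustration only: the class name PartialSwitchDemo, the method classify, and the -1 sentinel are hypothetical, and the third constant is declared up front to stand in for one that would, in the real scenario, be added later without recompiling the client.

```java
// Hypothetical illustration; not code from the thread.
class PartialSwitchDemo {
    // Parity as discussed above, with the "novel" constant already present
    // so that the effect can be observed without separate compilation.
    enum Parity { EVEN, ODD, INDETERMINATE }

    // Returns 0 for EVEN and 1 for ODD; any other constant leaves the sentinel untouched.
    static int classify(Parity p) {
        int a = -1; // sentinel standing in for "never assigned"
        switch (p) {
            case ODD:
                a = 1;
                break;
            case EVEN:
                a = 0;
                break;
            // No default: a legacy switch statement is partial, so an
            // unmatched constant simply runs neither action.
        }
        return a;
    }

    public static void main(String[] args) {
        for (Parity p : Parity.values()) {
            System.out.println(p + " -> " + classify(p));
        }
    }
}
```

For INDETERMINATE the switch completes normally and the sentinel survives, which is why the compiler cannot treat a variable assigned only inside the cases as definitely assigned afterwards.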

What you are arguing is that we made bad decisions in 1995 by allowing partial switches at all; while I might agree with that, in hindsight, it was “obvious” in 1995 that partiality was an inherent aspect of switch statements (because, C.)  When we added switch *expressions*, totality was a forced move (and a good one); when we extended switches to support patterns, we made the bold move (which will annoy some) to require that non-legacy switch STATEMENTS also be total.  But we can’t rewrite history.  
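
As a rough sketch of what totality buys for definite assignment (assuming Java 14 or later; the names TotalSwitchDemo, viaExpression, and viaStatement are hypothetical), the first method uses a switch expression, which must cover every constant, and the second shows one way to get the same guarantee from a switch statement by adding an explicit default that throws.

```java
// Hypothetical illustration; not code from the thread.
class TotalSwitchDemo {
    enum Parity { EVEN, ODD }

    // Switch expression: the compiler rejects it unless all constants are covered,
    // so the final variable is always initialized when the expression yields a value.
    static int viaExpression(Parity p) {
        final int a = switch (p) {
            case ODD -> 1;
            case EVEN -> 0;
        };
        return a + 1;
    }

    // Switch statement: an explicit default that throws closes the gap,
    // so every path that reaches the return has assigned a.
    static int viaStatement(Parity p) {
        final int a;
        switch (p) {
            case ODD -> a = 1;
            case EVEN -> a = 0;
            default -> throw new IllegalStateException("unexpected Parity: " + p);
        }
        return a + 1;
    }

    public static void main(String[] args) {
        for (Parity p : Parity.values()) {
            System.out.println(p + " -> " + viaExpression(p) + ", " + viaStatement(p));
        }
    }
}
```

The cost of the explicit default is that adding a constant to Parity no longer produces a compile-time error in viaStatement, whereas viaExpression would stop compiling until the new constant is handled.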

We toyed with the idea of having a modifier on switch (e.g. “total-switch”) which would engage the totality checking, but the return here seemed weak; it would only work if people changed their code.  We could, over a decade, possibly nudge pattern switches towards totality, with a gradually increasing set of warnings which eventually became errors, and we might do that, but let’s not kid ourselves: this is a very long road.  

We are largely in a no-win situation; we can either have newer constructs inherit the mistakes of the past, or live with some seam of “old” switches working slightly differently.  We can try to move that seam around, but it’s hard to have it all.  Your claim of “inconsistency” illustrates that fact, but if you had consistency, you wouldn’t like it, because we’d have to be consistent with the bad old behavior.  

Of course, whether this is “too” safe is a matter of opinion.  It is a popular viewpoint to say “but it’s wrong, just go ahead and break other people’s code, it’s good for them.”  But people won’t thank you for educating them; not doing so (or doing so only very, very carefully) is how Java stays popular.  So while I might call it unfortunate, it’s not “too” safe; it’s just safe.  

> On Nov 30, 2021, at 7:49 AM, Dimitris Paltatzidis <dcrystalmails at gmail.com> wrote:
> 
>> 
>> seems like
>> it should be a bug, right?  But it’s not.
>> 
> 
> Agree, it is just playing it too safe.
> 
> We can translate the above switch statement into an if, without losing the
> semantics:
> 
> if (p == Parity.ODD) {
>    a = 1;
> } else if (p == Parity.EVEN) {
>    a = 0;
> }
> int b = a + 1; // Compile-time error: variable a might not have been initialized
> 
> The above if is exhaustive, yet no one expects the last statement to compile,
> and that is totally fine.
> So, reducing the switch to the above if makes it clear why the compiler
> can't guarantee initialization for the switch either.
> But maybe it should. You see, letting the above if compile would be overkill,
> but with the switch, we have the notion
> of its context, and especially with enums and sealed types, we can take it
> a step further and make assumptions
> about its exhaustiveness.
> 
> The switch expression guarantees initialization of final variables, based
> on whether it compiles or not, which by
> itself depends on its exhaustiveness.
> The switch statement does not have to be exhaustive to compile, and
> therefore it lacks the compilation barrier
> that guarantees the initialization. But that doesn't have to be
> the case. Our switch statement in question
> is *effectively* exhaustive. So, the compiler doesn't need to prove that a
> switch statement is exhaustive, it is
> sufficient to just prove that it is effectively exhaustive. It is similar
> to the effectively final variables we have with
> lambdas.
> 
> Of course, at the end of the day, it's not about exhaustiveness, it's about
> having a branch that is guaranteed
> to perform the initialization. That is what we are effectively trying to
> prove.
> 
> Lately, the switch has been greatly enhanced, especially in the
> exhaustiveness department. Getting the
> switch statement on track in that aspect, seems more consistent.
> 
> On Tue, Nov 30, 2021 at 9:20 AM, John Rose <john.r.rose at oracle.com>
> wrote:
> 
>> On Nov 29, 2021, at 12:04 PM, Dimitris Paltatzidis <
>> dcrystalmails at gmail.com> wrote:
>> 
>> 
>> How can the compiler prove that the final variable will be initialized only
>> in the second case and not in the first too?
>> 
>> 
>> That differing treatment of definite assignment seems like
>> it should be a bug, right?  But it’s not.  In fact, your switch
>> expression can (under certain circumstances) execute
>> correctly without executing either of its case branches.
>> 
>> Switch expressions are inherently exhaustive across their cases.
>> Switch statements are not.  This is a key difference.
>> 
>> You can see this if you take your code and comment out
>> one of the cases, say for Parity.EVEN.  The switch expression
>> errors out at compile time, but the switch statement is a
>> happy camper.
>> 
>> Or, if someone comes along and adds a third member to Parity
>> (say, NEITHER) then your switch expression will either fail to
>> compile statically or (if not recompiled) will fail to terminate
>> normally at runtime (throwing something).
>> 
>> Meanwhile, the corresponding switch statement will happily
>> recompile, and (whether recompiled or not) will fall through
>> when Parity.NEITHER is presented.  In that case, the variable
>> in question will not have been initialized.
>> 
>> This is a problem with legacy switch statements.  I’m not sure
>> what the solution will be, although I know various alternatives
>> have been discussed at some point.  The problem is that the
>> language cannot read the user’s mind, when it finds that
>> (a) an enum is the subject of a switch statement but (b) not
>> all members of the enum are mentioned in the cases.  Is it
>> an error in a switch that was intended to be exhaustive?
>> Or is it just that the user (that lazy user!) didn’t want to
>> mention cases which needed no-op actions?
>> 
>> — John
>> 


