Patterns and nulls

John Rose john.r.rose at oracle.com
Fri Mar 16 01:59:19 UTC 2018


On Mar 15, 2018, at 5:01 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
> 
> For the exhaustiveness-lovers, they can manually enumerate all the cases (enums, sealed type members), and have a throwing default which will detect unexpected targets that would be impossible at compile time.  (At some point, maybe we'll help them by adding support for "default: unreachable", which would provide not only runtime detection but would enlist the compiler's flow analysis as well.)

I'm one of those exhaustiveness lovers, because I'm afraid of accidental fallout,
which is what happens in today's switches when a surprise value comes through.
I'm happy that expression switches will exclude fallout robustly.

I'm also content that I can insert a throw in my statement switch to robustly
exclude fallout, and happy that this throw is likely to be a simple notation,
probably "default: throw;".

Here are some more details about this:

This use case implies a peculiar thing about the "throwing default":
The compiler *may* be able to prove that the inserted throw is unreachable.
It must not complain about it, however, since I have a legitimate
need to mark my switch as protected against future accidents.
In fact, javac treats enum switches with such open-mindedness.

In the case of enums, the future accident would be a novel enum
value coming through, after a separate recompilation introduced it.
(Think of somebody adding Yule or Duodecember to enum Month.)
As an exhaustiveness lover, I use a "throwing default" to button
up my switch against such future novelties.

The use case also implies that the throw statement must not be
just any old thing (throw new AssertionError("oops")), but should
align with the language's internal story of how to deal with
these problems in expression switches.  The notation should
also be shorter than a normal throw expression, or programmers
will find it hard to write and read, and be tempted to leave it out
even when it would make the code more robust.

Also, it has to be a throw of some sort, because it cannot allow
execution to continue after it, lest the compiler complain about
fallout from an enclosing method body or a path by which a
blank local is left unassigned.  I.e., the notation needs a special
pass from the reachability police.
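
For contrast, here is roughly what the robust form costs in today's Java.
This is only a sketch: the enum Rgb, the method name, and the choice of
IllegalStateException with a hand-written message are placeholders of mine,
not anything the language prescribes.

```java
// Hypothetical enum, used only for this illustration.
enum Rgb { R, G, B }

class VerboseThrowDemo {
    static String name(Rgb c) {
        switch (c) {
            case R: return "red";
            case G: return "green";
            case B: return "blue";
            default:
                // Unreachable with today's Rgb; guards against a constant
                // added by a later, separate recompilation of the enum.
                throw new IllegalStateException("unexpected Rgb value: " + c);
        }
    }

    public static void main(String[] args) {
        System.out.println(name(Rgb.G)); // prints "green"
    }
}
```

Note that javac accepts the default without complaint, and that the method
needs no trailing return, because the switch cannot complete normally.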

This leads us to the following syntax, or one like it:

switch ((ColorChannel)c) {
case R: return red();
case G: return green();
case B: return blue();
default: throw;  //not reached, no fallout ever
}

This would be a refactoring of the corresponding
expression switch:

return switch ((ColorChannel)c) {
case R-> red();
case G-> green();
case B-> blue();
//optional, can omit:  default: throw;
};

There's a trick where Java programmers can lean on the definite
assignment (DA) rules to protect against fallout.  Here's a third refactoring
of those switches which uses this trick:

int result;
switch ((ColorChannel)c) {
case R: result = red(); break;
case G: result = green(); break;
case B: result = blue(); break;
default: throw;  //not reached, no fallout ever
}
return result;

(A very subtle bug can arise if result has an initializer,
say "result=null", or if it is an object field, and there
is no default.  Then fallout is probably unexpected, and
the user could be in trouble if a novel enum shows up in
the future.  Most coders want to avoid such traps if they
can, and the language should help.  That's another reason
for a concise "default: throw" notation.  The DA tricks
and the tracking of live code paths don't always diagnose
unexpected fallout.)
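
To make the trap concrete, here is a small sketch of the initializer case.
The enum Rgba and the numeric values are invented for the example; the
constant A stands in for a value added after the switch was written, since
a single source file cannot show a real separate recompilation.

```java
// Hypothetical enum; pretend A was added after the switch below was written.
enum Rgba { R, G, B, A }

class SilentFalloutDemo {
    static int intensity(Rgba c) {
        int result = -1; // the initializer defeats the definite-assignment check
        switch (c) {
            case R: result = 10; break;
            case G: result = 20; break;
            case B: result = 30; break;
            // no default: the novel constant falls out of the switch silently
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(intensity(Rgba.B)); // prints 30
        System.out.println(intensity(Rgba.A)); // prints -1, with no diagnostic
    }
}
```

Remove the initializer and the compiler rejects "return result", which is
the DA trick doing its job; keep it and the miscalculation goes unnoticed.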

The compiler can and should choose the same error path
for all these switches, one which will diagnose the problem
adequately, and lead to suitable remedies like recompilation.

The user should not be expected to compose an adequate
exception message for responding to this corner case.
(Analogy:  What if we required every division statement
to be accompanied by an exception message to use
when a zero divisor occurred?)

As a shortcut, programmers often use another trick, which
inserts "default:" before the case label of one of the enum
values.  This relieves the programmer of concocting an
exception expression, and is probably the easy way out
that we take most often, but it is not always a robust
answer.  If Yule shows up, he'll be treated just like
December, or whatever random month the programmer
stuck the default label on to placate the reachability
police.  It would be better if the throw were easy to
write and read; Yule would be welcomed in with the
appropriate diagnostic rather than a silent miscalculation.
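
A sketch of that shortcut, under the same caveat as before: WinterMonth is
a made-up enum, and YULE plays the part of a constant that arrived after
the switch was written.

```java
// Hypothetical enum; YULE stands in for a constant added later.
enum WinterMonth { NOVEMBER, DECEMBER, YULE }

class DefaultLabelDemo {
    static int days(WinterMonth m) {
        switch (m) {
            case NOVEMBER: return 30;
            default:       // stuck here only to placate the reachability police
            case DECEMBER: return 31;
        }
    }

    public static void main(String[] args) {
        System.out.println(days(WinterMonth.DECEMBER)); // prints 31
        System.out.println(days(WinterMonth.YULE));     // also prints 31
    }
}
```

The code compiles cleanly and YULE is quietly given December's answer.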

Of course many switches are not exhaustive, and users
*expect* fallout to happen.  Leaving out the "default" selects
this behavior (except for null, but then there's "case null: break").

If the switch looks exhaustive, there might be some doubt about
whether the programmer intended exhaustiveness.  In such situations
a compiler warning might be helpful, and an IDE intention would
certainly be helpful.  The switch can be disambiguated by adding
either "default: throw" or "default: break" (the latter confirming the
implicit fallout).

The postures towards exhaustiveness and nulls are independent
and can be dealt with in detail with other constructs.  Here are
the use cases:

default: break;  //NPE else fallout: legacy behavior
/*nothing*/   // same as "default: break" but did you really mean it?

default: throw;  //never fallout; this is built into expression switches

case Object: break;  //fallout includes null
case null: break;  //same, but are you expecting exhaustiveness?

case null: break; default: throw;  //fallout on null but otherwise exhaustive

Different styles of coding will use different formulations.
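
Only the legacy row of that list can be run today, so here is a sketch of
it alone. The enum Signal and the helper throwsNpe are invented for the
illustration. It shows that a null target throws before any label is
consulted, so even an explicit default never sees it.

```java
// Hypothetical enum, used only for this illustration.
enum Signal { RED, GREEN }

class NullSwitchDemo {
    static String describe(Signal s) {
        switch (s) { // throws NullPointerException here if s is null
            case RED:   return "stop";
            case GREEN: return "go";
            default:    return "other"; // never reached for null
        }
    }

    // Helper: reports whether describe() throws NPE for the given target.
    static boolean throwsNpe(Signal s) {
        try {
            describe(s);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsNpe(Signal.RED)); // prints false
        System.out.println(throwsNpe(null));       // prints true
    }
}
```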

Finally, I want to point out that there are two good reasons for thinking
hard about exhaustiveness at this point, rather than postponing
it for future work:

1. Expression switches *must* be exhaustive, so we need to define
all the checks, translation strategies, and runtime exceptions that
are entailed.  It's a reasonable consistency play to cross-apply
the relevant goodies to statement switches.

2. We are greatly complicating the sub-language of case labels,
and we expect to add new kinds of "exhaustible" types to switches.
In the past we expected users to reason about fallout behaviors
by inspecting the switch cases and making reasonable conclusions
about the need for a default.  That will become harder to do as case
labels become overlapping and more intertwined with the type
system.  It's a good time to define an easy-to-use "seat belt"
(or lead apron?) to protect against unexpected fallout.

— John

P.S. I found a discussion about exhaustive enum switches here:

https://stackoverflow.com/questions/5013194/why-is-default-required-for-a-switch-on-an-enum-in-this-code

It discusses the need to put a "default: throw" in such switches,
since the language currently does not observe that enum is an
exhaustible type.

JLS 14.11 says:  "A Java compiler is encouraged (but not
required) to provide a warning if a switch on an enum-valued
expression lacks a default label and lacks case labels for one
or more of the enum's constants. Such a switch will silently do
nothing if the expression evaluates to one of the missing constants."

That is one kind of linty warning that might help.  But a more
subtle one would be, "your switch looks exhaustive, but
you are not protecting against future novelty values."
Fixing that warning would prevent some really subtle
bugs, and that's a good job for "default: throw".

More information about the amber-spec-experts mailing list