Re: Remainder in pattern matching

Brian Goetz brian.goetz at oracle.com
Fri Apr 1 13:56:29 UTC 2022


> It seems pretty hard to land anywhere other than where you've landed, 
> for most of this. I have the same sort of question as Dan: do we 
> really want to wrap exceptions thrown by other patterns? You say we 
> want to discourage patterns from throwing at all, and that's a lovely 
> dream, but the behavior of total patterns is to throw when they meet 
> something in their remainder.

Not exactly.  The behavior of *switch* is to throw when it meets 
something in the remainder of *all its patterns*.  For example:

     Box<Box<String>> bbs = new Box<>(null);
     switch (bbs) {
         case Box(Box(String s)): ...
         case null, Box b: ...
     }

has no remainder and will not throw.  Box(null) doesn't match the first 
pattern, because when we unroll to what amounts to

     if (x instanceof Box alpha && alpha != null
             && alpha.value() instanceof Box beta && beta != null) {
         s = beta.value(); ...
     }
     else if (x == null || x instanceof Box) { ... }

we never dereference something we don't know to be non-null.  So 
Box(null) doesn't match the first case, but the second case gets a shot 
at it.  Only if no case matches does switch throw; *pattern matching* 
should never throw.  (Same story with let, except it's like a switch with 
one putatively-exhaustive case.)
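
To make the unrolling concrete, here is a minimal runnable sketch of the 
same null-safe test written by hand.  The Box record and the classify 
method are hypothetical stand-ins (the declaration of Box is not shown 
above), and explicit null checks take the place of the instanceof tests, 
since the static type is already known here:

```java
public class UnrolledMatch {
    // Hypothetical declaration; the thread does not show Box itself.
    record Box<T>(T value) { }

    // Hand-written equivalent of
    //     case Box(Box(String s)): ...
    //     case null, Box b:        ...
    static String classify(Box<Box<String>> x) {
        // Every dereference is guarded by a null check on its receiver.
        if (x != null && x.value() != null) {
            String s = x.value().value();   // s itself may still be null
            return "first case, s=" + s;
        } else {
            // x == null, or x is a Box that the first case rejected
            return "second case";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(new Box<>(new Box<>("hi"))));
        System.out.println(classify(new Box<>(null)));
        System.out.println(classify(null));
    }
}
```

Box(null) is simply rejected by the first test and handled by the second; 
nothing on that path can NPE.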

> Since user-defined patterns will surely involve primitive patterns at 
> some point, there is the possibility that one of those primitive 
> patterns throws, which bubbles up as an exception thrown by a 
> user-defined pattern.

Again, primitive patterns won't throw, they just won't match.  Under the 
rules I outlined last time, if I have:

     Box<Integer> b = new Box<>(null);
     switch (b) {
         case Box(int x): ...
         ...
     }

when we try to match Box(int x) to Box(null), it will not NPE, it will 
just not match, and we'll go on to the next case.  If no case matches, 
then the switch will throw MatchException (ME), which is a failure of 
*exhaustiveness*, not a failure in *pattern matching*.
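
The distinction can be observed directly.  The sketch below assumes Java 
21, where record patterns and java.lang.MatchException are available; it 
reuses the X / X1 / X2 / R declarations from the quoted message, with 
type patterns standing in for `any`.  The class name RemainderDemo and 
the helper outcome() are made up for illustration:

```java
public class RemainderDemo {
    sealed interface X permits X1, X2 { }
    record X1(String s) implements X { }
    record X2(String s) implements X { }
    record R(X x1, X x2) { }

    // Exhaustive over R, yet some values (the remainder) match no case.
    static String match(R r) {
        return switch (r) {
            case R(X1 a, X b)  -> "first";
            case R(X2 a, X1 b) -> "second";
            case R(X2 a, X2 b) -> "third";
        };
    }

    // Reports which case matched, or the simple name of the exception
    // the switch threw when no case matched.
    static String outcome(R r) {
        try {
            return match(r);
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome(new R(new X2("p"), new X1("q"))));
        System.out.println(outcome(new R(new X1("p"), null)));
        System.out.println(outcome(new R(null, new X1("q"))));
        System.out.println(outcome(null));
    }
}
```

R(X1, null) matches the first case because `X b` accepts any component 
value, including null.  R(null, X1) fails each pattern in turn without 
throwing; only once every case has been tried does the switch itself 
throw.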

Does this change your first statement?

>
> On Wed, Mar 30, 2022 at 7:40 AM Brian Goetz <brian.goetz at oracle.com> 
> wrote:
>
>     We should have wrapped this up a while ago, so I apologize for the
>     late notice, but we really have to wrap up exceptions thrown from
>     pattern contexts (today, switch) when an exhaustive context
>     encounters a remainder.  I think there's really one sane
>     choice, and the only thing to discuss is the spelling, but let's
>     go through it.
>
>     In the beginning, nulls were special in switch.  The first thing
>     switch did was evaluate its operand; if it was null, switch threw
>     NPE.  (I don't think this was motivated by any overt null
>     hostility, at least not at first; it came from unboxing, where we
>     said "if it's a box, unbox it", and the unboxing throws NPE, and
>     the same treatment was later added to enums (though that came out
>     in the same version) and strings.)
>
>     We have since refined switch so that some switches accept null. 
>     But for those that don't, I see no other move besides "if the
>     operand is null and there is no null handling case, throw NPE." 
>     Null will always be a special remainder value (when it appears in
>     the remainder.)
>
>     In Java 12, when we did switch expressions, we had to confront the
>     issue of novel enum constants.  We considered a number of
>     alternatives, and came up with throwing ICCE.  This was a
>     reasonable choice, though, as it turns out, not one that scales
>     as well as we had hoped it would at the time.  The choice here is
>     based on "the view of classfiles at compile time and run time has
>     shifted in an incompatible way."  ICCE is, as Kevin pointed out, a
>     reliable signal that your classpath is borked.
>
>     We now have two precedents from which to extrapolate, but as it
>     turns out, neither is really very good for the general remainder
>     case.
>
>     Recall that we have a definition of _exhaustiveness_, which is, at
>     some level, deliberately not exhaustive. We know that there are
>     edge cases for which it is counterproductive to insist that the
>     user explicitly cover, often for two reasons: one is that it's
>     annoying to the user (writing cases for things they believe should
>     never happen), and the other that it undermines type checking (the
>     most common way to do this is a default clause, which can sweep
>     other errors under the rug.)
>
>     If we have an exhaustive set of patterns on a type, the set of
>     possible values for that type that are not covered by some pattern
>     in the set is called the _remainder_.  Computing the remainder
>     exactly is hard, but computing an upper bound on the remainder is
>     pretty easy.  I'll say "x may be in the remainder of P* on T" to
>     indicate that we're defining the upper bound.
>
>      - If P* contains a deconstruction pattern P(Q*), null may be in
>     the remainder of P*.
>      - If T is sealed, instances of a novel subtype of T may be in the
>     remainder of P*.
>      - If T is an enum, novel enum constants of T may be in the
>     remainder of P*.
>      - If R(X x, Y y) is a record, and x is in the remainder of Q* on
>     X, then `R(x, any)` may be in the remainder of { R(q) : q in Q*} on R.
>
>     Examples:
>
>         sealed interface X permits X1, X2 { }
>         record X1(String s) implements X { }
>         record X2(String s) implements X { }
>
>         record R(X x1, X x2) { }
>
>         switch (r) {
>              case R(X1(String s), any):
>              case R(X2(String s), X1(String s)):
>              case R(X2(String s), X2(String s)):
>         }
>
>     This switch is exhaustive.  Let N be a novel subtype of X.  So the
>     remainder includes:
>
>         null, R(N, _), R(_, N), R(null, _), R(X2, null)
>
>     It might be tempting to argue (in fact, someone has) that we
>     should try to pick a "root cause" (null or novel) and throw that. 
>     But I think this is both excessive and unworkable.
>
>     Excessive: This means that the compiler would have to enumerate
>     the remainder set (it's a set of patterns, so this is doable) and
>     insert an extra synthetic clause for each.  This is a lot of code
>     footprint and complexity for a questionable benefit, and the sort
>     of place where bugs hide.
>
>     Unworkable: Ultimately such code will have to make an arbitrary
>     choice, because R(N, null) and R(null, N) are in the remainder
>     set.  So which is the root cause?  Null or novel?  Either answer
>     is arbitrary.
>
>
>     So what I propose is the following simple answer instead:
>
>      - If the switch target is null and no case handles null, throw
>     NPE.  (We know statically whether any case handles null, so this
>     is easy and similar to what we do today.)
>      - If the switch is an exhaustive enum switch, and no case handles
>     the target, throw ICCE.  (Again, we know statically whether the
>     switch is over an enum type.)
>      - In any other case of an exhaustive switch for which no case
>     handles the target, we throw a new exception type,
>     java.lang.MatchException, with an error message indicating remainder.
>
>     The first two rules are basically dictated by compatibility.  In
>     hindsight, we might have not chosen ICCE in 12, and gone with the
>     general (third) rule instead, but that's water under the bridge.
>
>     We need to wrap this up in the next few days, so if you've
>     concerns here, please get them on the record ASAP.
>
>
>     As a separate but not-separate exception problem, we have to deal
>     with at least two additional sources of exceptions:
>
>      - A dtor / record accessor may throw an arbitrary exception in the
>     course of evaluating whether a case matches.
>
>      - User code in the switch may throw an arbitrary exception.
>
>     For the latter, this has always been handled by having the switch
>     terminate abruptly with the same exception, and we should continue
>     to do this.
>
>     For the former, we surely do not want to swallow this exception
>     (such an exception indicates a bug).  The choices here are to
>     treat this the same way we do with user code, throwing it out of
>     the switch, or to wrap with MatchException.
>
>     I prefer the latter -- wrapping with MatchException -- because the
>     exception is thrown from synthetic code between the user code and
>     the ultimate thrower, which means the pattern matching feature is
>     mediating access to the thrower.  I think we should handle this as
>     "if a pattern invoked from pattern matching completes abruptly by
>     throwing X, pattern matching completes abruptly with
>     MatchException", because the specific X is not a detail we want
>     the user to bind to.  (We don't want them to bind to anything, but
>     if they do, we want them to bind to the logical action, not the
>     implementation details.)
>
>
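
On the accessor case raised at the end of the quoted message, a minimal 
sketch of the wrapping behavior, assuming Java 21 (where an accessor that 
completes abruptly during record pattern matching surfaces as a 
MatchException carrying the original exception as its cause).  The Flaky 
record and the describe() helper are hypothetical:

```java
public class AccessorThrows {
    // Hypothetical record whose accessor always fails.
    record Flaky(String name) {
        public String name() {
            throw new IllegalStateException("accessor failed");
        }
    }

    // Returns the matched case, or a description of what the switch threw.
    static String describe(Object o) {
        try {
            return switch (o) {
                case Flaky(String n) -> "flaky " + n;
                default -> "other";
            };
        } catch (RuntimeException e) {
            String cause = e.getCause() == null
                    ? "none"
                    : e.getCause().getClass().getSimpleName();
            return e.getClass().getSimpleName() + " caused by " + cause;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new Flaky("x")));
        System.out.println(describe("something else"));
    }
}
```

The caller sees the logical failure (the match could not be evaluated) 
and can still reach the specific exception through getCause().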