[External] : Re: Remainder in pattern matching

Thu Apr 7 21:40:20 UTC 2022

Yes, this clears up my concerns.

On Fri, Apr 1, 2022 at 6:56 AM Brian Goetz <brian.goetz at oracle.com> wrote:

>
> It seems pretty hard to land anywhere other than where you've landed, for
> most of this. I have the same sort of question as Dan: do we really want to
> wrap exceptions thrown by other patterns? You say we want to discourage
> patterns from throwing at all, and that's a lovely dream, but the behavior
> of total patterns is to throw when they meet something in their remainder.
>
>
> Not exactly.  The behavior of *switch* is to throw when they meet
> something in the remainder of *all their patterns*.  For example:
>
>     Box<Box<String>> bbs = new Box(null);
>     switch (bbs) {
>         case Box(Box(String s)): ...
>         case null, Box b: ...
>     }
>
> has no remainder and will not throw.  Box(null) doesn't match the first
> pattern, because when we unroll to what amounts to
>
>     if (x instanceof Box alpha && alpha != null
>             && alpha.value() instanceof Box beta && beta != null) {
>         s = beta.value(); ...
>     }
>     else if (x == null || x instanceof Box) { ... }
>
> we never dereference something we don't know to be non-null.  So Box(null)
> doesn't match the first case, but the second case gets a shot at it.  Only
> if no case matches does switch throw; *pattern matching* should never
> throw.  (Same story with let, except it's like a switch with one
> putatively-exhaustive case.)
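
A minimal runnable sketch of the Box(null) example above, assuming JDK 21 or later (where record patterns in switch are final). The names NestedNullDemo and describe are made up for illustration. `case null, Box b` is split into two labels here because the final syntax only allows null to be combined with default.

```java
final class NestedNullDemo {
    // Illustrative stand-in for the Box type used in the example above.
    record Box<T>(T value) {}

    static String describe(Box<Box<String>> bbs) {
        return switch (bbs) {
            // Does not match Box(null): the nested record pattern
            // Box(String s) never matches a null component.
            case Box(Box(String s)) -> "inner string: " + s;
            // Handles a null switch operand explicitly.
            case null -> "null box";
            // Gets a shot at everything the first case did not match,
            // including Box(null).
            case Box<Box<String>> b -> "outer box only";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new Box<>(null)));
        System.out.println(describe(null));
        System.out.println(describe(new Box<>(new Box<>("hi"))));
    }
}
```

No case throws here: the failed nested match simply falls through to the next case.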
>
> Since user-defined patterns will surely involve primitive patterns at some
> point, there is the possibility that one of those primitive patterns
> throws, which bubbles up as an exception thrown by a user-defined pattern.
>
>
> Again, primitive patterns won't throw, they just won't match.  Under the
> rules I outlined last time, if I have:
>
>     Box<Integer> b = new Box(null);
>     switch (b) {
>         case Box(int x): ...
>         ...
>     }
>
> when we try to match Box(int x) to Box(null), it will not NPE, it will
> just not match, and we'll go on to the next case.  If no case matches,
> then the switch will throw MatchException (ME), which is a failure of
> *exhaustiveness*, not a failure in *pattern matching*.
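
A hand-unrolled sketch of what matching Box(int x) against Box(null) amounts to under the rules described above. This is ordinary Java (records need JDK 16 or later), not the compiler's actual translation, and UnboxingDemo and describe are illustrative names only.

```java
final class UnboxingDemo {
    // Illustrative stand-in for the Box type used in the example above.
    record Box<T>(T value) {}

    // Approximates `case Box(int x)`: check for null before unboxing,
    // and report "no match" instead of throwing.
    static String describe(Box<Integer> b) {
        if (b != null) {
            Integer boxed = b.value();
            if (boxed != null) {
                int x = boxed; // safe: boxed is known to be non-null
                return "matched int " + x;
            }
        }
        // A real switch would go on to try the next case here.
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Box<>(42)));
        System.out.println(describe(new Box<>(null)));
    }
}
```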
>
> Does this change your first statement?
>
>
> On Wed, Mar 30, 2022 at 7:40 AM Brian Goetz <brian.goetz at oracle.com>
> wrote:
>
>> We should have wrapped this up a while ago, so I apologize for the late
>> notice, but we really have to wrap up exceptions thrown from pattern
>> contexts (today, switch) when an exhaustive context encounters a
>> remainder.  I think there's really only one sane choice, and the only thing
>> to discuss is the spelling, but let's go through it.
>>
>> In the beginning, nulls were special in switch.  The first thing is to
>> evaluate the switch operand; if it is null, switch threw NPE.  (I don't
>> think this was motivated by any overt null hostility, at least not at
>> first; it came from unboxing, where we said "if it's a box, unbox it", and
>> the unboxing throws NPE, and the same treatment was later added to enums
>> (though that came out in the same version) and strings.)
>>
>> We have since refined switch so that some switches accept null.  But for
>> those that don't, I see no other move besides "if the operand is null and
>> there is no null handling case, throw NPE."  Null will always be a special
>> remainder value (when it appears in the remainder.)
>>
>> In Java 12, when we did switch expressions, we had to confront the issue
>> of novel enum constants.  We considered a number of alternatives, and came
>> up with throwing ICCE.  This was a reasonable choice, though as it turns
>> out is not one that scales as well as we had hoped it would at the time.
>> The choice here is based on "the view of classfiles at compile time and run
>> time has shifted in an incompatible way."  ICCE is, as Kevin pointed out, a
>> reliable signal that your classpath is borked.
>>
>> We now have two precedents from which to extrapolate, but as it turns
>> out, neither is really very good for the general remainder case.
>>
>> Recall that we have a definition of _exhaustiveness_, which is, at some
>> level, deliberately not exhaustive.  We know that there are edge cases that
>> it is counterproductive to insist the user explicitly cover, often for two
>> reasons: one is that it's annoying to the user (writing cases for things
>> they believe should never happen), and the other is that it undermines
>> type checking (the most common way to do this is a default clause, which
>> can sweep other errors under the rug.)
>>
>> If we have an exhaustive set of patterns on a type, the set of possible
>> values for that type that are not covered by some pattern in the set is
>> called the _remainder_.  Computing the remainder exactly is hard, but
>> computing an upper bound on the remainder is pretty easy.  I'll say "x may
>> be in the remainder of P* on T" to indicate that we're defining the upper
>> bound.
>>
>>  - If P* contains a deconstruction pattern P(Q*), null may be in the
>> remainder of P*.
>>  - If T is sealed, instances of a novel subtype of T may be in the
>> remainder of P*.
>>  - If T is an enum, novel enum constants of T may be in the remainder of
>> P*.
>>  - If R(X x, Y y) is a record, and x is in the remainder of Q* on X, then
>> `R(x, any)` may be in the remainder of { R(q) : q in Q*} on R.
>>
>> Examples:
>>
>>     sealed interface X permits X1, X2 { }
>>     record X1(String s) implements X { }
>>     record X2(String s) implements X { }
>>
>>     record R(X x1, X x2) { }
>>
>>     switch (r) {
>>          case R(X1(String s), any):
>>          case R(X2(String s), X1(String t)):
>>          case R(X2(String s), X2(String t)):
>>     }
>>
>> This switch is exhaustive.  Let N be a novel subtype of X.  So the
>> remainder includes:
>>
>>     null, R(N, _), R(_, N), R(null, _), R(X2, null)
>>
>> It might be tempting to argue (in fact, someone has) that we should try
>> to pick a "root cause" (null or novel) and throw that.  But I think this is
>> both excessive and unworkable.
>>
>> Excessive: This means that the compiler would have to enumerate the
>> remainder set (it's a set of patterns, so this is doable) and insert an
>> extra synthetic clause for each.  This is a lot of code footprint and
>> complexity for a questionable benefit, and the sort of place where bugs
>> hide.
>>
>> Unworkable: Ultimately such code will have to make an arbitrary choice,
>> because R(N, null) and R(null, N) are both in the remainder set.  So which
>> is the root cause: null or novel?
>>
>>
>> So what I propose is the following simple answer instead:
>>
>>  - If the switch target is null and no case handles null, throw NPE.  (We
>> know statically whether any case handles null, so this is easy and similar
>> to what we do today.)
>>  - If the switch is an exhaustive enum switch, and no case handles the
>> target, throw ICCE.  (Again, we know statically whether the switch is over
>> an enum type.)
>>  - In any other case of an exhaustive switch for which no case handles
>> the target, we throw a new exception type, java.lang.MatchException, with
>> an error message indicating remainder.
>>
>> The first two rules are basically dictated by compatibility.  In
>> hindsight, we might have not chosen ICCE in 12, and gone with the general
>> (third) rule instead, but that's water under the bridge.
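
A runnable sketch of the first and third rules, using the R / X1 / X2 example from earlier in the quoted mail. It assumes JDK 21 or later, where java.lang.MatchException is a final API. The names RemainderDemo and classify are hypothetical, and the `any` pattern is written as the type pattern `X any`, since `any` is not Java syntax. The ICCE rule is not shown because it needs an enum recompiled separately from the switch.

```java
final class RemainderDemo {
    sealed interface X permits X1, X2 {}
    record X1(String s) implements X {}
    record X2(String s) implements X {}
    record R(X x1, X x2) {}

    // Returns which case matched, or the name of the exception the
    // exhaustive switch threw for a value in the remainder.
    static String classify(R r) {
        try {
            return switch (r) {
                case R(X1(String s), X any) -> "first";
                case R(X2(String s), X1(String t)) -> "second";
                case R(X2(String s), X2(String t)) -> "third";
            };
        } catch (NullPointerException | MatchException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(null));                       // null target, no case null
        System.out.println(classify(new R(null, new X1("a"))));   // R(null, _) is in the remainder
        System.out.println(classify(new R(new X2("a"), null)));   // R(X2, null) is in the remainder
        System.out.println(classify(new R(new X1("a"), null)));   // covered by the first case
    }
}
```

The compiler accepts the switch as exhaustive without a default, and the remainder only shows up at run time.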
>>
>> We need to wrap this up in the next few days, so if you've concerns here,
>> please get them on the record ASAP.
>>
>>
>> As a separate but not-separate exception problem, we have to deal with at
>> least two additional sources of exceptions:
>>
>>  - A dtor / record accessor may throw an arbitrary exception in the course
>> of evaluating whether a case matches.
>>
>>  - User code in the switch may throw an arbitrary exception.
>>
>> For the latter, this has always been handled by having the switch
>> terminate abruptly with the same exception, and we should continue to do
>> this.
>>
>> For the former, we surely do not want to swallow this exception (such an
>> exception indicates a bug).  The choices here are to treat this the same
>> way we do with user code, throwing it out of the switch, or to wrap with
>> MatchException.
>>
>> I prefer the latter -- wrapping with MatchException -- because the
>> exception is thrown from synthetic code between the user code and the
>> ultimate thrower, which means the pattern matching feature is mediating
>> access to the thrower.  I think we should handle this as "if a pattern
>> invoked from pattern matching completes abruptly by throwing X, pattern
>> matching completes abruptly with MatchException", because the specific X is
>> not a detail we want the user to bind to.  (We don't want them to bind to
>> anything, but if they do, we want them to bind to the logical action, not
>> the implementation details.)
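
A small sketch of the wrapping behavior proposed above, assuming JDK 21 or later, where (as far as I can tell) this is the behavior that shipped for record accessors. The record Flaky, the class AccessorThrowDemo and the probe method are made up for illustration.

```java
final class AccessorThrowDemo {
    // A deliberately broken record: its accessor always throws.
    record Flaky(String value) {
        @Override
        public String value() {
            throw new IllegalStateException("accessor failed");
        }
    }

    static String probe(Object o) {
        try {
            // Matching the record pattern invokes the accessor value().
            if (o instanceof Flaky(String v)) {
                return "matched " + v;
            }
            return "no match";
        } catch (MatchException e) {
            // The accessor's exception is available as the cause.
            return "MatchException caused by "
                    + e.getCause().getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(probe("not a record"));
        System.out.println(probe(new Flaky("a")));
    }
}
```

Code that catches the exception binds to MatchException (the logical action) rather than to whatever the accessor happened to throw.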
>>
>>
>>
>