Remainder in pattern matching

Fri Apr 1 00:54:08 UTC 2022

It seems pretty hard to land anywhere other than where you've landed, for
most of this. I have the same sort of question as Dan: do we really want to
wrap exceptions thrown by other patterns? You say we want to discourage
patterns from throwing at all, and that's a lovely dream, but the behavior
of total patterns is to throw when they meet something in their remainder.
Since user-defined patterns will surely involve primitive patterns at some
point, there is the possibility that one of those primitive patterns
throws, which bubbles up as an exception thrown by a user-defined pattern.

On Wed, Mar 30, 2022 at 7:40 AM Brian Goetz <brian.goetz at oracle.com> wrote:

> We should have wrapped this up a while ago, so I apologize for the late
> notice, but we really have to wrap up exceptions thrown from pattern
> contexts (today, switch) when an exhaustive context encounters a
> remainder.  I think there's really only one sane choice, and the only thing
> to discuss is the spelling, but let's go through it.
>
> In the beginning, nulls were special in switch.  The first thing is to
> evaluate the switch operand; if it is null, switch threw NPE.  (I don't
> think this was motivated by any overt null hostility, at least not at
> first; it came from unboxing, where we said "if it's a box, unbox it", and
> the unboxing throws NPE, and the same treatment was later added to enums
> (though that came out in the same version) and strings.)
>
> We have since refined switch so that some switches accept null.  But for
> those that don't, I see no other move besides "if the operand is null and
> there is no null handling case, throw NPE."  Null will always be a special
> remainder value (when it appears in the remainder).
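
To make the legacy behavior concrete, here is a minimal sketch of an
old-style enum switch with no null-handling case.  The class and method
names are mine, for illustration only; nothing here comes from the proposal.

```java
final class NullSwitchDemo {
    enum Color { RED, GREEN }

    // An old-style switch with no way to handle null: evaluating the
    // operand and dispatching on it throws NullPointerException for null.
    static String describe(Color c) {
        switch (c) {
            case RED:   return "red";
            case GREEN: return "green";
            default:    return "other";
        }
    }

    // Helper (hypothetical) that reports whether describe() threw NPE.
    static boolean throwsNpe(Color c) {
        try {
            describe(c);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Color.RED)); // prints "red"
        System.out.println(throwsNpe(null));     // prints "true"
    }
}
```

Note that the default clause does not help: the NPE is thrown before any
case label is consulted.
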
>
> In Java 12, when we did switch expressions, we had to confront the issue
> of novel enum constants.  We considered a number of alternatives, and came
> up with throwing ICCE.  This was a reasonable choice, though as it turns
> out, it is not one that scales as well as we had hoped it would at the time.
> The choice here is based on "the view of classfiles at compile time and run
> time has shifted in an incompatible way."  ICCE is, as Kevin pointed out, a
> reliable signal that your classpath is borked.
>
> We now have two precedents from which to extrapolate, but as it turns out,
> neither is really very good for the general remainder case.
>
> Recall that we have a definition of _exhaustiveness_, which is, at some
> level, deliberately not exhaustive.  We know that there are edge cases that
> it is counterproductive to insist the user explicitly cover, often for two
> reasons: one is that it's annoying to the user (writing cases
> for things they believe should never happen), and the other is that it
> undermines type checking (the most common way to cover them is a default
> clause, which can sweep other errors under the rug).
>
> If we have an exhaustive set of patterns on a type, the set of possible
> values for that type that are not covered by some pattern in the set is
> called the _remainder_.  Computing the remainder exactly is hard, but
> computing an upper bound on the remainder is pretty easy.  I'll say "x may
> be in the remainder of P* on T" to indicate that we're defining the upper
> bound.
>
>  - If P* contains a deconstruction pattern P(Q*), null may be in the
> remainder of P*.
>  - If T is sealed, instances of a novel subtype of T may be in the
> remainder of P*.
>  - If T is an enum, novel enum constants of T may be in the remainder of
> P*.
>  - If R(X x, Y y) is a record, and x is in the remainder of Q* on X, then
> `R(x, any)` may be in the remainder of { R(q) : q in Q*} on R.
>
> Examples:
>
>     sealed interface X permits X1, X2 { }
>     record X1(String s) implements X { }
>     record X2(String s) implements X { }
>
>     record R(X x1, X x2) { }
>
>     switch (r) {
>          case R(X1(String s), any):
>          case R(X2(String s), X1(String s)):
>          case R(X2(String s), X2(String s)):
>     }
>
> This switch is exhaustive.  Let N be a novel subtype of X.  So the
> remainder includes:
>
>     null, R(N, _), R(_, N), R(null, _), R(X2, null)
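
For anyone who wants to poke at this remainder set, here is a hand-written
emulation of the three cases using plain instanceof tests.  It is only a
sketch under assumptions: X is left unsealed so that a record N can stand
in for a "novel" subtype within one compilation unit, and match() is my
own helper, not what the compiler would generate.

```java
final class RemainderSketch {
    // Assumption: X is NOT sealed here, so N can play the novel subtype.
    interface X { }
    record X1(String s) implements X { }
    record X2(String s) implements X { }
    record N() implements X { }   // stand-in for a subtype unknown at compile time

    record R(X x1, X x2) { }

    // Returns which case of the example switch would match, or "remainder".
    static String match(R r) {
        if (r == null) return "remainder";                        // null
        X a = r.x1();
        X b = r.x2();
        if (a instanceof X1) return "case 1";                     // R(X1(String s), any)
        if (a instanceof X2 && b instanceof X1) return "case 2";  // R(X2(..), X1(..))
        if (a instanceof X2 && b instanceof X2) return "case 3";  // R(X2(..), X2(..))
        return "remainder";  // R(N, _), R(null, _), R(X2, null), R(X2, N)
    }

    public static void main(String[] args) {
        System.out.println(match(new R(new X2("a"), new X2("b")))); // prints "case 3"
        System.out.println(match(new R(new N(), new X1("a"))));     // prints "remainder"
        System.out.println(match(new R(new X2("a"), null)));        // prints "remainder"
    }
}
```

In this emulation R(X1, N) is picked up by the first case, which is
consistent with the list above being an upper bound on the remainder
rather than the exact set.
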
>
> It might be tempting to argue (in fact, someone has) that we should try to
> pick a "root cause" (null or novel) and throw that.  But I think this is
> both excessive and unworkable.
>
> Excessive: This means that the compiler would have to enumerate the
> remainder set (it's a set of patterns, so this is doable) and insert an
> extra synthetic clause for each.  This is a lot of code footprint and
> complexity for a questionable benefit, and the sort of place where bugs
> hide.
>
> Unworkable: Ultimately such code will have to make an arbitrary choice,
> because R(N, null) and R(null, N) are both in the remainder set.  So which
> is the root cause: null or novel?
>
>
> So what I propose is the following simple answer instead:
>
>  - If the switch target is null and no case handles null, throw NPE.  (We
> know statically whether any case handles null, so this is easy and similar
> to what we do today.)
>  - If the switch is an exhaustive enum switch, and no case handles the
> target, throw ICCE.  (Again, we know statically whether the switch is over
> an enum type.)
>  - In any other case of an exhaustive switch for which no case handles the
> target, we throw a new exception type, java.lang.MatchException, with an
> error message indicating remainder.
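
Read as a decision procedure, the three rules might look like the sketch
below.  Everything named here is an assumption for illustration:
failureFor() and its boolean parameters are hypothetical, and
SketchMatchException is a local stand-in for the proposed
java.lang.MatchException so the snippet does not depend on a JDK that
ships it.

```java
final class RemainderPolicySketch {
    // Local stand-in for the proposed java.lang.MatchException.
    static final class SketchMatchException extends RuntimeException {
        SketchMatchException(String message) {
            super(message);
        }
    }

    // Picks the failure for an exhaustive switch in which no case matched.
    // handlesNull and enumSwitch are both known statically, per the rules above.
    static Throwable failureFor(Object target, boolean handlesNull, boolean enumSwitch) {
        if (target == null && !handlesNull) {
            return new NullPointerException();           // rule 1
        }
        if (enumSwitch) {
            return new IncompatibleClassChangeError();   // rule 2
        }
        return new SketchMatchException("remainder: " + target); // rule 3
    }

    public static void main(String[] args) {
        System.out.println(failureFor(null, false, false).getClass().getSimpleName());
        System.out.println(failureFor(Thread.State.NEW, false, true).getClass().getSimpleName());
        System.out.println(failureFor("novel", false, false).getClass().getSimpleName());
    }
}
```

The point of the sketch is that the choice depends only on facts the
compiler already has, plus a single null check at run time; no
enumeration of the remainder set is needed.
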
>
> The first two rules are basically dictated by compatibility.  In
> hindsight, we might not have chosen ICCE in 12, and gone with the general
> (third) rule instead, but that's water under the bridge.
>
> We need to wrap this up in the next few days, so if you've concerns here,
> please get them on the record ASAP.
>
>
> As a separate but not-separate exception problem, we have to deal with at
> least two additional sources of exceptions:
>
>  - A deconstructor (dtor) / record accessor may throw an arbitrary exception
> in the course of evaluating whether a case matches.
>
>  - User code in the switch may throw an arbitrary exception.
>
> For the latter, this has always been handled by having the switch
> terminate abruptly with the same exception, and we should continue to do
> this.
>
> For the former, we surely do not want to swallow this exception (such an
> exception indicates a bug).  The choices here are to treat this the same
> way we do with user code, throwing it out of the switch, or to wrap with
> MatchException.
>
> I prefer the latter -- wrapping with MatchException -- because the
> exception is thrown from synthetic code between the user code and the
> ultimate thrower, which means the pattern matching feature is mediating
> access to the thrower.  I think we should handle this as "if a pattern
> invoked from pattern matching completes abruptly by throwing X, pattern
> matching completes abruptly with MatchException", because the specific X is
> not a detail we want the user to bind to.  (We don't want them to bind to
> anything, but if they do, we want them to bind to the logical action, not
> the implementation details.)
>
>
>
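
To illustrate the wrapping option under discussion, here is a minimal
hand-written sketch.  It is not the compiler's translation: Box, describe()
and SketchMatchException (a local stand-in for the proposed
java.lang.MatchException) are all hypothetical names, and the try/catch
plays the role of the synthetic code that sits between the user's switch
and the accessor.

```java
final class WrappingSketch {
    // Local stand-in for the proposed java.lang.MatchException.
    static final class SketchMatchException extends RuntimeException {
        SketchMatchException(Throwable cause) {
            super(cause);
        }
    }

    // A record whose accessor can throw, standing in for a buggy accessor/dtor.
    record Box(String value) {
        @Override
        public String value() {
            if (value == null) {
                throw new IllegalStateException("no value");
            }
            return value;
        }
    }

    static String describe(Object o) {
        if (o instanceof Box b) {
            String v;
            try {
                v = b.value();  // "matching" code: calls the accessor
            } catch (Throwable t) {
                // The accessor completed abruptly; surface it as a match failure.
                throw new SketchMatchException(t);
            }
            // Case body: exceptions thrown here would propagate unwrapped.
            return "Box(" + v + ")";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Box("a"))); // prints "Box(a)"
        try {
            describe(new Box(null));
        } catch (SketchMatchException e) {
            System.out.println(e.getCause().getClass().getSimpleName()); // prints "IllegalStateException"
        }
    }
}
```

The original exception stays reachable through getCause(), so wrapping
changes what callers can catch by type without discarding the diagnosis.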