Remainder in pattern matching
Alan Malloy
amalloy at google.com
Fri Apr 1 00:54:08 UTC 2022
It seems pretty hard to land anywhere other than where you've landed, for
most of this. I have the same sort of question as Dan: do we really want to
wrap exceptions thrown by other patterns? You say we want to discourage
patterns from throwing at all, and that's a lovely dream, but the behavior
of total patterns is to throw when they meet something in their remainder.
Since user-defined patterns will surely involve primitive patterns at some
point, there is the possibility that one of those primitive patterns
throws, which bubbles up as an exception thrown by a user-defined pattern.
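
To make the concern concrete, here is a minimal sketch of how wrapping could nest. It is plain Java, not pattern syntax, and every name in it (NestedWrapSketch, SketchMatchException, innerPattern, outerPattern) is hypothetical. It assumes a "wrap whatever a pattern throws" rule, and models a primitive pattern as a method that throws when it meets a remainder value.

```java
final class NestedWrapSketch {
    // Hypothetical stand-in for the proposed MatchException.
    static final class SketchMatchException extends RuntimeException {
        SketchMatchException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Models a primitive pattern that throws when it meets its remainder (here: null).
    static String innerPattern(Object target) {
        if (target == null) {
            throw new SketchMatchException("remainder: null", null);
        }
        return String.valueOf(target);
    }

    // Models a user-defined pattern built on the primitive one, invoked under
    // the assumed rule that anything thrown by a pattern gets wrapped.
    static String outerPattern(Object target) {
        try {
            return innerPattern(target);
        } catch (RuntimeException e) {
            throw new SketchMatchException("pattern completed abruptly", e);
        }
    }
}
```

Under these assumptions, outerPattern(null) fails with a SketchMatchException whose cause is another SketchMatchException, so the original remainder failure ends up one level down the cause chain.
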
On Wed, Mar 30, 2022 at 7:40 AM Brian Goetz <brian.goetz at oracle.com> wrote:
> We should have wrapped this up a while ago, so I apologize for the late
> notice, but we really have to wrap up exceptions thrown from pattern
> contexts (today, switch) when an exhaustive context encounters a
> remainder. I think there's really only one sane choice, and the only thing
> to discuss is the spelling, but let's go through it.
>
> In the beginning, nulls were special in switch. The first thing is to
> evaluate the switch operand; if it is null, switch threw NPE. (I don't
> think this was motivated by any overt null hostility, at least not at
> first; it came from unboxing, where we said "if it's a box, unbox it", and
> the unboxing throws NPE, and the same treatment was later added to enums
> (though that came out in the same version) and strings.)
>
> We have since refined switch so that some switches accept null. But for
> those that don't, I see no other move besides "if the operand is null and
> there is no null handling case, throw NPE." Null will always be a special
> remainder value (when it appears in the remainder).
>
> In Java 12, when we did switch expressions, we had to confront the issue
> of novel enum constants. We considered a number of alternatives, and came
> up with throwing ICCE. This was a reasonable choice, though as it turns
> out it is not one that scales as well as we had hoped it would at the time.
> The choice here is based on "the view of classfiles at compile time and run
> time has shifted in an incompatible way." ICCE is, as Kevin pointed out, a
> reliable signal that your classpath is borked.
>
> We now have two precedents from which to extrapolate, but as it turns out,
> neither is really very good for the general remainder case.
>
> Recall that we have a definition of _exhaustiveness_, which is, at some
> level, deliberately not exhaustive. There are edge cases that it is
> counterproductive to insist the user cover explicitly, usually for one of
> two reasons: one is that it's annoying to the user (writing cases for
> things they believe should never happen), and the other is that it
> undermines type checking (the most common way to cover them is a default
> clause, which can sweep other errors under the rug).
>
> If we have an exhaustive set of patterns on a type, the set of possible
> values for that type that are not covered by some pattern in the set is
> called the _remainder_. Computing the remainder exactly is hard, but
> computing an upper bound on the remainder is pretty easy. I'll say "x may
> be in the remainder of P* on T" to indicate that we're defining the upper
> bound.
>
> - If P* contains a deconstruction pattern P(Q*), null may be in the
> remainder of P*.
> - If T is sealed, instances of a novel subtype of T may be in the
> remainder of P*.
> - If T is an enum, novel enum constants of T may be in the remainder of
> P*.
> - If R(X x, Y y) is a record, and x is in the remainder of Q* on X, then
> `R(x, any)` may be in the remainder of { R(q) : q in Q*} on R.
>
> Examples:
>
>     sealed interface X permits X1, X2 { }
>     record X1(String s) implements X { }
>     record X2(String s) implements X { }
>
>     record R(X x1, X x2) { }
>
>     switch (r) {
>         case R(X1(String s), any):
>         case R(X2(String s), X1(String s)):
>         case R(X2(String s), X2(String s)):
>     }
>
> This switch is exhaustive. Let N be a novel subtype of X. So the
> remainder includes:
>
> null, R(N, _), R(_, N), R(null, _), R(X2, null)
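
A rough way to probe this remainder by hand, assuming Java 17 (sealed types and records). The class RemainderSketch and its methods are hypothetical, the `any` pattern is modeled as "always matches", and novel subtypes are left out because they cannot be produced within a single compilation.

```java
// Same shapes as the quoted example.
sealed interface X permits X1, X2 { }
record X1(String s) implements X { }
record X2(String s) implements X { }
record R(X x1, X x2) { }

final class RemainderSketch {
    // Hand-written equivalent of the three cases in the quoted switch.
    static boolean someCaseMatches(R r) {
        if (r == null) {
            return false;                 // a deconstruction pattern never matches null
        }
        if (r.x1() instanceof X1) {
            return true;                  // case R(X1(String s), any)
        }
        // case R(X2(..), X1(..)) or case R(X2(..), X2(..))
        return r.x1() instanceof X2
            && (r.x2() instanceof X1 || r.x2() instanceof X2);
    }

    // A value is in the remainder when no case accepts it.
    static boolean inRemainder(R r) {
        return !someCaseMatches(r);
    }
}
```

With this model, null, R(null, _) and R(X2, null) land in the remainder, while R(X1, null) does not, because the second component of the first case accepts anything.
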
>
> It might be tempting to argue (in fact, someone has) that we should try to
> pick a "root cause" (null or novel) and throw that. But I think this is
> both excessive and unworkable.
>
> Excessive: This means that the compiler would have to enumerate the
> remainder set (it's a set of patterns, so this is doable) and insert an
> extra synthetic clause for each. This is a lot of code footprint and
> complexity for a questionable benefit, and the sort of place where bugs
> hide.
>
> Unworkable: Ultimately such code will have to make an arbitrary choice,
> because R(N, null) and R(null, N) are both in the remainder set. So which
> is the root cause, null or novel?
>
>
> So what I propose is the following simple answer instead:
>
> - If the switch target is null and no case handles null, throw NPE. (We
> know statically whether any case handles null, so this is easy and similar
> to what we do today.)
> - If the switch is an exhaustive enum switch, and no case handles the
> target, throw ICCE. (Again, we know statically whether the switch is over
> an enum type.)
> - In any other case of an exhaustive switch for which no case handles the
> target, we throw a new exception type, java.lang.MatchException, with an
> error message indicating remainder.
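
The three rules above can be read as a small decision procedure. The following is only a sketch of that reading: RemainderRules, remainderFailure and SketchMatchException are hypothetical names, the stand-in exception is there so the code does not depend on a JDK that ships java.lang.MatchException, and the enumSwitch flag models what the compiler knows statically.

```java
final class RemainderRules {
    // Hypothetical stand-in for the proposed java.lang.MatchException.
    static final class SketchMatchException extends RuntimeException {
        SketchMatchException(String message) {
            super(message);
        }
    }

    /**
     * Picks the failure for a target that no case of an exhaustive switch
     * handles. Callers are assumed to reach this only after every case,
     * including any null-handling case, has declined the target.
     */
    static Throwable remainderFailure(Object target, boolean enumSwitch) {
        if (target == null) {
            return new NullPointerException();          // rule 1
        }
        if (enumSwitch) {
            return new IncompatibleClassChangeError();  // rule 2
        }
        return new SketchMatchException("remainder: " + target); // rule 3
    }
}
```

The order of the checks matters in this reading: a null target of an enum switch takes the first rule (NPE), not the second.
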
>
> The first two rules are basically dictated by compatibility. In
> hindsight, we might have not chosen ICCE in 12, and gone with the general
> (third) rule instead, but that's water under the bridge.
>
> We need to wrap this up in the next few days, so if you've concerns here,
> please get them on the record ASAP.
>
>
> As a separate but not-separate exception problem, we have to deal with at
> least two additional sources of exceptions:
>
> - A dtor / record accessor may throw an arbitrary exception in the course
> of evaluating whether a case matches.
>
> - User code in the switch may throw an arbitrary exception.
>
> For the latter, this has always been handled by having the switch
> terminate abruptly with the same exception, and we should continue to do
> this.
>
> For the former, we surely do not want to swallow this exception (such an
> exception indicates a bug). The choices here are to treat this the same
> way we do with user code, throwing it out of the switch, or to wrap with
> MatchException.
>
> I prefer the latter -- wrapping with MatchException -- because the
> exception is thrown from synthetic code between the user code and the
> ultimate thrower, which means the pattern matching feature is mediating
> access to the thrower. I think we should handle this as "if a pattern
> invoked from pattern matching completes abruptly by throwing X, pattern
> matching completes abruptly with MatchException", because the specific X is
> not a detail we want the user to bind to. (We don't want them to bind to
> anything, but if they do, we want them to bind to the logical action, not
> the implementation details.)
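
A minimal sketch of the wrapping rule just described, in plain Java. WrappingSketch, extract and SketchMatchException are hypothetical names, and the Supplier stands in for a deconstructor or record accessor called by the synthetic matching code. Catching Throwable is an assumption made here to mirror "throwing X"; the mail does not say how broad the catch would be.

```java
import java.util.function.Supplier;

final class WrappingSketch {
    // Hypothetical stand-in for the proposed MatchException.
    static final class SketchMatchException extends RuntimeException {
        SketchMatchException(Throwable cause) {
            super(cause);
        }
    }

    // Models the synthetic code that sits between user code and the accessor.
    static <T> T extract(Supplier<T> accessor) {
        try {
            return accessor.get();
        } catch (Throwable x) {
            // The specific X survives only as the cause.
            throw new SketchMatchException(x);
        }
    }
}
```

A caller that catches SketchMatchException binds to "matching failed" and can still inspect getCause() if it needs the underlying exception.
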
>
>
>