[patterns] Nullability in patterns, and pattern-aware constructs (again)
Brian Goetz
brian.goetz at oracle.com
Fri Jan 10 20:00:47 UTC 2020
Closing the loop, this raises the question of "what about instanceof and
total patterns"? I posit that the following locutions are silly:
if (e instanceof var x) { ... } // always true
if (e instanceof _) { ... } // always true
and probably should be banned. If we did, though, what about:
if (e instanceof Object o) { ... } // always true
if (e instanceof Object) { ... } // always true, but currently allowed
I would think we would ban the former as well, but we have to keep the
latter around for compatibility. (Which is partially why I discouraged
calling the latter an "anonymous pattern" in the spec, and instead
proposed to treat it as a different flavor of `instanceof`.)
So, proposed: disallow "any" patterns (_, var x, or total T x) in
instanceof. Instanceof is for partial patterns.
Note that
Point p;
if (p instanceof Point(var x, var y)) { }
is total, but we wouldn't want to disallow it, as this pattern could
still fail if p == null.
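
The null case can already be seen with today's plain `instanceof`, which answers false for a null operand even when the static type is the tested type. A minimal sketch, assuming a hypothetical `Point` class with two int fields (the class and the helper name are made up for illustration, and the legacy type test stands in for the deconstruction pattern):

```java
// Hypothetical Point class for illustration; not from the original message.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class PointNullCheck {
    // Stand-in for `p instanceof Point(var x, var y)`: the static type of p
    // is already Point, yet the test still fails when p is null.
    static int sumOrMinusOne(Point p) {
        if (p instanceof Point) {
            return p.x + p.y;
        }
        return -1; // reached only when p == null
    }
}
```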
We might want to go a little further, and ban constant patterns in
instanceof too, since all of the following have simpler forms:
if (x instanceof null) { ... }
if (x instanceof "") { ... }
if (i instanceof 3) { ... }
Or not -- I suspect not.
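
For reference, the simpler forms alluded to above could be spelled roughly as follows in existing Java. This is only a sketch; the class and method names are hypothetical, and each comment names the constant-pattern form it stands in for.

```java
final class ConstantChecks {
    // Simpler form of the hypothetical `x instanceof null`.
    static boolean isNull(Object x) {
        return x == null;
    }

    // Simpler form of the hypothetical `x instanceof ""`; calling equals on
    // the literal keeps the test null-safe.
    static boolean isEmptyString(Object x) {
        return "".equals(x);
    }

    // Simpler form of the hypothetical `i instanceof 3`.
    static boolean isThree(int i) {
        return i == 3;
    }
}
```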
On 1/8/2020 3:27 PM, Brian Goetz wrote:
> In the past, we've gone around a few times on nullability and pattern
> matching. Back when we were enamored of `T?` types over in Valhalla
> land, we tentatively landed on using `T?` also for nullable type
> patterns. But the bloom came off that rose pretty quickly, and
> Valhalla is moving away from it, and that makes it far less attractive
> in this context.
>
> There are a number of tangled concerns that we've tried a few times to
> unknot:
>
> - Construct nullability. Constructs to which we want to add pattern
> awareness (instanceof, switch) already have their own opinion about
> nulls. Instanceof always says false when presented with a null, and
> switch always NPEs.
>
> - Pattern nullability. Some patterns clearly would never match null
> (deconstruction patterns), whereas others (an "any" pattern, and
> surely the `null` constant pattern, if there was one) might make sense
> to match null.
>
> - Nesting vs top-level. Most of the time, we don't want to match
> null at the top level, but frequently in a nested position we do. This
> conflicts with...
>
> - Totality vs partiality. When a pattern is partial on the operand
> type (e.g., `case String` when the operand of switch is `Object`), it
> is almost never the case we want to match null (well, except for the
> `null` constant pattern), whereas when a pattern is total on the
> operand type (e.g., `case Object` in the same example), it is more
> justifiable to match null.
>
> - Refactoring friendliness. There are a number of cases that we
> would like to freely refactor back and forth (e.g., if-instanceof
> chain vs pattern switch). In particular, refactoring a switch on
> nested patterns to a nested switch (case Foo(T t), case Foo(U u) to a
> nested switch on T and U) is problematic under some of the
> interpretations of nested patterns.
>
> - Inference. It would be nice if a `var` pattern were simply
> inference for a type pattern, rather than some possibly-non-denotable
> union. (Both Scala and C# treat these differently, which means you
> have to choose between type inference and the desired semantics; I
> don't want to put users in the position of making this choice.)
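
The "construct nullability" bullet above describes behavior that can be observed in existing Java. A small sketch (class and method names are hypothetical) showing that `instanceof` answers false for null while a classic switch on a String throws:

```java
final class LegacyNullBehavior {
    // instanceof never matches null, whatever the tested type.
    static boolean isString(Object o) {
        return o instanceof String;
    }

    // A classic switch on a String dereferences its operand, so a null
    // operand results in a NullPointerException.
    static String describe(String s) {
        switch (s) {
            case "a":
                return "letter a";
            default:
                return "other";
        }
    }

    // Helper that reports whether switching on null threw.
    static boolean switchThrowsOnNull() {
        try {
            describe(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```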
>
>
> Let's try (again) to untangle these. A compelling example is this one:
>
> Box box;
> switch (box) {
>     case Box(Chocolate c):
>     case Box(Frog f):
>     case Box(var o):
> }
>
> It would be highly confusing and error-prone for either of the first
> two patterns to match Box(null) -- given that Chocolate and Frog have
> no type relation (ok, maybe they both implement `Edible`), it should
> be perfectly safe to reorder the two. But, because the last pattern
> is so obviously total on boxes, it is quite likely that what the
> author wants is to match all remaining boxes, including those that
> contain null. (Further, it would be super-bad if there were _no_ way to
> say "Match any Box, even if it contains null." While one might think
> this could be repaired with OR patterns, imagine that `Box` had N
> components -- we'd need to OR together 2^N patterns, with complex
> merging, to express all the possible combinations of nullity.)
>
> Scala and C# took the path of saying that "var" patterns are not just
> type inference, they are "any" patterns -- so `Box(Object o)` matches
> boxes containing a non-null payload, where `Box(var o)` matches all
> boxes. I find this choice questionable (the story that `var` is just
> inference is nice), and it also puts users in the position of having
> to choose between the semantics they want and being explicit about
> types. I see the expedience of it, but I do not think this is the
> right answer for Java.
>
>
> In the previous round, we posited that there were _type patterns_
> (denoted `T t`) and _nullable type patterns_ (denoted `T? t`),
> which had the advantage that you could be explicit about what you
> wanted (nulls or not), and which was sort of banking on Valhalla
> plunking for the `T?` notation. But without that, having `T?` only
> in patterns, and nowhere else, will stick out like a sore thumb.
>
> There are many ways to denote "T or null", of course:
>
> - Union types: `case (T|Null) t`
> - OR patterns: `case (T t) | (Null t)`, or `case (T t) | (null t)`
> (the former is a union with a null TYPE pattern, the latter with a
> null CONSTANT pattern)
> - Merging/fallthrough: `case T t, Null t`
> - Some way to spell "nullable T": `case T? t`, `case nullable T t`,
> `case T|null t`
>
> But, I don't see any of these as being all that attractive in the Box
> case, when the most likely outcome is that the user wants the last
> case to match all boxes.
>
>
> Here's a scheme that I think is workable, which we hovered near
> sometime in the past, and which I want to go back to. We'll start with
> the observation that `instanceof` and `switch` are currently hostile
> to nulls (instanceof says false, switch throws, and probably in the
> future, let/bind will do so also).
>
> - We accept that some constructs may have legacy hostility to nulls
> (but, see below for a possible relaxation);
> - There are no "nullable type patterns", just type patterns;
> - Type patterns that are _total_ on their target operand (`case T` on
> an operand of type `U`, where `U <: T`) match null, and non-total type
> patterns do not.
> - Var patterns can be considered "just type inference" and will mean
> the same thing as a type pattern for the inferred type.
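
The totality rule in the list above can be modeled with reflection. This is a toy model under stated assumptions, not how a compiler would implement it: `Totality`, `isTotal`, and `matches` are hypothetical names, the static operand type is passed in explicitly as a `Class`, and any null hostility of the enclosing construct is ignored.

```java
final class Totality {
    // A type pattern `T t` is total on an operand of static type U
    // when U <: T.
    static boolean isTotal(Class<?> patternType, Class<?> operandType) {
        return patternType.isAssignableFrom(operandType);
    }

    // Models whether the type pattern matches `value`, assuming the
    // construct lets null through to the pattern.
    static boolean matches(Class<?> patternType, Class<?> operandType, Object value) {
        if (value == null) {
            // Only total type patterns match null.
            return isTotal(patternType, operandType);
        }
        return patternType.isInstance(value);
    }
}
```

Under this model, `String` tested against an `Object` operand rejects null, while `Object` tested against the same operand accepts it.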
>
> In this world, the patterns that match null (if the construct allows
> it through) are `case null` and the total patterns -- which could be
> written `var x` (and maybe `_`, or maybe not), or `Object x`, or even
> a narrower type if the operand type is narrower.
>
> In our Box example, this means that the last case (whether written as
> `Box(var o)` or `Box(Object o)`) matches all boxes, including those
> containing null (because the nested pattern is total on the nested
> operand), but the first two cases do not.
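
Hand-desugared into existing Java, the intended behavior of the Box switch looks roughly like this. It is a sketch only: `Box`, `Chocolate`, and `Frog` are hypothetical stand-in classes, and the if-chain emulates the pattern switch rather than using it.

```java
// Hypothetical classes standing in for the Box example.
final class Chocolate {}

final class Frog {}

final class Box {
    final Object contents; // may be null

    Box(Object contents) {
        this.contents = contents;
    }
}

final class BoxMatching {
    static String classify(Box box) {
        // Dereferencing throws for a null box, like the null-hostile switch.
        Object contents = box.contents;
        if (contents instanceof Chocolate) {
            return "chocolate"; // case Box(Chocolate c): never matches null contents
        }
        if (contents instanceof Frog) {
            return "frog"; // case Box(Frog f): never matches null contents
        }
        return "anything"; // case Box(var o): total, so null contents land here
    }
}
```

Because neither of the first two tests can match null, swapping them does not change the result for any box.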
>
> An objection raised against this scheme earlier is that readers will
> have to look at the declaration site of the pattern to know whether
> the nested pattern is total. This is a valid concern (to be traded off
> against the other valid concerns), but this does not seem so bad in
> practice to me -- it will be common to use `var` or another broad type,
> in which case it will be obvious.
>
> One problem with this interpretation is that we can't trivially
> refactor from
>
> switch (box) {
>     case Box(Chocolate c):
>     case Box(Frog f):
>     case Box(var o):
> }
>
> to
>
> switch (box) {
>     case Box(var contents):
>         switch (contents) {
>             case Chocolate c:
>             case Frog f:
>             case Object o:
>         }
> }
>
> because the inner `switch(contents)` would NPE, because switch is
> null-hostile. Instead, the user would explicitly have to do an `if
> (contents == null)` test, and, if the intent was to handle null in the
> same way as the bottom case, some duplication of code would be
> needed. This is irritating, but I don't think it is disqualifying --
> it is in the same category of null irritants that we have throughout
> the language.
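
The duplication described above can be sketched in existing Java. Everything here is an emulation under assumptions: the classes are hypothetical stand-ins, `innerSwitch` models the null-hostile inner switch with `Objects.requireNonNull`, and `other` plays the role of the shared bottom-case body.

```java
import java.util.Objects;

// Hypothetical classes standing in for the Box example.
final class Chocolate {}

final class Frog {}

final class Box {
    final Object contents; // may be null

    Box(Object contents) {
        this.contents = contents;
    }
}

final class NestedRefactoring {
    // Stand-in for the inner `switch (contents)`; requireNonNull models the
    // NullPointerException a null-hostile switch would throw.
    static String innerSwitch(Object contents) {
        Objects.requireNonNull(contents);
        if (contents instanceof Chocolate) {
            return "chocolate"; // case Chocolate c
        }
        if (contents instanceof Frog) {
            return "frog"; // case Frog f
        }
        return other(contents); // case Object o
    }

    // Body of the bottom case, factored out so the null path can reuse it.
    static String other(Object contents) {
        return "anything";
    }

    static String classify(Box box) {
        Object contents = box.contents; // case Box(var contents)
        if (contents == null) {
            return other(null); // explicit null test repeating the bottom case
        }
        return innerSwitch(contents);
    }
}
```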
>
> Similarly, we lose the pleasing decomposition that the nested pattern
> `P(Q)` is the same pattern as `P(alpha) & alpha instanceof Q` when P's
> first component might be null and the pattern Q is total -- because of
> the existing null-hostility of `instanceof`. (This is not unlike the
> complaint that Optional doesn't follow the monad laws, with a similar
> consequence -- and a similar justification.)
>
> So, summary:
> - the null constant pattern matches null;
> - "any" patterns match null;
> - A total type pattern is an "any" pattern;
> - var is just type inference;
> - no other patterns match null;
> - existing constructs retain their existing null behaviors.
>
>
> I'll follow up with a separate message about switch null-hostility.
>