Treatment of total patterns (was: Reviewing feedback on patterns in switch)

Wed Jan 26 04:20:24 UTC 2022

> Null is only matched by a switch case that includes `case null`.  Switches with no `case null` are treated as if they have a `case null: throw NPE`.  This means that `case Object o` doesn’t match null; only `case null, Object o` does.
> Total patterns are re-allowed in instanceof expressions, and are consistent with their legacy form.

I strongly support this change.

In my experience, it's much more important to have automatic
refactorings between switch and chains of 'if' than between nested and
flat switches. People have chains of 'if's very often and they are not
legacy. Sometimes, you want to add conditions unrelated to the
selector expression, so it could be natural to convert 'switch' to
'if'. In other cases, you simplify the chain of 'if' statements and
see that the new set of conditions nicely fits into a pattern switch.
These if<->switch conversions will be an everyday tool for developers.
In contrast, destructuring with a switch will be a comparatively rare
thing, and it's even more rare when you need to convert nested
switches to flat ones or vice versa. I'm saying this from my Kotlin
programming experience where you can have when-is and sort of
destructuring of data classes which are roughly similar to what we are
doing for Java. One level 'when' is common, two-level 'when' or
conditions on destructuring components are more rare.

We already implemented some kind of switch<->if conversion in IntelliJ
IDEA. And it already has a number of corner cases to handle in order
to support total patterns that match null. In particular, we cannot
convert `case Object obj` to `if (x instanceof Object obj), as total
patterns are prohibited for instanceof and null won't be matched
anyway. We cannot just omit a condition, as `obj` could be used
afterwards, so we have to explicitly declare a variable (and I
believe, this part is still buggy and may produce incompilable code).
The proposed change will make switch<->if refactorings more mechanical
and predictable.

Another thing I mentioned before and want to stress again is that this
change will allow us to infer required nullity for the variable used
in the switch selector from AST only. No need to use resolution or
type inference. This will make interprocedural analysis stronger.
E.g., consider:
// Test.java
class Test {
  static void test(A a) {
    switch(a) {
    case B b -> {}
    case C c -> {}
    case D d -> {}
    }
  }
}

There are two possibilities:
1. D is a superclass of A, thus the last pattern is total, and null is
accepted here:

interface D {}
interface A extends D {}
interface B extends A {}
interface C extends A {}

2. A is a sealed type with B, C, and D inheritors, switch is
exhaustive, and null is not accepted here:

sealed interface A {}
non-sealed interface B extends A {}
non-sealed interface C extends A {}
non-sealed interface D extends A {}

So without looking at A definition (which might be anywhere) we cannot
say whether test(null) will throw NPE or not. We cannot cache the
knowledge about 'test' method parameter nullability within the
Test.java, because its nullability might change if we change the
hierarchy of A, even if Test.java content is untouched. Currently, we
are conservative and not infer nullability when any unguarded pattern
appears in switch cases. With the required `case null`, we can perform
more precise analysis.

With best regards,
Tagir Valeev.

On Wed, Jan 26, 2022 at 2:47 AM Brian Goetz <brian.goetz at oracle.com> wrote:
>
>
> 1.  Treatment of total patterns in switch / instanceof
>
>
> The handling of totality has been a long and painful discussion, trying to balance between where we want this feature to land in the long term, and people’s existing mental models of what switch and instanceof are supposed to do.  Because we’ve made the conscious decision to rehabilitate switch rather than make a new construct (which would live side by side with the old construct forever), we have to take into account the preconceived mental models to a greater degree.
>
> Totality is a predicate on a pattern and the static type of its match target; for a pattern P to be total on T, it means that all values of T are matched by P.  Note that when I say “matched by”, I am appealing not necessarily to “what does switch do” or “what does instanceof do”, but to an abstract notion of matching.
>
> The main place where there is a gap between pattern totality and whether a pattern matches in a switch has to do with null.  We’ve done a nice job retrofitting “case null” onto switch (especially with `case null, Foo f` which allows the null to be bound to f), but people are still uncomfortable with `case Object o` binding null to o.
>
> (Another place there is a gap is with nested patterns; Box(Bag(String s)) should be total on Box<Bag<String>>, but can’t match Box(null).   We don’t want to force users to add default cases, but a switch on Box<Bag<String>> would need an implicit throwing case to deal with the remainder.)
>
> I am not inclined to reopen the “should `Object o` be total” discussion; I really don’t think there’s anything new to say there.  But we can refine the interaction between a total pattern and what the switch and instanceof constructs might do.  Just because `Object o` is total on Object, doesn’t mean `case Object o` has to match it.  It is the latter I am suggesting we might reopen.
>
> The motivation for treating total patterns as total (and therefore nullable) in switch comes in part from the desire to avoid introducing sharp edges into refactoring.  Specifically, we had two invariants in mind:
>
>     x matches P(Q) === x matches P(var alpha) && alpha matches Q:
>
> and
>
>     switch (x) {
>         case P(Q): A
>         case P(T): B
>     }
>
> where T is total on the type of x, should be equivalent to
>
>     switch (x) {
>         case P(var alpha):
>             switch(alpha) {
>                 case Q: A
>                 case T: B
>             }
>         }
>    }
>
> These invariants are powerful both for linguistic transformation and for refactoring.
>
> The refinements I propose are:
>
>  - Null is only matched by a switch case that includes `case null`.  Switches with no `case null` are treated as if they have a `case null: throw NPE`.  This means that `case Object o` doesn’t match null; only `case null, Object o` does.
>
>  - Total patterns are re-allowed in instanceof expressions, and are consistent with their legacy form.
>
> Essentially, this gives switch and instanceof a chance to treat null specially with their existing semantics, which takes precedence over the pattern match.
>
> The consequences of this for our refactoring rules are:
>
>  - When unrolling a nested pattern P(Q), we can only do so when Q is not total.
>  - When unrolling a switch over nested patterns to a nested switch, `case P(T)` must be unrolled not to `case T`, but `case null, T`.
>
>
> These changes entail no changes to the semantics of pattern matching; they are changes to the semantics of instanceof/switch with regard to null.
>