Finalizing in JDK 16 - Pattern matching for instanceof

Wed Aug 26 15:00:47 UTC 2020

I have been thinking about this and I have two refinements I would like 
to suggest for Pattern Matching in instanceof.  Both have come out of 
the further work on the next phases of pattern matching.

1.  Instanceof expressions must express conditionality.

One of the uncomfortable collisions between the independently-derived 
pattern semantics and instanceof semantics is the treatment of total 
patterns.  Instanceof always says "no" on null, but the sensible thing 
on total patterns is that _strongly total patterns_ match null.  This 
yields a collision between

     x instanceof Object
and
     x instanceof Object o

This is not necessarily a problem for the specification, in that 
instanceof is free to say "when x is null, we don't even test the 
pattern."  But it is not good for the users, in that these two things 
are subtly different.

While I get why some people would like to bootstrap this into an 
argument why the pattern semantics are wrong, the key observation here 
is: _both of these questions are stupid_.  So I think there's an obvious 
way to fix this so that there is no problem here: instanceof must ask a 
question.  So the second form would be illegal, with a compiler error 
saying "pattern always matches the target."

Proposed: An `instanceof` expression must be able to evaluate to both 
true and false, otherwise it is invalid.  This rules out strongly total 
patterns on the RHS.  If you have a strongly total pattern, use pattern 
assignment instead.
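
To make the null behaviour concrete, here is a minimal sketch using only forms that are conditional in the sense proposed above. The class and method names are made up for illustration. The total-pattern form `x instanceof Object o` is deliberately left out, since whether a compiler accepts it depends on the language level in use.

```java
final class InstanceofNullDemo {
    // Plain type test: instanceof answers "no" on null, even for Object.
    static boolean isObject(Object x) {
        return x instanceof Object;
    }

    // Conditional pattern: may match or not, so it asks a real question.
    static int lengthOrMinusOne(Object x) {
        return (x instanceof String s) ? s.length() : -1;
    }

    public static void main(String[] args) {
        System.out.println(isObject(null));             // false
        System.out.println(isObject("hello"));          // true
        System.out.println(lengthOrMinusOne("hello"));  // 5
        System.out.println(lengthOrMinusOne(null));     // -1
    }
}
```

Both methods can return either answer depending on the argument, which is the property the proposed rule would require of every `instanceof` expression.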

2.  Mutability of binding variables.

We did it again; we gave in to our desire to try to "fix mistakes of the 
past", with the obvious results.  This time, we did it by making binding 
variables implicitly final.

This is the same mistake we make over and over again with both nullity 
and finality; when a new context comes up, we try to exclude the 
"mistakes" (nullability and mutability) from those contexts.

We've seen plenty of examples recently with nullity.  Here's a 
historical example with finality.  When we did Lambda, some clever 
fellow said "we could make the lambda parameters implicitly final."  And 
there was a round of "ooh, that would be nice", because it fed our 
desire to fix mistakes of the past. But we quickly realized it would be 
a new mistake, because it would be an impediment to refactoring between 
lambdas and inner classes, and undermined the mental model of "a lambda 
is just an anonymous method."

Further, the asymmetry has a user-model cost.  And what would be the 
benefit?  Well, it would make us feel better, but ultimately, would not 
have a significant impact on accidental-mutation errors because the 
context was so limited (and most lambdas are small anyway.)  In the end, 
it would have been a huge mistake.

I now think that we have done the same with binding variables. Here are 
two motivating examples:

(a) Pattern assignment.  For (weakly) total pattern P, you will be able 
to say

     P = e

Note that `int x` and `var x` are both valid patterns and local variable 
declarations; it would be good if pattern assignment were a strict 
generalization of local variable declaration.  The sole asymmetry is 
that for pattern assignment, the variable is final.  Oops.

(b) Reconstruction.  We have analogized that a `with` expression:

     x with { B }

is like the block expression:

     { X(VARS) = x; B /* mutates vars */; yield new X(VARS) }

except that, with binding variables implicitly final, mutating the 
variables would not be allowed.
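
Since neither `with` expressions nor pattern assignment exist in the language today, the following is only a hand-written approximation of that block expression, assuming a hypothetical record `Point` and a hypothetical helper `shiftX`. Ordinary local variables stand in for the bindings that `X(VARS) = x` would introduce.

```java
final class WithExpansionDemo {
    // Hypothetical record used only for this illustration.
    record Point(int x, int y) {}

    // Stand-in for a hypothetical `p with { x = x + dx; }`.
    static Point shiftX(Point p, int dx) {
        // Step 1: deconstruct into locals (plays the role of `X(VARS) = x`).
        int x = p.x();
        int y = p.y();
        // Step 2: the body B mutates the variables; this needs them to be non-final.
        x = x + dx;
        // Step 3: reconstruct from the (possibly updated) variables.
        return new Point(x, y);
    }

    public static void main(String[] args) {
        System.out.println(shiftX(new Point(1, 2), 3)); // Point[x=4, y=2]
    }
}
```

Step 2 is the part that stops working if the variables introduced by the deconstruction are implicitly final.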

From a specification perspective, there is nontrivial spec complexity 
to keep pattern variables and locals separate, but some of their 
difference is gratuitous (mutability.)  If we reduce the gratuitous 
differences, we can likely bring them closer together, which will reduce 
friction and technical debt in the future.

Like with lambda parameters, I am now thinking that we gave in to the 
base desire to fix a past mistake, but in a way that doesn't really make 
the language better or safer, just more complicated.  Let's back this 
one out before it really bites us.
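
For concreteness, this is the kind of code that backing out implicit finality would permit. It is a sketch only: it assumes a compiler in which binding variables are not implicitly final (as I understand it, that is how the feature was eventually finalized), and the class and method names are invented for the example.

```java
final class MutableBindingDemo {
    // Returns a trimmed copy of o if it is a String, otherwise the empty string.
    static String normalize(Object o) {
        if (o instanceof String s) {
            // Reassigning the binding variable, exactly as one could
            // with an ordinary local declared as `String s = (String) o;`.
            s = s.trim();
            return s;
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println("[" + normalize("  hi  ") + "]"); // [hi]
        System.out.println("[" + normalize(42) + "]");       // []
    }
}
```

A developer who wants the old behaviour can still write `o instanceof final String s`, in the same way that `final` can be added to any other local.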

On 7/27/2020 6:53 AM, Gavin Bierman wrote:
> In JDK 16 we are planning to finalize two JEPs:
>
>    - Pattern matching for `instanceof`
>    - Records
>
> Whilst we don't have any major open issues for either of these features, I would
> like us to close them out. So I thought it would be useful to quickly summarize
> the features and the issues that have arisen over the preview periods so far. In
> this email I will discuss pattern matching; a following email will cover the
> Records feature.
>
> Pattern matching
> ----------------
>
> Adding conditional pattern matching to an expression form is the main technical
> novelty of our design of this feature. There are several advantages that come
> from this targeting of an expression form: First, we get to refactor a very
> common programming pattern:
>
>      if (e instanceof T) {
>          T t = (T)e;         // grr...
>          ...
>      }
>
> to
>
>      if (e instanceof T t) {
>                              // let the pattern matching do the work!
>          ...
>      }
>
> A second, less obvious advantage is that we can combine the pattern matching
> instanceof with other *expressions*. This enables us to compactly express things
> with expressions that are unnecessarily complicated using statements. For
> example, when implementing a class Point, we might write an equals method as
> follows:
>
>      public boolean equals(Object o) {
>          if (!(o instanceof Point))
>              return false;
>          Point other = (Point) o;
>          return x == other.x
>              && y == other.y;
>      }
>
> Using pattern matching with instanceof instead, we can combine this into a
> single expression, eliminating the repetition and simplifying the control flow:
>
>      public boolean equals(Object o) {
>          return (o instanceof Point other)
>              && x == other.x
>              && y == other.y;
>      }
>
> The conditionality of pattern matching - if a value does not match a pattern,
> then the pattern variable is not bound - means that we have to consider
> carefully the scope of the pattern variable. We could do something simple and
> say that the scope of the pattern variable is the containing statement and all
> subsequent statements in the enclosing block. But this has unfortunate
> 'poisoning' consequences, e.g.
>
>      if (a instanceof Point p) {
>          ...
>      }
>      if (b instanceof Point p) {         // ERROR - p is in scope
>          ...
>      }
>
> In other words in the second statement the pattern variable is in a poisoned
> state - it is in scope, but it should not be accessible as it may not be
> instantiated with a value. Moreover, as it is in scope, we can't declare it
> again. This means that a pattern variable is 'poisoned' after it is declared, so
> the pattern-loving programmer will have to think of lots of distinct names for
> their pattern variables.
>
> We have chosen another way: Java already uses flow analysis - both in checking
> the access of local variables and blank final fields, and detecting unreachable
> statements. We lean on this concept to introduce the new notion of flow scoping.
> A pattern variable is only in scope where the compiler can deduce that the
> pattern has matched and the variable will be bound. This analysis is flow
> sensitive and works in a similar way to the existing analyses. Returning to our
> example:
>
>      if (a instanceof Point p) {
>          // p is in scope
>          ...
>      }
>      // p not in scope here
>      if (b instanceof Point p) {     // Sure!
>              ...
>      }
>
> The motto is "a pattern variable is in scope where it has definitely matched".
> This is intuitive, allows for the safe reuse of pattern variables, and Java
> developers are already used to flow sensitive analyses.
>
> As pattern variables are treated in all other respects like normal variables
> -- and this was an important design principle -- they can shadow fields.
> However, their flow scoping nature means that some care must be taken to
> determine whether a name refers to a pattern variable declaration shadowing a
> field declaration or a field declaration.
>
>      // field p is in scope
>
>      if (e instanceof Point p) {
>          // p refers to the pattern variable
>      } else {
>          // p refers to the field
>      }
>
> We call this unfortunate interaction of flow scoping and shadowing the "Swiss
> cheese property". To rule it out would require ad-hoc special cases or more
> features, and our sense is that it will not be that common, so we have decided to
> keep the feature simple. We hope that IDEs will quickly come to help programmers
> who have difficulty with flow scoping and shadowing.
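
As a small illustration of the flow-scoping rule quoted above, here is a sketch with invented class and method names. It shows two things: a binding name being reused in a later statement, and a binding being in scope after a negated test whose failing branch returns.

```java
final class FlowScopeDemo {
    // The name p is reused: the first p is out of scope after its if statement.
    static String describe(Object a, Object b) {
        StringBuilder sb = new StringBuilder();
        if (a instanceof String p) {
            sb.append("a is a String of length ").append(p.length());
        }
        // p from the first test is not in scope here, so it can be declared again.
        if (b instanceof Integer p) {
            sb.append("; b is the Integer ").append(p);
        }
        return sb.toString();
    }

    // After the early return, s has definitely matched, so it is in scope.
    static boolean isEmptyString(Object o) {
        if (!(o instanceof String s)) {
            return false;
        }
        return s.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(describe("ab", 7));
        System.out.println(isEmptyString(""));
    }
}
```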
