Primitive type patterns

Brian Goetz brian.goetz at oracle.com
Fri Feb 25 21:45:44 UTC 2022


As a consequence of doing record patterns, we also grapple with 
primitive type patterns. Until now, we've only supported reference type 
patterns, which are simple:

  - A reference type pattern `T t` is applicable to a match target of 
type M if M can be cast to T without an unchecked warning.

  - A reference type pattern `T t` covers a match type M iff M <: T

  - A reference type pattern `T t` matches a value m of type M if M <: T 
|| m instanceof T

Two of these three characterizations are static computations 
(applicability and coverage); the third is a runtime test (matching).  
For each kind of pattern, we have to define all three of these.
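
As a rough illustration of how the three notions divide up for reference 
type patterns, here is a small sketch using the `instanceof` patterns 
available in Java 16 and later.  The names `RefPatternSketch` and 
`describe` are made up for this example and are not part of any proposal.

```java
final class RefPatternSketch {
    // Hypothetical helper used only to illustrate the three notions.
    static String describe(Object m) {
        // Applicability (static): Object can be cast to String without an
        // unchecked warning, so the pattern `String s` is allowed here.
        // Coverage (static): Object is not a subtype of String, so
        // `String s` does not cover Object and a fallback is still needed.
        // Matching (dynamic): the pattern matches only when
        // m instanceof String holds at run time.
        if (m instanceof String s) {
            return "String of length " + s.length();
        }
        return "not a String";
    }
}
```

By contrast, something like `anInteger instanceof String s` is rejected 
at compile time, because the pattern is not applicable to that target.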


#### Primitive type patterns in records

Record patterns necessitate the ability to write type patterns for any 
type that can be a record component.  If we have:

     record IntBox(int i) { }

then we want to be able to write:

     case IntBox(int i):

which means we need to be able to express type patterns for primitive 
types.


#### Relationship with assignment context

There is another constraint on primitive type patterns: the let/bind 
statement coming down the road.  Because a type pattern looks (not 
accidentally) like a local variable declaration, we will want to align 
the semantics of "local variable declaration with initializer" and 
"let/bind with total type pattern".  Concretely:

     let String s = "foo";

is a pattern match against the (total) pattern `String s`, which 
introduces `s` into the remainder of the block.  Since let/bind is a 
generalization of local variable declaration with initialization, 
let/bind should align with locals where the two can express the same 
thing.  This means that the set of conversions allowed in assignment 
context (JLS 5.2) should also be supported by type patterns.

Of the conversions supported by 5.2, the only one that applies when both 
the initializer and local variable are of reference type is "widening 
reference", which the above match semantics (`T t` matches `m` when `M 
<: T`) support.  So we need to fill in the other three boxes of the 2x2 
matrix of { ref, primitive } x { ref, primitive }.

The conversions allowed in assignment context are:

  - Widening primitive -- `long l = anInt`
  - Narrowing primitive -- `byte b = 0L` (only applies to constants on RHS)
  - Widening reference -- `Object o = aString`
  - Widening reference + unbox -- where `<T extends Integer>`, `int i = t`
  - Widening reference + unbox + widening primitive -- `long l = t`
  - Unboxing -- `int i = anInteger` (may NPE)
  - Unboxing + widening primitive -- `long l = anInteger` (may NPE)
  - Boxing -- `Integer i = anInt`
  - Boxing + widening reference -- `Object o = anInt`
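
For reference, the conversions in this list can be exercised with 
ordinary local variable declarations today.  The sketch below is only an 
illustration; the class and method names are invented, and the narrowing 
line uses the int constant `10` because javac accepts constant narrowing 
only for constant expressions of type byte, short, char, or int whose 
value fits the target type.

```java
final class AssignmentConversionSketch {
    // Hypothetical method; each declaration shows one row of the list.
    static <T extends Integer> long convertAll(T t, int anInt,
                                               Integer anInteger,
                                               String aString) {
        long widened = anInt;          // widening primitive
        byte narrowed = 10;            // narrowing primitive (constant fits in byte)
        Object widenedRef = aString;   // widening reference
        int fromT = t;                 // widening reference + unbox
        long fromTWide = t;            // widening reference + unbox + widening primitive
        int unboxed = anInteger;       // unboxing; NullPointerException on null
        long unboxedWide = anInteger;  // unboxing + widening primitive
        Integer boxed = anInt;         // boxing
        Object boxedRef = anInt;       // boxing + widening reference

        // Combine the results only so that every local is used.
        return widened + narrowed + fromT + fromTWide + unboxed + unboxedWide
                + boxed + (Integer) boxedRef + ((String) widenedRef).length();
    }
}
```

Passing `null` for `anInteger` makes the unboxing line throw, which is 
the "may NPE" behavior noted in the list.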


#### Boxing and unboxing

Suppose our match target is a box type, such as:

     record IntegerBox(Integer i) { }

Clearly we can match it with:

     case IntegerBox(Integer i):

If we want to align with assignment context, and support things like

     let int i = anInteger

(and if we didn't, this would likely be seen as a gratuitous gap between 
let/bind and local declaration), we need for `int i` to be applicable to 
`Integer`:

     case IntegerBox(int i):

There is one value of `Integer` that, when we try to unbox, causes 
trouble: null.  As of Java 5 when switching on wrapper types, we unbox 
eagerly, throwing NPE if the target is null. But pattern matching is 
conditional.  If we have:

     record Box<T>(T t) { }
     Box<Object> b;
     ...
     case Box(String s):

when we encounter values of Object that are not instances of String, we 
just don't match.  For unboxing, it should be the same; `int i` matches 
all non-null instances of `Integer`:

     case IntegerBox(int i):

Because `int i` matches all instances of Integer other than null, it is 
reasonable to say that `int i` _covers_ Integer, with remainder null, 
just like:

     Box<Box<String>> bbs;
     switch (bbs) {
         case Box(Box(String s)): ...
     }

covers the match target, with remainder Box(null).  When confronted with 
Box(null), the attempt to match the case doesn't throw, it just doesn't 
match; when we run out of cases, the switch can throw a last-ditch 
exception.  The same applies when unboxing would NPE.
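
The intended behavior can be approximated by hand in current Java.  The 
following is a minimal sketch, assuming a one-case switch over 
`IntegerBox`; the helper names and the choice of 
`IllegalStateException` as the last-ditch exception are assumptions made 
for illustration only.

```java
final class UnboxingMatchSketch {
    record IntegerBox(Integer i) { }

    // Stand-in for matching the nested pattern `int i` against an Integer:
    // it matches every non-null value and binds the unboxed int.
    static java.util.OptionalInt matchInt(Integer target) {
        return target == null
                ? java.util.OptionalInt.empty()
                : java.util.OptionalInt.of(target);
    }

    // Stand-in for a switch whose only case is `case IntegerBox(int i)`.
    static String describe(IntegerBox box) {
        java.util.OptionalInt i = matchInt(box.i());
        if (i.isPresent()) {
            return "int " + i.getAsInt();
        }
        // IntegerBox(null) is the remainder: the case simply did not match,
        // and the switch as a whole reports the failure afterwards.
        throw new IllegalStateException("no case matched " + box);
    }
}
```

The point of the sketch is that the null check belongs to the match 
itself, so the failure surfaces only after all cases have been tried.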

In the other direction, a primitive can always be boxed to its wrapper 
(or a supertype), meaning `Integer x` is applicable to, and covers, int, 
short, char, and byte.


#### Primitive widening and narrowing

The pattern match equivalent of primitive widening is:

     let long l = anInt;

or

     case IntBox(long l):

(When we get to dtor patterns, we will have to deal with overload 
selection, but for record patterns, there is one canonical dtor.)  This 
seems as uncontroversial as allowing `Object o` to match a `String` 
target.

Primitive narrowing is less obvious, but there's a strong argument for 
generalizing primitive narrowing in pattern matching beyond constants.  
We already have to deal with

     let byte b = 0L;

via primitive narrowing, but pattern matching is a conditional 
construct, and there's an obvious way to extend this.  Observe that when 
matching `Box(String s)` against a `Box<Object>`, this is equivalent 
(because the nested pattern is not total) to matching the target to 
`Box(var alpha)` and then further matching alpha to `String s`.  So if 
we have the primitive equivalent:

     case IntBox(short s):

then this should be the same as matching to `IntBox(var alpha)` (which 
is an int) and then matching that int to `short s`.  The semantics of 
such a match are a dynamic range check, which is analogous to a dynamic 
`instanceof` check.
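
One way to picture the dynamic range check is as a cast that must 
round-trip without losing information.  The sketch below assumes that 
reading of "in the range of P" for integral types; the class and method 
names are hypothetical.

```java
final class NarrowingMatchSketch {
    record IntBox(int i) { }

    // `short s` against an int: matches when narrowing loses nothing.
    static boolean matchesShort(int alpha) {
        return alpha == (short) alpha;
    }

    // `int i` against a long: the same check, one size up.
    static boolean matchesInt(long alpha) {
        return alpha == (int) alpha;
    }

    // Stand-in for `case IntBox(short s)` followed by a catch-all case.
    static String describe(IntBox box) {
        int alpha = box.i();            // the `IntBox(var alpha)` step
        if (matchesShort(alpha)) {      // then match alpha against `short s`
            short s = (short) alpha;
            return "short " + s;
        }
        return "int " + alpha;
    }
}
```

Like a failed `instanceof`, a value outside the range just means the 
case does not match; nothing is truncated and nothing is thrown.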


#### Applicability

We can fill out the other three quadrants now.  We start with 
applicability, which is a static check to see if the pattern is even 
allowed against the target type.  The clauses that are richer than 
allowed in assignment context (and which we could consider deferring) 
are written in brackets.

  - A primitive type pattern `P p` should be applicable to a primitive 
target `q : Q` if P == Q, if Q can be widened to P, if q is a constant 
and Q can be narrowed to P [ or if Q can be narrowed to P ].
  - A primitive type pattern `P p` should be applicable to a reference 
target T if T unboxes to P, or T unboxes to a primitive type that can be 
widened to P [ or if T unboxes to a primitive type that can be narrowed 
to P. ]
  - A reference type pattern `T t` should be applicable to a primitive 
target P if P boxes to a type that is cast-convertible to T

Note that we're _not_ trying to treat `case 0` as matching all of 
Integer 0, Short 0, and Long 0.


#### Coverage

We need to add corresponding rules for coverage (exhaustiveness).

  - A primitive type pattern `P p` covers any primitive type Q which can 
be widened to P;
  - A primitive type pattern `P p` covers P's box type (with remainder 
null);
  - A reference type pattern `T t` covers a primitive type P if P's box 
type is a subtype of T.


#### Matching

  - A primitive type pattern `P p` matches a primitive value q : Q if P 
== Q, if Q can be widened to P, if q is a constant in the range of P [ or 
if q is in the range of P ]
  - A primitive type pattern `P p` matches a reference value t : T by 
unboxing t (and optionally widening to P) when t != null
  - A reference type pattern `T t` matches a primitive value `p : P` 
when box(p) instanceof T (always true given applicability rule)


#### Comparing with assignment context

Let's go through our table of assignment conversions, flipping `T t = e` 
around to `e instanceof T t`

  - Widening primitive -- `anInt instanceof long l`. Applicable because 
int can be widened to long.  Matches always.
  - Narrowing primitive -- `0L instanceof byte b`. Applicable because 0L 
is a constant and long can be narrowed to byte.  Matches always.
  - Widening reference -- `aString instanceof Object o`. Applicable 
because String can be cast to Object without an unchecked warning.  
Matches always.
  - Widening reference + unbox -- where `<T extends Integer>`, `t 
instanceof int i`.  Applicable because T unboxes to int.
  - Widening reference + unbox + widening primitive -- `t instanceof 
long l`.  Applicable because T unboxes to int, and int can be widened to 
long.
  - Unboxing -- `anInteger instanceof int i`.  Applicable because 
Integer unboxes to int.  Will match all values except null.
  - Unboxing + widening primitive -- `anInteger instanceof long l`.  
Applicable because Integer unboxes to int, and int can be widened to 
long.  Will match all values except null.
  - Boxing -- `anInt instanceof Integer i`.  Applicable because int 
boxes to Integer.
  - Boxing + widening reference -- `anInt instanceof Object i`.  
Applicable because int boxes to Integer, which is cast-convertible to 
Object.  Matches when box(anInt) instanceof Object.

There is one additional case that our rules cover but that is not 
allowed in assignment context:

  - Narrowing without constants -- `aLong instanceof int i`.  Allowed by 
bracketed rules.  Matches when the long is in the range of an int.


#### Looking to Valhalla

When we get to primitive classes, the rules about boxing and unboxing 
will translate to the widening/narrowing conversions between P and P.ref.