Primitive type patterns - an alternative approach (JEP 507)
Brian Goetz
brian.goetz at oracle.com
Wed Oct 15 15:39:11 UTC 2025
I have some guesses about why you are still so upset by this feature.
Since you have raised the suspense level through indirection, I'll play
along by sharing my guesses before reading.
My guess is that what is making you uncomfortable (in addition to
the obvious: you are still having trouble accepting that instanceof
will be taking on a larger role than "subtype test") is that the
meaning of a pattern (e.g., `case Foo f`) is determined _relative to
the static type of some match candidate, specified elsewhere_, such
as the selector expression of a switch (for a top-level pattern) or
the component type of a record (for a nested pattern.)
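To make that concrete, here is a small sketch (the types and names are
mine, purely illustrative) of the same pattern text meaning different
things against different match candidates:

```java
// Illustrative sketch: the pattern `Circle c` is interpreted relative
// to the static type of the match candidate it is applied to.
public class PatternContext {
    sealed interface Shape permits Circle, Square { }
    record Circle(double radius) implements Shape { }
    record Square(double side) implements Shape { }

    static String describeShape(Shape s) {
        // Against a Shape selector, `Circle c` is a partial pattern:
        // it tests whether s is dynamically a Circle.
        return switch (s) {
            case Circle c -> "circle";
            case Square q -> "square";
        };
    }

    static String describeCircle(Circle c0) {
        // Against a Circle selector, the same pattern is unconditional;
        // this single case is exhaustive, no dynamic test needed.
        return switch (c0) {
            case Circle c -> "circle";
        };
    }

    public static void main(String[] args) {
        System.out.println(describeShape(new Square(2.0))); // square
        System.out.println(describeCircle(new Circle(1.0))); // circle
    }
}
```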
Further, I'll guess that your proposal involves making conversion more
explicit by adding new syntax, either (a) distinguishing a total pattern
match and a partial one, or (b) distinguishing a pattern match involving
subtyping and one involving conversion. (If I had to bet further, I
would take (b).)
Let's see how I did ... pretty close! You wanted to go _even more
explicit_ than (b) -- by explicitly naming both types (even though the
compiler already has them in hand.)
Zooming out, design almost always involves "lump vs split" choices; do
we highlight the specific differences between cases, or their
commonality? Java does a lot of lumping; for example, we use the `==`
operator to compare both references and primitives, but `==` on ints
means something different than on floats or object references. (But
also, it doesn't mean something different; it asks the same question:
"are these two things identical.") The choice to lump or split in any
given situation is of course situational, but Java tends to err on the
side of lumping, and my observation is that whenever Stephen comes up
with a proposal, it is usually one that involves "more splitting."
(Not a criticism; it's a valid philosophical viewpoint.)
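A quick sketch of that `==` lumping in action (all of this is stock
Java behavior, nothing hypothetical):

```java
// One operator, one question ("are these identical?"), but the
// details depend on the operand types.
public class EqualsLumping {
    public static void main(String[] args) {
        int a = 5, b = 5;
        System.out.println(a == b);          // true: same int value

        double nan = Double.NaN;
        System.out.println(nan == nan);      // false: NaN is never == to itself
        System.out.println(0.0 == -0.0);     // true: distinct bits, equal values

        String s1 = new String("hi");
        String s2 = new String("hi");
        System.out.println(s1 == s2);        // false: different object identities
        System.out.println(s1.equals(s2));   // true: same contents
    }
}
```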
In this case, you've observed that Java already does a lot of lumping
with conversions; a cast from A to B is (a) determined by the
combination of types A and B, and (b) its meaning can change drastically
depending on these types. (This isn't new; this is Java 1.0 stuff,
which got lumped further in Java 5 when autoboxing was added.)
(Pause for a brief public service announcement: if you are even reading
this far, you should go read JLS Chapter 5 at least once. More so than
any other feature, this is a case where, if you listen carefully to the
language, it will tell you how the feature should work.)
For those who didn't go and read JLS 5, here's the set of conversions
that are permitted in a casting context:
• an identity conversion (§5.1.1)
• a widening primitive conversion (§5.1.2)
• a narrowing primitive conversion (§5.1.3)
• a widening and narrowing primitive conversion (§5.1.4)
• a boxing conversion (§5.1.7)
• a boxing conversion followed by a widening reference conversion (§5.1.5)
• a widening reference conversion (§5.1.5)
• a widening reference conversion followed by an unboxing conversion
• a widening reference conversion followed by an unboxing conversion, then
followed by a widening primitive conversion
• a narrowing reference conversion (§5.1.6)
• a narrowing reference conversion followed by an unboxing conversion
• an unboxing conversion (§5.1.8)
• an unboxing conversion followed by a widening primitive conversion
That's a big list! When we see "cast A to B", we must look at the types
A and B, and decide if any of these conversions are applicable (it's not
obvious, but (effectively) given a combination of A and B, at most one
will be). There's a lot going on here -- casting is pretty lumpy! (The
fact that casting can mean one of 13 different things -- and most people
didn't even notice until now -- shows that lumping is often a good idea;
it hides small distinctions in favor of general concepts.)
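To make a few rows of that table concrete, here is a sketch exercising
several of the conversions through the single cast syntax:

```java
// Several JLS 5.5 casting conversions, all spelled the same way.
public class CastKinds {
    public static void main(String[] args) {
        long l = (long) 42;             // widening primitive conversion
        int i = (int) 3_000_000_000L;   // narrowing primitive conversion (wraps)

        Object o = Integer.valueOf(7);
        int j = (int) o;                // narrowing reference conversion,
                                        // then unboxing

        long k = (long) (Integer) 5;    // unboxing, then widening primitive

        System.out.println(l + " " + i + " " + j + " " + k);
        // prints "42 -1294967296 7 5"
    }
}
```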
At root, what I think is making you uncomfortable is that:
int x = (int) anObject
and
int x = (int) aLong
use the same syntax, but (feel like they) mean different things. (In
other words, that casting (and all the other conversion contexts) is
lumpy.) And while this might not have bothered you too much before, the
fact that patterns can be _composed_ makes for more cases where the
meaning of something is determined by the types involved, but those
types are not right there in your face like they are in the lines
above. When you see
case Box(String s):
this might be an exhaustive pattern on Box (if the component of Box is a
String) or might be a partial pattern (if the component is, say,
Object.) The concept is not new, but the more expressive syntax -- the
one that lets you compose two patterns -- means that running down the
types involved requires some more work (here, you have to look at the
declaration of Box.)
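Here is that situation spelled out with two hypothetical box records
(the names are mine), both matched with the same-looking nested pattern:

```java
// The nested pattern `String s` is total or partial depending on the
// declaration of the record it destructures.
public class BoxPatterns {
    record StringBox(String s) { }   // component type String
    record ObjectBox(Object o) { }   // component type Object

    static String f(StringBox b) {
        return switch (b) {
            // Unconditional: the component is already a String, so this
            // single case is exhaustive.
            case StringBox(String s) -> "always: " + s;
        };
    }

    static String g(ObjectBox b) {
        return switch (b) {
            // A genuine type test on the Object component; a second
            // case is needed for exhaustiveness.
            case ObjectBox(String s) -> "string: " + s;
            case ObjectBox(Object o) -> "other: " + o;
        };
    }

    public static void main(String[] args) {
        System.out.println(f(new StringBox("hi")));  // always: hi
        System.out.println(g(new ObjectBox(42)));    // other: 42
    }
}
```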
But, this is nothing new in Java! This happens with overloading:
m(x)
could select different overloads of `m` based on the type of `x`. And it
happens with chaining:
x.foo()
 .bar()
where we don't even know where to look for the `bar` method until we
know the return type of `foo()`.
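Both of those are easy to demonstrate with ordinary Java (illustrative
names, nothing new here):

```java
// Which overload of `m` runs is resolved from the static type of the
// argument; where a chained method lives depends on the previous
// call's return type.
public class TypeDriven {
    static String m(int x)    { return "m(int)"; }
    static String m(long x)   { return "m(long)"; }
    static String m(Object x) { return "m(Object)"; }

    public static void main(String[] args) {
        int i = 1;
        long l = 1L;
        Integer boxed = 1;
        System.out.println(m(i));      // m(int)
        System.out.println(m(l));      // m(long)
        System.out.println(m(boxed));  // m(Object): reference widening
                                       // beats unboxing in resolution

        // Chaining: `trim` is looked up on String only because
        // StringBuilder.toString() is declared to return String.
        String s = new StringBuilder(" hi ").toString().trim();
        System.out.println(s);         // hi
    }
}
```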
At root, this proposal is bargaining with the worry that "but users
won't know what the code does if the types aren't right there in their
face." And while I get that fear, we don't always optimize language
design for "the most important thing is making sure no one is confused,
ever"; that would probably result in a language no one wants to program
in. There's a reason we lump. (To take a silly-extreme version, would
you want to need different addition operators for `int` and `long`?
What would the table of operators in such a language look like?)
So, I get why you want to make these things explicit. But that's going
in the opposite direction from the one the language is headed in.
Valhalla is doing what
it can to heal this rift, not double down on it; the set of
"conversions" won't be baked into a table in JLS 5.5.
Let's take one example about where this is going:
record IntBox(int x) { }
switch (box) {
    case IntBox(0) -> "zero";
    ...
}
The zero here is a "constant pattern". But what does it mean to match
the constant zero? We might think this is pretty complicated, given the
current state of conversions. We already have special pleading in JLS 5
for converting `int` constants to the smaller integral types (byte,
short, char), because no one wanted to live in a world where we had
sigils for every primitive type, such as `0b` and `0s` -- too splitty.
We like being able to use integer literals and let the compiler figure
out that `int 0` and `byte 0` are the same thing; that's a
mostly-satisfying lumping.
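That special pleading is the constant-assignment rule in JLS 5.2, and
it is easy to see in everyday code:

```java
// An int constant assigns to byte/short/char when it fits in range,
// with no suffix sigils needed.
public class ConstantNarrowing {
    public static void main(String[] args) {
        byte b = 100;       // OK: constant 100 fits in byte
        short s = 30_000;   // OK: fits in short
        char c = 65;        // OK: becomes 'A'
        // byte bad = 200;  // does not compile: 200 is out of byte range
        System.out.println(b + " " + s + " " + c);  // prints "100 30000 A"
    }
}
```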
Now, let's say that our box contains not an int, but a Complex128, a new
numeric type. We are probably not going to add complex literals (e.g.,
1 + 2i) to the language; adding new literal forms for every new type
won't scale. But we will be able to let the author of Complex128 say
that `int` can be widened exactly to `Complex128` (note: this is not the
same thing as letting Complex128 define an "implicit conversion" to int;
it is more constrained). And "is this complex value zero" is a
reasonable question in this domain, so we might well want to ask:
switch (complexBox) {
    case ComplexBox(0) -> "zero";
    ...
}
What does this mean? Well, the framework for primitive patterns gives
us a trivially simple semantics for such constant patterns, even when
the constant is of a different type than the value being tested; the
above pattern match is equivalent to:
case ComplexBox(int x) when x == 0 -> "zero";
That's it; once we give a clear meaning to "can this complex be cleanly
cast to int", the meaning of constant patterns falls out for free.
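Constant patterns do not exist yet, but the desugared form can already
be written in today's Java (21+) with a guarded record pattern, which
is exactly the role the proposed constant pattern would play:

```java
// Today's spelling of "match the constant zero inside a record":
// a record pattern plus a `when` guard.
public class ZeroBox {
    record IntBox(int x) { }

    static String classify(IntBox box) {
        return switch (box) {
            case IntBox(int x) when x == 0 -> "zero";
            case IntBox(int x)             -> "nonzero: " + x;
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(new IntBox(0)));  // zero
        System.out.println(classify(new IntBox(7)));  // nonzero: 7
    }
}
```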
Now, I offer this example not as "so therefore all of JEP 507 is
justified", but as a window into the fact that JEP 507 doesn't exist in
a design vacuum -- it is related to numerous other directions the
language is going in (here, constant patterns, and user-definable
numeric types that are convertible with existing numeric types.) Some
JEPs are the end of their own story, whereas others are more about
leveling the ground for future progress.
On 10/15/2025 2:34 AM, Stephen Colebourne wrote:
> In the vein of JEP feedback, I believe it makes sense to support
> primitive types in pattern matching, and will make sense to support
> value types in the future. And I can see the great work that has been
> done so far to enable this.
>
> Unfortunately, I hate the proposed syntactic approach in JEP 507. It
> wasn't really clear to me as to *why* I hated the syntax until I had
> enough time to really think through what Java does in the area of
> primitive type casts, and why extending that as-is to pattern matching
> would IMO be a huge mistake.
>
> (Please note that I fully grasp the pedagogical approach wrt
> instanceof defending an unsafe cast, but no matter how much it is
> repeated, I don't buy it, and I don't believe it is good enough by
> itself.)
>
> To capture my thoughts, I've written up how Java's current approach to
> casts leads me to an alternative proposal - type conversion casts, and
> type conversion patterns:
> https://tinyurl.com/typeconvertjava1
>
> thanks
> Stephen