Type primitive pattern: the problem with lossy conversions
Brian Goetz
brian.goetz at oracle.com
Fri Sep 12 17:30:57 UTC 2025
On 9/11/2025 11:55 AM, Brian Goetz wrote:
>
> I explicitly asked you to answer a question on the semantics when
> these conversions were NOT involved before we moved onto these,
> because I think you are conflating two separate things, and in order
> to get past just repeating "but...lossy!", we have to separate them.
I see Remi has lost interest in this discussion, so I will play both
parts of the dialogue on his behalf now.
> Let's start with the easy ones:
>
> Object p = "foo"; // widen String to Object
> Object q = Integer.valueOf(3); // widen Integer to Object
> ...
> if (p instanceof String s) { ... } // yes it is
> if (q instanceof String s) { ... } // no it isn't
>
> We can widen String and Integer to Object; we can safely narrow p back
> to String, but we can't do so for q, because it is "outside the range"
> of references to String (which embeds in "references to Object".)
> Does this make sense so far?
Remi: It has always been this way.
> OK, now let's do int and long.
>
> long small = 3;
> long big = Long.MAX_VALUE;
>
> if (small instanceof int a) { ... } // yes, it is
> if (big instanceof int b) { ... } // no, it isn't
>
> What these questions are asking is: can I safely narrow these longs to
> int, just like the above. In the first case, I can -- just like with
> String and Object. In the second, I can't -- just like with Integer
> and Object. Do we agree these are the same?
Remi: Yes Socrates, I believe they are.
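The long-to-int question can be approximated in plain Java, without primitive patterns in `instanceof`, by a round-trip cast. This is only a minimal sketch: the names `LongToInt` and `fitsInInt` are made up for illustration, and the round-trip test is an assumed way of expressing "can be safely narrowed", not the specified implementation.

```java
public class LongToInt {
    // Hypothetical helper: true if v survives long -> int -> long unchanged,
    // which is the case exactly when v lies in the int range.
    public static boolean fitsInInt(long v) {
        return (long) (int) v == v;
    }

    public static void main(String[] args) {
        long small = 3;
        long big = Long.MAX_VALUE;
        System.out.println(fitsInInt(small)); // true
        System.out.println(fitsInInt(big));   // false
    }
}
```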
> OK, now let's do int and double.
>
> double zero = 0;
> double pi = 3.14d;
>
> if (zero instanceof int i) { ... }
> if (pi instanceof int i) { ... }
>
> Same thing! The first exactly encodes a number that is representable
> in int (i.e., could have arisen from widening an int to double), the
> latter does not.
Remi: It could be no other way, Socrates.
But we can replace double with float in the previous example, and
nothing changes:
> float zero = 0;
> float pi = 3.14f;
>
> if (zero instanceof int i) { ... }
> if (pi instanceof int i) { ... }
Here, we are asking a sound question: does the number encoded in this
`float` exactly represent an `int`? For `zero`, the answer is "of
course"; for `pi`, the answer is "obviously not."
Remi: Yes, Socrates, that is clearly evident.
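The same "is this value exactly an int?" question for `float` and `double` can be sketched with ordinary casts. The class name `FloatToInt` and the method `isExactInt` are hypothetical, and the rejection of negative zero is an assumption made here (an `int` cannot carry the sign of -0.0). The `float` case goes through `double` first, because a direct `float` round trip would wrongly accept 2^31: the cast to `int` saturates at `Integer.MAX_VALUE`, which rounds back up to 2^31 as a `float`.

```java
public class FloatToInt {
    // Hypothetical helper: true if d holds exactly an int value.
    public static boolean isExactInt(double d) {
        // (int) d truncates toward zero, saturates at the int range, and maps NaN to 0.
        int i = (int) d;
        // int -> double is always exact, so equality means the narrowing lost nothing.
        boolean roundTrips = (double) i == d;
        // Assumption: -0.0 is rejected because int has no negative zero.
        boolean negativeZero =
                Double.doubleToRawLongBits(d) == Double.doubleToRawLongBits(-0.0);
        return roundTrips && !negativeZero;
    }

    // float -> double is exact, so the float case can reuse the double check.
    public static boolean isExactInt(float f) {
        return isExactInt((double) f);
    }

    public static void main(String[] args) {
        float zero = 0;
        float pi = 3.14f;
        System.out.println(isExactInt(zero)); // true
        System.out.println(isExactInt(pi));   // false
    }
}
```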
I'm entering guessing territory here, but I'm pretty sure I
understand what's making you uncomfortable. But, despite the clickbaity
headline and misplaced claims of wrongness, this really has nothing to
do with pattern matching at all! It has to do with the existing
regrettable treatment of some conversions (which we well understand are
problematic, and are working on), and the fact that by its very duality,
pattern matching _exposes_ the inconsistency that while most "implicit"
conversions (more precisely, those allowed in method and assignment
context) are required to be "safe", we allow several lossy conversions
in these contexts too (int -> float, long -> float, long -> double).
(The argument then makes the leap that "because this exposes a
seeming inconsistency, the new feature must be wrong, and should be
changed." But it is neither fair nor beneficial to blame the son, who
is actually doing it right, for the sins of the fathers.)
In other words, you see these two cases as somehow so different that we
should roll back several years of progress just to avoid acknowledging
the inconsistency:
    float f = 0;
    if (f instanceof int i) { ... }
and
    float g = 200_000_007; // lossy
    if (g instanceof int i) { ... }
because in the first case, we are merely "recovering" the int-ness of
something that was an int all along, but in the second case, we are
_throwing away_ some of the int value and then trying to recover it, but
without the knowledge that a previous lossy operation happened.
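To make the second case concrete, here is a small sketch of what happens to 200_000_007 on its way through `float`. The names `LossyAssignment` and `throughFloat` are invented for the example. The point it illustrates is that the information is lost at the assignment, before any later narrowing or test takes place.

```java
public class LossyAssignment {
    // Hypothetical helper: store an int in a float, then narrow it back.
    public static int throughFloat(int v) {
        float g = v;     // compiles without a cast: int -> float is allowed in assignment context
        return (int) g;  // narrowing back requires an explicit cast
    }

    public static void main(String[] args) {
        int original = 200_000_007;
        // Near 2 * 10^8 adjacent float values are 16 apart, so the low digits are rounded away.
        System.out.println(original);               // 200000007
        System.out.println(throughFloat(original)); // 200000000
    }
}
```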
But blaming pattern matching is blaming the messenger. Your beef is
with the assignment to g, that we allow a lossy assignment without at
least an explicit cast. And this does seem "inconsistent"! We
generally go out of our way to avoid lossy conversions in assignments
and method invocation (by allowing widening conversions but not
narrowing ones) -- except that the conversion int -> float is considered
(mystifyingly) a "widening conversion."
Except there's no mystery. This was a compromise made in 1995 born of
a desire for Java to not seem "too surprising" to C programmers. So
this conversion (and two of its irresponsible friends) were
characterized as "widenings", when in fact they are not.
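The three conversions in question (int -> float, long -> float, long -> double) can each be shown to lose information with one value just past the target type's exactly representable integer range. This is an illustrative sketch only; the class and method names are hypothetical, and the test values (2^24 + 1 and 2^53 + 1) were chosen here, not taken from the discussion.

```java
public class LossyWidenings {
    // Each helper performs the implicit "widening" and then narrows back explicitly.
    public static int intViaFloat(int v)     { float f = v;  return (int) f; }
    public static long longViaFloat(long v)  { float f = v;  return (long) f; }
    public static long longViaDouble(long v) { double d = v; return (long) d; }

    public static void main(String[] args) {
        int i = (1 << 24) + 1;     // 16777217: one past what a float's 24-bit significand can hold
        long l = (1L << 53) + 1;   // 9007199254740993: one past a double's 53-bit significand

        System.out.println(intViaFloat(i) == i);          // false
        System.out.println(longViaFloat((long) i) == i);  // false
        System.out.println(longViaDouble(l) == l);        // false
    }
}
```

By contrast, int -> long and int -> double round-trip for every `int` value, which is why they deserve the name "widening".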
If I could go back to 1995 and argue against this, I would. But I can't
do that. What I can do is to acknowledge that this was a regrettable
misuse of the term widening, and not extrapolate from this behavior.
Can we change the language to disallow these conversions in assignment
and method context? Unlikely, that would break way too much code. We
can, though, clarify the terminology in the JLS, and start to issue
warnings for these lossy implicit conversions, and instead encourage an
explicit cast to emphasize the "convert to float, dammit" intentions of
the programmer (which is what we plan to do, we just haven't finalized
this plan yet.) But again, this has little to do with the proper
semantics of pattern matching; it is just that pattern matching is the
mirror that reveals those past mistakes more clearly. It would be
stupid to extrapolate this mistake forward into
pattern matching -- that would make it both more confusing and more
complicated. Instead, we admit our past mistakes and improve the
language as best as we can. In this case, there was a pretty good
answer here, despite the legacy warts.
Before I close this thread, I need to reiterate that just because there
is a seeming inconsistency, this is not necessarily evidence that the
_most recent move_ is a mistake. So next time, if you see an
inconsistency, instead of thumping your shoe on the table and crying
"mistake! mistake!", you could ask these questions instead:
- This seems like an inconsistency, but is it really?
- If this is an inconsistency, does that mean that one case or the
  other is a mistake?
- If there is a mistake, can it be fixed? If not, should we consider
  changing course to avoid confusion, or are we better off living with a
  small inconsistency to get a greater benefit?
These are the kinds of questions that language designers grapple with
every day.