Type primitive pattern: the problem with lossy conversions
Brian Goetz
brian.goetz at oracle.com
Fri Sep 12 18:32:06 UTC 2025
Indeed so. Some people are having a hard time making the shift away
from thinking of `instanceof` as purely a type query. And this is one
of the costs of the decision we made to "lump" rather than "split" in
the surface syntax when we first added patterns; we had a choice of
picking a new keyword
(e.g., "matches") or extending instanceof to support patterns. This
choice had pros and cons on both sides of the ledger, and having made a
choice, we now have to live with whatever cons that choice had. (FTR I
am completely convinced that this was the better choice, but that
doesn't make the cons go away.)
In C#, they spelled their `instanceof` operator `is`, and in that world,
the "this is only about types" interpretation is less entrenched, for
purely syntactic reasons; both
if (anObject is String)
and
if (aFloat is int)
seem pretty natural. It will take some time for people to reprogram
this accidental association, but once they do, it won't be a problem.
On 9/12/2025 2:01 PM, Archie Cobbs wrote:
> This is just an aside (I agree with everything Brian has said).
>
> I think a lot of the "uncomfortableness" comes from a simple mental
> model adjustment that needs to occur.
>
> If "42 instanceof float" feels normal to you, then no adjustment is
> needed. But I think for some people it feels funny.
>
> Why? Because those people tend to read "x instanceof Y" as "the type
> of x is some subtype of Y".
>
> Why is that a problem? Well, what is "the type of x"? For reference
> types, there is the compile-time type and the runtime type, and instanceof
> is a way to ask about the runtime type of something that the code only
> knows by its compile-time type. So far, so good.
>
> But with primitives, there is no distinction between compile-time type and
> runtime type. An "int" is always an ordered sequence of 32 bits.
>
> So applying the traditional understanding of "instanceof" leads you to
> this: "42 instanceof float" is obviously false, because the statement
> "int is some subtype of float" is false.
>
> So, int is not a subtype of float -- but some ints are representable as
> floats. The latter property is what matters here.
>
> So perhaps when we talk about this feature, we should start by first
> telling everyone to replace any notion they might have that "x
> instanceof Y" means "type of x is some subtype of Y" with "the value x
> is representable as a Y". After that, the waters should become much
> smoother.
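>
> To make "representable as" concrete, here is a minimal sketch. It
> assumes a JDK with the primitive-type-patterns preview (JEP 455)
> enabled; the variable names are just illustrative:
>
> int small = 42; // exactly representable as a float
> int big = 16_777_217; // 2^24 + 1; the nearest float is 16_777_216.0f
>
> if (small instanceof float f) { ... } // matches: 42.0f round-trips
> if (big instanceof float f) { ... } // no match: the low bit would be lost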
>
> -Archie
>
>
> On Fri, Sep 12, 2025 at 12:31 PM Brian Goetz <brian.goetz at oracle.com>
> wrote:
>
>
>
> On 9/11/2025 11:55 AM, Brian Goetz wrote:
>>
>> I explicitly asked you to answer a question on the semantics when
>> these conversions were NOT involved before we moved onto these,
>> because I think you are conflating two separate things, and in
>> order to get past just repeating "but...lossy!", we have to
>> separate them.
>
> I see Remi has lost interest in this discussion, so I will play
> both parts of the dialogue on his behalf now.
>
>> Let's start with the easy ones:
>>
>> Object p = "foo"; // widen String to Object
>> Object q = Integer.valueOf(3); // widen Integer to Object
>> ...
>> if (p instanceof String s) { ... } // yes it is
>> if (q instanceof String s) { ... } // no it isn't
>>
>> We can widen String and Integer to Object; we can safely narrow p
>> back to String, but we can't do so for q, because it is "outside
>> the range" of references to String (which embeds in "references
>> to Object".) Does this make sense so far?
>
> Remi: It has always been this way.
>
>> OK, now let's do int and long.
>>
>> long small = 3;
>> long big = Long.MAX_VALUE;
>>
>> if (small instanceof int a) { ... } // yes, it is
>> if (big instanceof int b) { ... } // no, it isn't
>>
>> What these questions are asking is: can I safely narrow these
>> longs to int, just like the above. In the first case, I can --
>> just like with String and Object. In the second, I can't -- just
>> like with Integer and Object. Do we agree these are the same?
>
> Remi: Yes Socrates, I believe they are.
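>
> To spell out what the question means for long-to-int: under the
> preview semantics it is just a range check, because every long in
> the int range narrows to int and widens back unchanged. A sketch,
> with a hypothetical helper name:
>
> static boolean fitsInInt(long v) { // hypothetical helper, not a JDK method
>     return v >= Integer.MIN_VALUE && v <= Integer.MAX_VALUE;
> }
>
> So fitsInInt(3L) is true, and fitsInInt(Long.MAX_VALUE) is false.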
>
>> OK, now let's do int and double.
>>
>> double zero = 0;
>> double pi = 3.14d;
>>
>> if (zero instanceof int i) { ... }
>> if (pi instanceof int i) { ... }
>>
>> Same thing! The first exactly encodes a number that is
>> representable in int (i.e., it could have arisen from widening an
>> int to double); the latter does not.
>
> Remi: It could be no other way, Socrates.
>
> But we can replace double with float in the previous example, and
> nothing changes:
>
>> float zero = 0;
>> float pi = 3.14f;
>>
>> if (zero instanceof int i) { ... }
>> if (pi instanceof int i) { ... }
>
> Here, we are asking a sound question: does the number encoded in
> this `float` exactly represent an `int`. For `zero`, the answer
> is "of course"; for `pi`, the answer is "obviously not."
>
> Remi: Yes, Socrates, that is clearly evident.
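>
> Even without patterns, today's Java can ask nearly the same
> question with a cast round-trip. A sketch with a hypothetical
> helper name; as I understand the preview's exactness rules, the
> corner cases -0.0f and 2^31 have to be excluded by hand:
>
> static boolean representableAsInt(float f) { // hypothetical helper
>     return (int) f == f // integral, and round-trips through int
>         && f != 0x1p31f // the (int) cast saturates at Integer.MAX_VALUE here
>         && Float.floatToIntBits(f)
>                != Float.floatToIntBits(-0.0f); // -0.0f would lose its sign
> }
>
> So representableAsInt(0.0f) is true and representableAsInt(3.14f)
> is false, just as the patterns above suggest.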
>
>
> I'm now entering guessing territory here, but I'm pretty sure I
> understand what's making you uncomfortable. But, despite the
> clickbaity headline and misplaced claims of wrongness, this really
> has nothing to do with pattern matching at all! It has to do with
> the existing regrettable treatment of some conversions (which we
> have long understood to be problematic, and are working on), and
> the fact
> that by its very duality, pattern matching _exposes_ the
> inconsistency that while most "implicit" conversions (more
> precisely, those allowed in method and assignment context) are
> required to be "safe", we allow several lossy conversions in these
> contexts too (int -> float, long -> float, long -> double).
> (The argument then makes the leap that "because this exposes a
> seeming inconsistency, the new feature must be wrong, and should
> be changed." But it is neither fair nor beneficial to blame the
> son, who is actually doing it right, for the sins of the fathers.)
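>
> For the record, all three of those conversions compile today with
> no cast and no warning, and each can lose information. A quick
> sketch, with literals chosen to sit just past each type's
> exact-integer range:
>
> int i = 16_777_217; // 2^24 + 1
> float fi = i; // rounds to 2^24; (int) fi == 16_777_216
>
> long l = 9_007_199_254_740_993L; // 2^53 + 1
> double dl = l; // rounds to 2^53; (long) dl == l - 1
> float fl = l; // also legal, with only 24 bits of significand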
>
> In other words, you see these two cases as somehow so different
> that we should roll back several years of progress just to avoid
> acknowledging the inconsistency:
>
> float f = 0;
> if (f instanceof int i) { ... }
>
> and
>
> float g = 200_000_007; // lossy
> if (g instanceof int i) { ... }
>
> because in the first case, we are merely "recovering" the int-ness
> of something that was an int all along, but in the second case, we
> are _throwing away_ some of the int value and then trying to
> recover it, but without the knowledge that a previous lossy
> operation happened.
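>
> To see exactly what "without the knowledge" means, assume the
> preview semantics and watch the second case succeed with the wrong
> number:
>
> float g = 200_000_007; // silently rounds to 200_000_000.0f
> if (g instanceof int i) { // matches! 200_000_000 is exactly an int
>     System.out.println(i); // prints 200000000, not 200000007
> }
>
> The pattern faithfully reports what is in g; the damage was done by
> the assignment above it.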
>
> But blaming pattern matching is blaming the messenger. Your beef
> is with the assignment to g, that we allow a lossy assignment
> without at least an explicit cast. And this does seem
> "inconsistent"! We generally go out of our way to avoid lossy
> conversions in assignments and method invocation (by allowing
> widening conversions but not narrowing ones) -- except that the
> conversion int -> float is considered (mystifyingly) a "widening
> conversion."
>
> Except there's no mystery. This was a compromise made in 1995
> borne of a desire for Java to not seem "too surprising" to C
> programmers. So this conversion and two of its irresponsible
> friends were characterized as "widenings", when in fact they are
> not.
>
> If I could go back to 1995 and argue against this, I would. But I
> can't do that. What I can do is to acknowledge that this was a
> regrettable misuse of the term widening, and not extrapolate from
> this behavior. Can we change the language to disallow these
> conversions in assignment and method context? Unlikely; that
> would break way too much code. We can, though, clarify the
> terminology in the JLS, and start to issue warnings for these
> lossy implicit conversions, and instead encourage an explicit cast
> to emphasize the "convert to float, dammit" intentions of the
> programmer (which is what we plan to do, we just haven't finalized
> this plan yet.) But again, this has little to do with the proper
> semantics of pattern matching; it is just that pattern matching is
> the mirror that reveals the bad behavior of existing past mistakes
> more clearly. It would be stupid to extrapolate this mistake
> forward into pattern matching -- that would make it both more
> confusing and more complicated. Instead, we admit our past
> mistakes and improve the language as best as we can. In this
> case, there was a pretty good answer here, despite the legacy warts.
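>
> Concretely, the style we would nudge people toward looks something
> like this (a sketch of the intent only; the warning itself is not
> yet designed):
>
> int n = 200_000_007;
> float f = n; // today: legal and silently lossy; would draw a warning
> float g = (float) n; // explicit cast: "convert to float, dammit"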
>
> Before I close this thread, I need to reiterate that just because
> there is a seeming inconsistency, this is not necessarily evidence
> that the _most recent move_ is a mistake. So next time, if you
> see an inconsistency, instead of thumping your shoe on the table
> and crying "mistake! mistake!", you could ask these questions:
>
> - This seems like an inconsistency, but is it really?
> - If this is an inconsistency, does that mean that one case or
> the other is a mistake?
> - If there is a mistake, can it be fixed? If not, should we
> consider changing course to avoid confusion, or are we better off
> living with a small inconsistency to get a greater benefit?
>
> These are the kinds of questions that language designers grapple
> with every day.
>
>
>
>
> --
> Archie L. Cobbs