Primitives in instanceof and patterns

forax at univ-mlv.fr
Sat Sep 10 08:58:01 UTC 2022


> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Friday, September 9, 2022 8:07:41 PM
> Subject: Re: Primitives in instanceof and patterns

>> The semantics you propose is not to emit a compile error but, at runtime, to check
>> whether the value "i" is between Short.MIN_VALUE and Short.MAX_VALUE.

>> So there is perhaps a syntactic duality, but clearly there is no semantic
>> duality.

> Of course there is a semantic duality here. Specifically, `int` and `short` are
> related by an _embedding-projection pair_. Briefly: given two sets A and B
> (think "B" for "bigger"), an approximation metric on B (a complete partial
> ordering), and a pair of functions `e : A -> B` and `p : B -> A`, they form an
> e-p pair if (a) p . e is the identity function (dot is compose), and e . p
> produces an approximation of the input (according to the metric.)

> The details are not critical here (though this algebraic structure shows up
> everywhere in our work if you look closely), but the point remains: there is an
> algebraic duality here. Yes, when going in one direction, no runtime tests are
> needed; when going in the other direction, because it may be lossy in one
> direction, a runtime test is needed in that direction. Just like with
> `instanceof String` / `case String s` today.

> Anyway, I don't think you're saying what you really mean. Let's not get caught
> up in silly arguments about what "dual" means; that won't be helpful.
I do not disagree with the fact that a dual exists; I disagree that the semantics you propose are that dual (or a good dual, if you prefer). 
A cast on a primitive type applies the same operation to all possible values; the semantics you propose for checking whether an int is a short does not apply the same operation to all values. 
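
To make the disagreement concrete, here is a minimal sketch (the helper names are mine, neither JDK nor proposed API): `e` is the embedding, `p` the projection from the e-p pair above, and `fitsInShort` the range test the proposed pattern semantics would perform.

```java
public class ShortIntDuality {
    // e : short -> int, the lossless widening (embedding)
    static int e(short a) { return a; }

    // p : int -> short, the narrowing cast (projection): the SAME truncation
    // to the low 16 bits is applied to every value, in range or not
    static short p(int b) { return (short) b; }

    // The proposed semantics for `i instanceof short s` instead tests the value,
    // so in-range and out-of-range ints take different paths
    static boolean fitsInShort(int i) {
        return i >= Short.MIN_VALUE && i <= Short.MAX_VALUE;
    }

    public static void main(String[] args) {
        short a = -5;
        System.out.println(p(e(a)) == a);        // true: p . e is the identity on short
        System.out.println(e(p(70000)));         // 4464: e . p only approximates 70000
        System.out.println(fitsInShort(70000));  // false: the range check rejects it
    }
}
```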

The Java 19 semantics of the type pattern with a primitive type is a better dual, in my opinion. 

The idea that the semantics of a primitive type pattern has to be "useful" is a trap. 

[...] 

>> I believe this is exactly what Stephen Colebourne was complaining about when we
>> discussed the previous iteration of this spec: the semantics of the primitive
>> pattern change depending on the definition of the data.

> I think what Stephen didn't like is that there is no syntactic difference
> between a total and partial pattern at the use site. And I get why that made
> him uncomfortable; it's a valid concern, and one could imagine designing the
> language so that total and partial patterns look different. This is one of the
> tradeoffs we have made; I do still think we picked a good one.

>> The remark of Tagir about array patterns also applies here: having a named pattern
>> like Short.asShort() makes the semantics far clearer because it disambiguates
>> between a pattern that requests a conversion and a pattern that does a
>> conversion because the data definition has changed.

> If the language didn't support primitive widening in assignment / method
> invocation context (like Golang does), and instead said "use Integer::toLong
> (or Long::fromInteger) to convert int -> long", then yes, the natural duality
> would be to also represent these as named patterns; then conversions in both
> directions are mediated by API points, total in one direction, partial in the
> other. But that's not the language we have! The language we have allows us to
> provide an int where a long is needed, and the language does the needful.
> Pattern matching allows us to recover whether a value came from a certain type,
> even after we've lost the static type information. Just as we can recover the
> String-ness here:

> Object o = "Foo";
> if (o instanceof String s) { ... }

> because reference type patterns are willing to conditionally reverse reference
> widening, all the same arguments apply to

> long n = 3;
> if (n instanceof int i) { ... }

> And not allowing this makes the language *more* complicated, because now some
> conversions are reversible and some are not, for ad-hoc reasons that no one
> will be able to understand. Can you offer any compelling reason why we should
> be able to recover the String-ness of `o` after a widening, but not the
> int-ness of `n` after a widening?
In the case of instanceof, the type is not lost, because every instance keeps a reference to its class at runtime; a long does not carry a hidden class saying it is really an int in disguise. 
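
A minimal sketch of this point (plain Java, nothing proposed):

```java
public class RuntimeTag {
    public static void main(String[] args) {
        Object o = "Foo";
        // A reference carries its class at runtime, so the widening to Object
        // is reversible by asking the object itself.
        System.out.println(o.getClass().getSimpleName());  // String

        long n = 3;
        // There is no n.getClass(): a long is just 64 bits, with no tag recording
        // that it was widened from an int. Any "int-ness" test must be a value
        // (range) check, not a type lookup.
        System.out.println(n >= Integer.MIN_VALUE && n <= Integer.MAX_VALUE);  // true
    }
}
```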

>> And I'm still worried that we are muddying the waters here: instanceof is about
>> instances and subtyping relationships (hence the name), so extending it to cover
>> non-instances / primitive values is very confusing.

> Sorry, this is a cheap rhetorical trick; declaring words to mean what you want
> them to mean, and then pointing to that meaning as a way to close the argument.

> Yes, saying "instanceof T is about subtyping" is a useful mental model *when the
> only types you can apply it to are those related by inclusion polymorphism*.
> But the restriction of instanceof to reference types is arbitrary (and we've
> already decided to allow patterns in instanceof, which are surely not mere
> subtyping.)

> Regardless, a better way to think about `instanceof` is that it is the
> precondition for "would a cast to this type be safe and useful." In the world
> where we restrict to reference types, the two notions coincide. But the
> safe-cast-precondition is clearly more general (this is like the difference
> between defining the function 2^n on Z, vs on R or C; of course they have to
> agree at the integers, but the continuous exponential function is far more
> useful than the discrete one.) Moreover, the general mental model is just as
> simple: how do you know a cast is safe? Ask instanceof. What does safe mean? No
> error or material loss of precision.
[...] 

"would a cast to this type be safe and useful." 

I think you are overstating how useful a pattern that does a range check is. 
There is no method in the JDK that takes an int and converts it to a short if it is in the right range, throwing an exception otherwise. 
That behavior seems a better fit for a named pattern than for the "default behavior" of the type pattern. 

I believe that defining a range check as the semantics of a primitive type pattern is too clever an idea. 
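
The closest existing JDK shape is `Math.toIntExact(long)`, which narrows long to int and throws on overflow; an int -> short analogue would have to be hand-written. The sketch below uses a hypothetical `toShortExact` helper (my name, not JDK API), mirroring what a named pattern such as `Short.asShort()` would make explicit at the use site:

```java
public class ExactNarrowing {
    // Hypothetical helper mirroring Math.toIntExact(long), but for int -> short.
    // A named pattern would surface this range check explicitly instead of
    // hiding it behind a primitive type pattern.
    static short toShortExact(int value) {
        if ((short) value != value) {
            throw new ArithmeticException("short overflow");
        }
        return (short) value;
    }

    public static void main(String[] args) {
        System.out.println(toShortExact(100));    // 100: in range, narrowed losslessly
        System.out.println(Math.toIntExact(3L));  // 3: the existing long -> int version
        try {
            toShortExact(70000);                  // out of short range
        } catch (ArithmeticException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```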

[...] 

> (I strongly encourage everyone to re-read JLS Ch5, and to meditate on *why* we
> have the particular conversions in the contexts we have. They're complex, but
> not arbitrary; if you listen closely to the specification, it sometimes
> whispers to you.)

I don't disagree that users may want what you call the dual of a cast to a primitive; I disagree that it has to come as a type pattern rather than as a named pattern. 

Rémi 
