Primitives in instanceof and patterns
Brian Goetz
brian.goetz at oracle.com
Fri Sep 9 18:07:41 UTC 2022
>
> The semantics you propose is not to emit a compile error but at
> runtime to check if the value "i" is between Short.MIN_VALUE and
> Short.MAX_VALUE.
>
> So there is perhaps a syntactic duality but clearly there is no
> semantic duality.
Of course there is a semantic duality here. Specifically, `int` and
`short` are related by an _embedding-projection pair_. Briefly: given
two sets A and B (think "B" for "bigger"), an approximation metric on B
(a complete partial ordering), and a pair of functions `e : A -> B` and
`p : B -> A`, they form an e-p pair if (a) p . e is the identity
function (dot is compose), and (b) e . p produces an approximation of the
input (according to the metric.)
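Concretely, for `short` and `int` (a sketch only; `embed` and `project`
are illustrative names, not proposed API):

    static int embed(short s) {      // e : short -> int, total and lossless
        return s;                    // widening primitive conversion
    }

    static short project(int i) {    // p : int -> short, possibly lossy
        return (short) i;            // narrowing primitive conversion
    }

    // p . e is the identity: project(embed(s)) == s for every short s.
    // e . p only approximates: embed(project(i)) == i exactly when
    // Short.MIN_VALUE <= i && i <= Short.MAX_VALUE.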
The details are not critical here (though this algebraic structure shows
up everywhere in our work if you look closely), but the point remains:
there is an algebraic duality here. Yes, no runtime test is needed when
going in one direction; because the other direction may be lossy, a
runtime test is needed there. Just like with `instanceof String` /
`case String s` today.
Anyway, I don't think you're saying what you really mean. Let's not get
caught up in silly arguments about what "dual" means; that won't be
helpful.
> Moreover, the semantics you propose is not aligned with the concept of
> data-oriented programming, which says that the data are more important
> than the code, so we should try to raise a compile error when the data
> changes, to help the developer change the code.
>
> If we take a simple example
> record Point(int x, int y) { }
> Point point = ...
> switch (point) {
>     case Point(int i, int j) -> ...
>     ...
> }
>
> let's say now that we change Point to use longs
> record Point(long x, long y) { }
>
> With the semantics you propose, the code still compiles, but the pattern
> is now transformed into a partial pattern that will not match all Points
> but only the ones with x and y between Integer.MIN_VALUE and
> Integer.MAX_VALUE.
This is an extraneous argument; if you change the declaration of Point
to take two Strings, of course all the use sites will change their
meaning. Maybe they'll still compile but mean something else, maybe
they will be errors. Patterns are not special here; the semantics of
nearly all language features (assignment, arithmetic, etc) will change
when you change the type of the underlying arguments. That the meaning
of patterns also changes when you change the types involved is just more
of the same.
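For example (a sketch, deliberately having nothing to do with patterns):

    record Point(int x, int y) { }

    Point p = new Point(2_000_000_000, 2_000_000_000);
    long sum = p.x() + p.y();   // int addition overflows, then widens:
                                // sum == -294967296

    // After changing the declaration to
    //     record Point(long x, long y) { }
    // the same line still compiles, but now adds longs: sum == 4000000000.
    // The use site changed meaning with no edit and no error.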
> I believe this is exactly what Stephen Colebourne was complaining about
> when we discussed the previous iteration of this spec: the semantics of
> the primitive pattern change depending on the definition of the data.
I think what Stephen didn't like is that there is no syntactic
difference between a total and partial pattern at the use site. And I
get why that made him uncomfortable; it's a valid concern, and one could
imagine designing the language so that total and partial patterns look
different. This is one of the tradeoffs we have made; I do still think
we picked a good one.
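To make the tradeoff concrete (a sketch, assuming the proposed
semantics):

    record Point(long x, long y) { }

    static String describe(Point p) {
        return switch (p) {
            // Partial: matches only when both components fit in an int.
            case Point(int i, int j) -> "small: " + i + ", " + j;
            // Total over the remaining Points.
            case Point(long a, long b) -> "large: " + a + ", " + b;
        };
    }

    // Had Point been declared with int components, the first case alone
    // would be total -- and nothing at the use site tells you which
    // situation you are in.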
> The remark of Tagir about array patterns also works here: having a
> named pattern like Short.asShort() makes the semantics far clearer,
> because it disambiguates between a pattern that requests a conversion
> and a pattern that does a conversion because the data definition has
> changed.
If the language didn't support primitive widening in assignment / method
invocation context (as is the case in Golang), and instead said "use
Integer::toLong (or Long::fromInteger) to convert int -> long", then
yes, the natural duality would be to also represent these as named
patterns; then conversions in both directions are mediated by API
points, total in one direction, partial in the other. But that's not
the language we have! The language we have allows us to provide an int
where a long is needed, and the language does the needful. Pattern
matching allows us to recover whether a value came from a certain type,
even after we've lost the static type information. Just as we can
recover the String-ness here:
Object o = "Foo";
if (o instanceof String s) { ... }
because reference type patterns are willing to conditionally reverse
reference widening, all the same arguments apply to
long n = 3;
if (n instanceof int i) { ... }
And not allowing this makes the language *more* complicated, because now
some conversions are reversible and some are not, for ad-hoc reasons
that no one will be able to understand. Can you offer any compelling
reason why we should be able to recover the String-ness of `o` after a
widening, but not the int-ness of `n` after a widening?
> And I'm still worried that we are muddying the waters here; instanceof
> is about instances and subtyping relationships (hence the name), so
> extending it to cover non-instance / primitive values is very confusing.
Sorry, this is a cheap rhetorical trick; declaring words to mean what
you want them to mean, and then pointing to that meaning as a way to
close the argument.
Yes, saying "instanceof T is about subtyping" is a useful mental model
*when the only types you can apply it to are those related by inclusion
polymorphism*." But the restriction of instanceof to reference types is
arbitrary (and we've already decided to allow patterns in instanceof,
which are surely not mere subtyping.)
Regardless, a better way to think about `instanceof` is that it is the
precondition for "would a cast to this type be safe and useful." In the
world where we restrict to reference types, the two notions coincide.
But the safe-cast-precondition is clearly more general (this is like the
difference between defining the function 2^n on Z, vs on R or C; of
course they have to agree at the integers, but the continuous
exponential function is far more useful than the discrete one.)
Moreover, the general mental model is just as simple: how do you know a
cast is safe? Ask instanceof. What does safe mean? No error or
material loss of precision.
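In other words (a sketch, assuming the proposed semantics):

    Object o = "Foo";
    if (o instanceof String s) { ... }    // true: (String) o is safe

    long small = 42L;
    if (small instanceof int i) { ... }   // true: (int) small preserves the value

    long big = 1L << 40;
    if (big instanceof int j) { ... }     // false: (int) big would silently
                                          //        lose information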
A more reasonable way to state this objection would be: "most users
believe that `instanceof` is purely about subtyping, and it will take
some work to bring them around to a more general interpretation, how are
we going to do that?"
Jumping up a level, you're throwing a lot of arguments at the wall that
mostly come down to "I don't like this feature, so let me try and
undermine it." That's not a particularly helpful way to go about this,
and none of the arguments so far have been very compelling (nor are they
new from the last time we went around on it.) I get that you would like
pattern matching to have a more "surface" role in the language; that's a
valid opinion. But I would also like you to try harder to understand
what we're trying to achieve and why we're pushing it deeper, and to
respond to the substance of the proposal rather than just saying "YAGNI".
(I strongly encourage everyone to re-read JLS Ch5, and to meditate on
*why* we have the particular conversions in the contexts we have.
They're complex, but not arbitrary; if you listen closely to the
specification, it sometimes whispers to you.)