Primitives in instanceof and patterns

forax at univ-mlv.fr
Mon Sep 12 22:28:58 UTC 2022


----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Sunday, September 11, 2022 4:48:04 PM
> Subject: Re: Primitives in instanceof and patterns

>>>
>>> I think you're falling into the trap of examining each conversion and
>>> asking "would I want a pattern to do this."
>> Given that only primitive widening casts are safe, allowing only primitive
>> widening is another way to answer to the question what a primitive type pattern
>> is.
>> You are proposing a semantics using range checks, that's the problem.
> 
> So, substitute "reference" for "primitive" in this argument, and you
> will see how silly it is: "since only reference widening is safe,
> allowing only reference widening would be 'another answer to what a
> reference type pattern is.'"  But that would also be a useless
> semantic.  You're caught up on "range checks", but that's not the
> important thing here.  Casting is the important thing.

In fact, primitive widening is not a good idea; see my answer about the constant pattern.

> 
>> As an example, instanceof rules and the rules about overriding methods
>> are intimately linked, asking if a method override another is
>> equivalent to asking if their function types are a subtypes.
>> if int instanceof double is allowed, then B::m should override A::m
>>    class A {
>>      int m() { ... }
>>    }
>>    class B extends A {
>>      @Override
>>      double m() { ... }
>>    }
>>
>> This is what i meant by changing other rules.
> 
> Another cute argument, but no.  Covariant overriding is linked to
> *subtyping*.  Instanceof *happens to coincide* with subtyping right now
> (given its ad-hoc restrictions), but the causality goes the other way.
> (Casting also appeals to subtyping, through reference widening
> conversions.)  But this argument is like starting with "all men are
> mortal" and "Socrates is a man" and concluding "All men are Socrates."
> 
> We can talk about whether it would be wise to align the definition of
> covariant overrides with conversions other than reference widening (and
> will likely come up again in Valhalla anyway), but this is by no means a
> forced move, and not tied to generalizing the semantics of instanceof.

If we can avoid having ten different semantics for casting, patterns, overriding, etc., I think it's a win.

Valhalla is another can of worms, because this prematurely assigns a semantics to instanceof int, so Valhalla cannot later retcon instanceof int to instanceof Qjava/lang/Integer; even though, unlike the primitive type int, Qjava/lang/Integer; is an object.

There is another mismatch: int.class.isInstance(o) and o instanceof int are no longer aligned.
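To make the mismatch concrete, here is a small sketch (the `o instanceof int` line is the proposed syntax, which does not compile today, so it is left as a comment):

```java
public class InstanceMismatch {
    public static void main(String[] args) {
        Object o = 42;  // autoboxed to an Integer
        // Class.isInstance always returns false for a primitive class,
        // because int.class has no instances at the reflection level
        System.out.println(int.class.isInstance(o));      // false
        System.out.println(Integer.class.isInstance(o));  // true
        // under the proposed semantics the following would be true,
        // disagreeing with int.class.isInstance(o):
        // System.out.println(o instanceof int);  // hypothetical syntax
    }
}
```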

> 
>> I found a way to explain clearly why a reference type pattern and a
>> primitive type pattern are different.
>>
>> Let suppose that the code compiles (to avoid the issues of the
>> separate compilation),
>> unlike a reference type pattern, the code executed for a primitive
>> type pattern is a function of *both* the declared type and the pattern
>> type.
> 
> So (a) untrue -- what code we execute for a reference type pattern does
> depend on the static types -- we may or may not generate an `instanceof`
> instruction, depending on whether the pattern is unconditional.  (The
> same is true for a cast; some casts are no-ops and generate no code.)

Please take a look at the examples: in both cases, if the code compiles, it means that the first pattern is conditional and the second unconditional.

> And (b), so what?  We're asking "would it be safe to cast x to T".
> Depending on the types X and T, we will have different code for the
> casting, so why is it unreasonable to have different code for asking
> whether it is castable ?

See below.

> 
>>
>> By example, if i have a code like this, i've no idea what code is
>> executed for case Foo(int i) without having to go to the declaration
>> of Foo which is usually not collocated with the switch itself.
>>
>>    Foo foo = ...
>>    switch (foo) {
>>       case Foo(int i) -> {}
>>       case Foo(double d) -> {}
>>     }
> 
> Sigh, this argument again?  We've been through this extensively the
> first time around, with reference types, where you "had no idea what
> this code means" without looking at the declaration of the pattern.
> (Then, it was partiality and totality.)  I get that you didn't like that
> total and partial patterns don't look syntactically different, and that
> ship has sailed.  But this is the same argument warmed over.

Nope, please take a look at the example: in both cases, if the code compiles, it means that the first pattern is conditional and the second unconditional.

> 
> "What code will be executed" is irrelevant; what is relevant is the
> semantics.  Assuming a single deconstruction pattern for Foo, the first
> case asks "can the Foo's component be cast safely to int, and if so,
> please cast it for me".  It doesn't matter what code we use to answer
> that question or do the cast -- could be a narrowing, could be an
> unboxing, whatever.

It matters because the first pattern is conditional, so it's important to know the condition, at least when you debug.

> 
> You see the same thing today without patterns:
> 
>     var x = foo.getFoo();
>     int i = (int) x;
> 
> x could be a long, an int, an Integer, etc, but you don't know unless
> you look at the definition of getFoo().  And you have "no idea what code
> will be executed."  Sure, but so what?  You asked for a cast to int.
> The language validated that x is castable to int, and does what needs to
> be done, which might be nothing, or a widening, or a narrowing with
> truncation, or an unboxing, or some combination.

There is a big difference between

  var x = foo.getFoo();
  if (x instanceof int) { ... }

and the code above when you are reading the code.

The issue with the semantics you propose is that the pattern expresses a condition, but the condition is hidden.

With a cast there is no condition; it is always executed.
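Since `x instanceof int` is not legal syntax today, the same reading problem can be sketched with reference types: the pattern is a hidden condition that may silently skip the body, while the cast always executes and fails loudly.

```java
public class ConditionVsCast {
    static String describe(Object x) {
        // the pattern hides a condition: the branch may simply not run
        if (x instanceof Integer i) {
            return "matched " + i;
        }
        return "not matched";
    }

    public static void main(String[] args) {
        Object x = 3L;                    // a Long, not an Integer
        System.out.println(describe(x));  // not matched
        // a cast has no condition: it always executes, and fails loudly
        try {
            int i = (int) (Integer) x;    // throws ClassCastException
        } catch (ClassCastException e) {
            System.out.println("cast failed");
        }
    }
}
```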

> 
> (When we get to overloading deconstruction patterns, we'll have all the
> same issues as we have with overloading methods today -- it is not
> obvious looking only at the call site, which overload is called, and
> therefore which conversions are applied to arguments or returns.)

We do not need overloading of patterns!
I repeat.
We do not need overloading of patterns!

We need overloaded constructors because if the canonical constructor takes 3 arguments and we want a constructor with two, we have to provide a value for the third.
In the case of pattern methods / deconstructors, we can match all three arguments but use '_' for the ones we want to drop.
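For instance, with record patterns (final in Java 21) a single canonical deconstruction is enough; an unused component is simply bound and ignored, and with unnamed patterns (final in Java 22) it can be written as `_`. A sketch with a hypothetical Point record:

```java
public class DropComponent {
    record Point(int x, int y, int z) {}  // hypothetical example record

    static int firstTwoSum(Point p) {
        return switch (p) {
            // match all three components; the unused binding z could be
            // written as '_' with unnamed patterns (Java 22), so no
            // two-component overload of the deconstructor is needed
            case Point(int x, int y, int z) -> x + y;
        };
    }

    public static void main(String[] args) {
        System.out.println(firstTwoSum(new Point(1, 2, 3)));  // 3
    }
}
```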


> 
> As a reminder, here's what a nested pattern means:
> 
>     x matches P(Q) === x matches P(var q) && q matches Q
> 
> Understanding what is going to happen involves understanding the type of
> `q`.  I get that you didn't like that choice, and that's your right, but
> it's not OK to keep bringing it up as if its a new thing.

If the switch is exhaustive, the patterns below will usually (it's not entirely true) help.
  switch(x) {
    case P(Q q) ->
    case P(R r) ->
  }

The last pattern is unconditional, so the first one amounts to an r instanceof Q check.
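Concretely, with an assumed hierarchy where Q extends R and R is the declared component type of P:

```java
public class NestedPatterns {
    static class R {}
    static final class Q extends R {}
    record P(R r) {}  // hypothetical record with an R component

    static String classify(P p) {
        return switch (p) {
            case P(Q q) -> "Q";  // conditional: performs r instanceof Q
            case P(R r) -> "R";  // unconditional on the component type
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(new P(new Q())));  // Q
        System.out.println(classify(new P(new R())));  // R
    }
}
```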

> 
> 
> I think I actually understand your concern here, which has nothing to do
> with the dozen or so bogus examples and explanations you've tossed out
> so far.  It is that cast conversion is complicated, and you would like
> pattern matching to be "simple", and so pulling in the muck of cast
> conversion into pattern matching feels to you like an unforced error.
> Right?  (And if so, perhaps you could have just said that, instead of
> throwing random arguments at the wall?)

Nope,
let me recapitulate.

1) Having a primitive pattern do a range check is useless, because it is rare in real life that you want to do a range check + cast.
  How many people have written code like this?

   int i = ...
   if (i >= Byte.MIN_VALUE && i <= Byte.MAX_VALUE) {
     byte b = (byte) i;
     ...
   }

  It's useful when you write a bytecode generator without using an existing library, ok, but how many people write a bytecode generator?
  It should not be the default behavior for the primitive type pattern.
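For comparison, without the guard the narrowing cast silently truncates, which is exactly what the range check protects against:

```java
public class RangeCheck {
    public static void main(String[] args) {
        int i = 1000;
        // unguarded narrowing: silently keeps only the low 8 bits
        byte truncated = (byte) i;
        System.out.println(truncated);  // -24, not 1000
        // the check that `i instanceof byte b` would perform under
        // the proposed semantics
        boolean inRange = i >= Byte.MIN_VALUE && i <= Byte.MAX_VALUE;
        System.out.println(inRange);    // false
    }
}
```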

2) It's also useless because there is no need to have it as a pattern when you can use a cast in the following expression:
   Person person = ...
   switch(person) {
     // instead of
     // case Person(double age) -> foo(age);
     // one can write
     case Person(int age) -> foo(age);  // widening cast
   }

3) When you read a conditional primitive pattern, you have no idea what the underlying operation is until you go to the declaration (unlike the code just above).


4) If we change the type pattern to be not just about subtyping, we should revisit the JLS to avoid having too many different semantics.

Rémi

