value type hygiene

John Rose john.r.rose at oracle.com
Tue May 15 22:13:19 UTC 2018


> On May 15, 2018, at 12:32 PM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
> 
> 
> 
>> On May 14, 2018, at 11:36 PM, John Rose <john.r.rose at oracle.com> wrote:
>> ...
>> Eventually when int[] <: Object[], then int[].class.getClass().getDefaultValue()
>> will return an appropriate zero value, at which point the above behavior will
>> "work like an int".
>> 
>> Another way to make this API point "work like an int" would be to throw an
>> exception (ASE or the like), on the grounds that you can't store a null into
>> an int[] so you shouldn't be able to store a null into a Point[].
>> 
> 
> A third approach could be to check if the array is non-nullable and, if so, not store a default value. That may be surprising, but storing a default is arguably less useful in general for arrays of value types, though it is, I suppose, mostly harmless (I am thinking of cases where a value type has a default that is hostile to operate on, like perhaps LocalDate).

Sure, it could just do nothing if the array doesn't accept a null.

It really depends on what "null" means in this use case.

Remember the basic reason that toArray puts a null into the first
unused spot:  It is restoring it to (approximately) what the array
looked like (at that spot) when it was first created.  The toArray
function doesn't know that this is a valid sentinel, but it is the
most reasonable deterministic value to store, since it was there
at the beginning.  And, in the common case where toArray is
handed a fresh array (but unfortunately too long), the stored
null has no effect at all:  It overwrites the null that was there
from the beginning.
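
For concreteness, here is a minimal sketch of that behavior as it can be observed today with ArrayList.  The class and method names (ToArrayNullDemo, copyInto) are made up for illustration; only List.toArray(T[]) itself is the real API point under discussion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ToArrayNullDemo {
    // Hypothetical helper: copies the list into a caller-supplied array.
    static String[] copyInto(List<String> list, String[] target) {
        return list.toArray(target);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b"));

        // Fresh (but too long) array: slot 2 already holds null, so the
        // null stored by toArray has no visible effect.
        String[] fresh = copyInto(list, new String[4]);
        System.out.println(Arrays.toString(fresh));   // [a, b, null, null]

        // Reused array: the first unused slot is reset to null,
        // and the slot after it is left alone.
        String[] reused = {"x", "x", "x", "x"};
        copyInto(list, reused);
        System.out.println(Arrays.toString(reused));  // [a, b, null, x]
    }
}
```

Note that only the first unused slot is touched, which is why the stored null is at best an approximate restoration of the array's initial state.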

Now, hmmm, how would we do a similar thing with flattened
arrays…?

Basically, I'm saying that today's spec. should not say toArray
"stores a null", but rather "stores the default value of the
array element".  Which today is null, and tomorrow is something
perhaps more interesting.
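
A rough sketch of what "stores the default value of the array element" could mean, using only today's reflection API.  The helpers defaultElement and resetSlot are hypothetical names, not proposed API points; the trick of reading slot 0 of a fresh one-element array is just one way to get at the initial value, and says nothing about how a flattened array would really be handled.

```java
import java.lang.reflect.Array;
import java.util.Arrays;

class ElementDefaultDemo {
    // Hypothetical helper: the value every slot of a freshly created array
    // of this component type holds (null for references, zero for primitives).
    static Object defaultElement(Class<?> componentType) {
        return Array.get(Array.newInstance(componentType, 1), 0);
    }

    // Hypothetical helper: reset one slot of any array to its initial value.
    static void resetSlot(Object array, int index) {
        Class<?> componentType = array.getClass().getComponentType();
        Array.set(array, index, defaultElement(componentType));
    }

    public static void main(String[] args) {
        String[] refs = {"a", "b", "stale"};
        int[] ints = {1, 2, 99};
        resetSlot(refs, 2);
        resetSlot(ints, 2);
        System.out.println(Arrays.toString(refs)); // [a, b, null]
        System.out.println(Arrays.toString(ints)); // [1, 2, 0]
    }
}
```

For a reference array this degenerates to storing null, which is exactly today's behavior; for int[] it stores 0, which is the "work like an int" outcome.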

Given that we don't control all implementations of List, we
can't upgrade all their toArray methods, which means we
have to either be weaselly or hack the JVM to convert
null to default (see Maurizio's suggestion).

Weasel words would be "if toArray is handed a flattened
array, the implementation may choose to throw NPE or
reset the array element to its flat initial value.  Implementors
are encouraged to do the latter.  All java.base implementations
do so."

>> ...
>> In the case of the List API it's more useful for the container, which is attempting
>> to contain all kinds of data, to bend a little and store T.default as the proper
>> generalization of null.  Under this theory, Object.default == null, and X.default
>> is also null for any non-value, non-primitive X.  (Including Integer but not int.)
> 
> Agreed, i just wanted to do the thought experiment given the current behavior of List/ArrayList as if it's unmodified legacy code.

Yup; see above.

> 
>> ...
>> (As I replied to Frederic, it is technically possible to imagine a system of
>> non-flat versions of VT[] co-existing with flat versions of VT[] but we shouldn't
>> do that just because we can, but because there is a proven need and not
>> doing it is even more costly than doing it.  There are good substitutes for
>> non-flat VT[], such as Object[] and I[] where VT <: I.  We can even contrive
>> to gain static typing for the substitutes, by using the ValueRef<VT> device.)
>> 
>>> since Arrays.copyOf operates reflectively on the argument’s class and not additional runtime properties. 
>> 
>> I don't get this.  What runtime properties are you thinking of?  ValueClasses?
>> That exists to give meaning to descriptors.  The actual Class mirror always
>> knows exactly whether it is a value class or not, and thus whether its arrays
>> are flat or not.
>> 
> 
> Ok, i was unsure about the class mirror, and whether there would be runtime associated with the array instance.

This is a big difference (as I noted to Frederic) between array elements and
object fields.  For array elements, we need global agreement on flattenability
(or nullability, or a polymorphic mix of both).  So there has to be some sort
of runtime tracking, either in the array type (preferably) or each instance
(yuck!) of whether flattening is happening.  Given the tracking, we can
ask whether any given array is flattenable or not (i.e., nullable or not).

(Note:  VM folks make a fine distinction between flattenability and
flattening, which non-VM folks can ignore.  A field or array element
is flattened when it really gets unboxed and stored as its value components.
A field or array element is merely flattenable when the JVM has tried but
failed to do so for some reason.  To provide deterministic behavior,
the JVM must keep this failure a secret from the user.  Thus, all
flattenable fields and elements reject nulls, even if they are
secretly not flattened, and thus their box pointer *could* be
replaced by a null.)

> 
> And just to be clear so i got this straight in my head...
> 
> ValueWorld
>> 
> value class Point {}
> 
> class A {
>  static void m() {
>    Point[] pa = new Point[10];
>    B.m1(pa); // returns false
>    B.m2(pa); // returns true
>  }
> }

Are you suggesting that new Point[10].getClass() != Point[].class
in some design we are considering?  That would be very surprising.

Anyway, in L-world it's simple like this:

Point[] pa = new Point[10]; // flattened array of 10 values
assert( pa.getClass() == Point[].class );
assert( Object[].class.isAssignableFrom( Point[].class ) );

In U-world, values and refs are under a distinct abstract
top type U-X, with Q-X <: U-X && L-X <: U-X.  In such a design there
could be up to three array types for each X.  If we mix
into L-world a Q-descriptor, so that Q-X <: L-X, to capture
nullable vs. non-nullable descriptions of value types,
we would have two array types for each X.  That is
one way to have it both ways for arrays, a move which
I am resisting pending strong proof that it is really needed.

(And if we need polymorphic array types, it is still
possible to do it without introducing a new descriptor.
It could be a yucky per-instance bit, if we had no other
need for the expensive new descriptor.)

(In this I am assuming that there is a 1-1 correspondence
between field descriptors and Class objects.  That too
is something we could complexify, if it simplified something
else greatly.  You won't be surprised to know that I don't
want to do that move either unless it's forced.  Eventually
we will have to carefully introduce "crass" pointers when
we split a single class into many reified types.  And the
array distinction, if one exists, could be aligned with the
crass/species distinction.  We don't need to do it now,
though.)

> 
> RefWorld
>> 
> final class Point {} // note that the class is declared final
> 
> class B {
>  static boolean m1(Point[] p) {
>    return p.getClass() != Point[].class ;
>  }
> 
>  static boolean m2(Point[] p) {
>    return Point[].class.isAssignableFrom(p.getClass()); 
>  }
> 
> }

If Point is final, then m1 will never return true, right?
(True for both VBC Point and VT Point.)  I'm not sure
what you are saying here.  That VBCs and VTs, and
their arrays, look similar under such type tests?
I agree to that.  In L-world the covariant array typing
just works the same as always, for better or worse.
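
To make the type tests concrete, here is a small sketch that runs on today's JVM, where Point can only be an ordinary final reference class.  The wrapper class ArrayTypeTestDemo is a made-up name; m1 and m2 have the same bodies as in the quoted message.  Whether a value-type Point gives the same answers is the L-world expectation described above, not something this code can demonstrate.

```java
class ArrayTypeTestDemo {
    // Stand-in for the RefWorld declaration; today's Java has no value classes.
    static final class Point {}

    // Same body as m1 in the quoted message.
    static boolean m1(Point[] p) {
        return p.getClass() != Point[].class;
    }

    // Same body as m2 in the quoted message.
    static boolean m2(Point[] p) {
        return Point[].class.isAssignableFrom(p.getClass());
    }

    public static void main(String[] args) {
        Point[] pa = new Point[10];
        System.out.println(m1(pa)); // false: Point is final, so no subclass arrays exist
        System.out.println(m2(pa)); // true
        // Covariant array typing, as always:
        System.out.println(Object[].class.isAssignableFrom(Point[].class)); // true
    }
}
```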


> And, for the same reasons, that also applies to the class mirror for Point in the value world and ref world.
> 
> Which got me thinking of the implications, if any, for checked collections :-) e.g. Collections.checkedList, which currently does:
> 
> E typeCheck(Object o) {
>    if (o != null && !type.isInstance(o))
>        throw new ClassCastException(badElementMsg(o));
>    return (E) o;
> }

So 'o != null' is true, for a couple of reasons, for any value instance o.
And then type.isInstance(o) will be true iff o is of the right type.
That means a null leaks down to the cast (E) even if E is a value
type.  And since the cast (E) is to a type parameter, it will let the null through.
The null will only be detected and rejected downstream when
a client of the generic actually casts the null to a concrete
value type, not just a type parameter.
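
The same kind of leak can be seen today, by analogy, with Integer and int.  This is only an analogy: unboxing to int stands in for "cast to a concrete value type", and the class and method names below are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CheckedNullLeakDemo {
    // Returns true if a checked Integer list accepts and returns a null.
    static boolean checkedListAcceptsNull() {
        List<Integer> checked = Collections.checkedList(new ArrayList<>(), Integer.class);
        checked.add(null);               // passes typeCheck: nulls are let through
        return checked.get(0) == null;
    }

    // Returns true if the null is rejected only at the concrete use site.
    static boolean nullFailsAtConcreteUse() {
        List<Integer> checked = Collections.checkedList(new ArrayList<>(), Integer.class);
        checked.add(null);
        try {
            int value = checked.get(0);  // unboxing a null Integer
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(checkedListAcceptsNull());  // true
        System.out.println(nullFailsAtConcreteUse());  // true
    }
}
```

The checked wrapper never complains; the failure shows up later, at whatever client first needs a non-null value.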

You left out the previous line!
  @SuppressWarnings("unchecked")

The above code should give a warning for an unchecked cast
on (E).  That's also a warning that the VT null checking may
fail.  The fix is to use Class.cast (if we decide that that will
reject nulls properly) or else a new API point Class.castValue
(which DTRT for value types w.r.t. null).

Fixed code:

E typeCheck(Object o) {
  if (o == null) {
    if (type.isValue())
      throw new NullPointerException(badElementMsg(o));
    return null;
  }
  if (!type.isInstance(o))
    throw new ClassCastException(badElementMsg(o));
  @SuppressWarnings("unchecked")
  E result = (E) o;
  return result;
}

or just:

E typeCheck(Object o) {
   return type.castValue(o);
}

Where in Class:

@HotSpotIntrinsicCandidate
public T castValue(Object obj) {
  if (obj == null) {
    if (isValue())
      throw new NullPointerException(cannotCastMsg(obj));
    return null;
  }
  if (!isInstance(obj))
    throw new ClassCastException(cannotCastMsg(obj));
  @SuppressWarnings("unchecked")
  T result = (T) obj;
  return result;
}

(I'm agnostic over whether we can sneak the null check
into Class::cast.  The current arguments about migrating
nullable VBCs to VTs will affect that decision.)
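
For anyone who wants to play with the null-rejecting cast on a current JDK, here is a sketch that takes the "is this a value class?" test as a parameter, since Class.isValue() does not exist today.  CastValueSketch and its Predicate parameter are assumptions made for the experiment, not a proposed signature; treating Integer as "value-like" in main is likewise just a stand-in.

```java
import java.util.function.Predicate;

class CastValueSketch {
    // isValue is a hypothetical stand-in for the Class.isValue() query.
    static <T> T castValue(Class<T> type, Object obj, Predicate<Class<?>> isValue) {
        if (obj == null) {
            if (isValue.test(type))
                throw new NullPointerException("cannot cast null to " + type.getName());
            return null;
        }
        return type.cast(obj); // throws ClassCastException on a type mismatch
    }

    public static void main(String[] args) {
        // Pretend, for this experiment only, that Integer is a value class.
        Predicate<Class<?>> isValue = c -> c == Integer.class;

        System.out.println(Integer.class.cast(null));                // null: today's cast lets it through
        System.out.println(castValue(String.class, null, isValue));  // null: non-value types stay nullable
        try {
            castValue(Integer.class, null, isValue);
        } catch (NullPointerException e) {
            System.out.println("rejected null");                     // rejected null
        }
    }
}
```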

— John



More information about the valhalla-spec-observers mailing list