The interfaces IdentityObject and ValueObject must die !
forax at univ-mlv.fr
Wed Jan 26 09:18:54 UTC 2022
----- Original Message -----
> From: "daniel smith" <daniel.smith at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Wednesday, January 26, 2022 6:20:07 AM
> Subject: Re: The interfaces IdentityObject and ValueObject must die !
>> On Jan 25, 2022, at 2:39 PM, Remi Forax <forax at univ-mlv.fr> wrote:
>>
>> I think we should revisit the idea of having the interfaces
>> IdentityObject/ValueObject.
>>
>> They serve two purposes
>> 1/ documentation: explain the difference between an identity class and a value
>> class
>
> And, in particular, describe the differences in runtime behavior.
Parts of the difference in runtime behavior; you still have the issue of where to document the difference between a value class and a primitive class.
>
>> 2/ type restriction: can be used as type or bound of type parameter for
>> algorithms that only works with identity class
>>
>> Sadly, our design has evolved but those interfaces have not; they do not work
>> well as type restrictions, because
>> the type is lost once an interface/j.l.Object is used and the cost of their
>> introduction is higher than previously thought.
>
>
> Not sure there's a problem here. For example, if it's important to constrain
> type parameters to be identity classes, but there's already/also a need for an
> interface bound, nothing wrong with saying:
>
> <T extends SomeInterface & IdentityObject>
>
> If the *use site* may have lost track of the types it would need to satisfy the
> bound, then, sure, better to relax the bound. So, not useful in some use cases,
> useful in others.
>
> Other purposes for these interfaces:
>
> 3/ dynamic tagging: support a runtime test to distinguish between value objects
> and identity objects
aClass.isIdentityClass()/isValueClass()/isPrimitiveClass() do the same.
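To make the comparison concrete, here is a hedged sketch: neither the real IdentityObject interface nor Class.isIdentityClass() exists in released Java, so the sketch uses a stand-in marker interface of the same name and simulates the reflective test with isAssignableFrom.

```java
// Sketch only: this IdentityObject is a local stand-in, not the Valhalla one.
interface IdentityObject {}

class Point implements IdentityObject {}

public class DynamicTagDemo {
    // Style 1: interface-based dynamic test, the form the spec draft proposes.
    static boolean isIdentityViaInterface(Object o) {
        return o instanceof IdentityObject;
    }

    // Style 2: a test on the reified class itself, the shape of the
    // aClass.isIdentityClass() alternative (simulated here via reflection).
    static boolean isIdentityViaClass(Class<?> c) {
        return IdentityObject.class.isAssignableFrom(c);
    }

    public static void main(String[] args) {
        var p = new Point();
        System.out.println(isIdentityViaInterface(p));        // true
        System.out.println(isIdentityViaClass(p.getClass())); // true
    }
}
```

Both styles answer the same question at runtime; the difference is only whether the answer is encoded as a supertype or as class metadata.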
>
> 4/ subclass restriction: allow authors of abstract classes and interfaces to
> restrict implementations to only value classes or identity classes
>
> 5/ identity-dependent abstract classes: implicitly identifying, as part of an
> abstract class's API, that the class requires/assumes identity subclasses
Both of these examples are sub-cases of using type restrictions, so they really fall under 2/.
>
>> 1/ documentation
>>
>> - Those interface split the possible types in two groups
>> but the spec split the types in 3, B1/B2/B3, thus they are not aligned anymore
>> with the new design.
>
> The identity class vs. value class distinction affects runtime semantics of
> class instances. Whether the class is declared as a value class or a primitive
> class, the instances get ValueObject semantics.
>
> The primitive type vs. reference type distinction is a property of *variables*
> (and other type uses); the runtime semantics of class instances don't change
> between the two. Being a primitive class is statically interesting, because it
> means you can use it as a primitive type, but at runtime it's not really
> important.
>
> In other words: I don't see a use case for distinguishing between primitive and
> value classes with different interfaces.
Primitive classes do not allow nulls and are tearable; following your logic, there should be a subclass of ValueObject named PrimitiveObject that reflects those semantics.
This is especially useful when you have an array of PrimitiveObject: you know that storing null in an array of PrimitiveObject will always throw a NullPointerException at runtime, and that you may have to use either volatile semantics or a lock when you read/write values from/to the array.
For example:

  public void m(PrimitiveObject[] array, int index) {
    array[index] = null; // can be a compile-time error
  }

  public void swap(PrimitiveObject[] array, int i, int j) { // non-tearable swap
    synchronized(lock) {
      var tmp = array[i];
      array[i] = array[j];
      array[j] = tmp;
    }
  }

  public class NullableArray<T extends PrimitiveObject> { // specialized generics
    private final boolean[] nullables; // use a side array to represent nulls
    private final T[] array;
    ...
  }
Given that the B1/B2/B3 model has separated the runtime behaviors of value classes into two sub-categories, I don't understand why the semantics of ==, synchronized() and weak references for B2 are more important than the nullability and tearability of B3.
>
>> - they will be noise in the future, for Valhalla, the separation between
>> identity object and value object
>> may be important but from a POV of someone learning/discovering the language
>> it's a corner case
>> (like primitives are). This is even more true with the introduction of B2, you
>> can use B2 for a long time without knowing what a value type is. So having such
>> interfaces front and center is too much.
>
> I think it's notable that you get two different equality semantics—something you
> really ought to be aware of when working with a class. But it's a subjective
> call about how prominent that information should be.
Most people don't call == on Object, nor use synchronized on an object they do not control.
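The point that == on arbitrary objects is already a trap can be shown with today's Java, no Valhalla needed. A small runnable example (all names local to the example):

```java
public class EqualityDemo {
    public static void main(String[] args) {
        // Two distinct String instances with the same content:
        String a = new String("hat");
        String b = new String("hat");
        System.out.println(a == b);      // false: identity comparison
        System.out.println(a.equals(b)); // true: content comparison

        // Integer.valueOf caches -128..127, a guarantee of the Java SE spec,
        // so even today == on boxed integers depends on an implementation cache:
        System.out.println(Integer.valueOf(127) == Integer.valueOf(127)); // true
    }
}
```

So the "two different equality semantics" problem is largely one that careful code already avoids by using equals().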
>
>> 2/ as type
>>
>> - Being a value class or a primitive class is a runtime behavior not a compile
>> time behavior,
>> so representing them with special types (a type is a compile time construct)
>> will always be an approximation.
>> As a consequence, you can have an identity class (resp value class) typed with a
>> type which is not a subtype
>> of IdentityObject (resp ValueObject).
>>
>> This discrepancy is hard to grasp for beginners (in fact not only for beginners)
>> and make IdentityObject/ValueObject
>> useless because if a method takes an IdentityObject as parameter but the type is
>> an interface, users will have
>> to cast it into an IdentityObject which defeat the purpose of having such
>> interface.
>>
>> (This is the reason why ObjectOutputStream.writeObject() takes an Object as
>> parameter and not Serializable)
>
> It is sometimes possible/useful to statically identify this runtime property. At
> other times, it's not. That's not an argument for never representing the
> property with static types.
>
> And even if you never pay attention to the static type, it turns out that the
> dynamic tagging and inheritance capabilities of interfaces are useful features
> for explaining/validating runtime behavior.
Inheritance capabilities of interfaces do not work for Object, and as a side effect they create impossible types (I had forgotten that one).
An impossible type is a type that can be declared but that no class will ever match.
Example of impossible types, at declaration site:

  interface I extends ValueObject {}
  interface J extends IdentityObject {}

  <T extends I & J> void foo() { }
>
>> And the cost of introduction is high
>>
>> - they are not source backward compatible
>> class A {}
>> class B {}
>> var list = List.of(new A(), new B());
>> List<Object> list2 = list;
>
> How about List<?>?
>
> Yes, it's possible to disrupt inference, and touching *every class in existence*
> has the potential to be really disruptive. But we should validate that with
> some real-world code.
All we need is for inference to compute the lowest upper bound of two or more types, with no explicit type given for the result.
So at least all code that uses a method taking a T... (Arrays.asList(), List.of(), Stream.of()) and either stores the result in a var or acts as a monad (think stream.map()/flatMap(), etc.) will produce a different type.
>
>> - they are not binary backward compatible
>> new Object().getClass() != Object.class
>
> This has nothing to do with the interfaces. This is based on the more general
> property that every class instance must belong to a class that is either an
> identity class or a value class. Interfaces are just the way we've chosen to
> encode that property. Abandoning the property entirely would be a bigger deal.
If we do not use interfaces, the runtime class of java.lang.Object can stay Object; being an identity class or not is just a bit in the reified class, not a compile-time property, so there is no contamination by inheritance.
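A hypothetical sketch of that alternative encoding, with the kind bit modeled as a side table since no such VM flag is exposed today (ClassKind, KIND_TABLE and kindOf are all invented names for illustration):

```java
import java.util.Map;

// Hypothetical: the identity/value/primitive distinction is data carried by the
// reified class, not a supertype that leaks into the type hierarchy.
enum ClassKind { IDENTITY, VALUE, PRIMITIVE }

public class KindBitDemo {
    // In a real VM this would be a flag in the class metadata; here, a side table.
    static final Map<Class<?>, ClassKind> KIND_TABLE = Map.of(
        Object.class, ClassKind.IDENTITY,
        String.class, ClassKind.IDENTITY
    );

    static ClassKind kindOf(Class<?> c) {
        return KIND_TABLE.getOrDefault(c, ClassKind.IDENTITY);
    }

    public static void main(String[] args) {
        // No interface injection, no new supertype: the existing invariant
        // new Object().getClass() == Object.class keeps holding.
        System.out.println(new Object().getClass() == Object.class); // true
        System.out.println(kindOf(Object.class)); // IDENTITY
    }
}
```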
>
>> - at least IdentityObject needs to be injected at runtime, which by side effect
>> requires patching several
>> VM components: reflection/j.l.invoke, JVMTI, etc making the VM spec more complex
>> that it should be.
>
> Yes, there are some nice things about re-using an existing class feature rather
> than inventing a new one (we could have a new "class mode" or something); but
> it's a disadvantage that we then have to disrupt existing properties/behaviors.
>
> Summarizing, yes, there are some areas of concern or caution. But this remains
> the best way we've identified so far to achieve a lot of different goals.
I fully disagree: there is no goal among documentation, type restriction, and dynamic checking where using the two interfaces ValueObject and IdentityObject fully fulfills its duty, *and* it is wildly backward incompatible, thus requiring heroic tweaks (dynamic interface injection, bytecode rewriting of new Object, reflection type filtering, type hiding during inference, etc.).
For me, it's like opening the door of your house to an elephant because it has a nice hat, then saying you will fix things with scotch tape each time it touches something.
Rémi