The interfaces IdentityObject and ValueObject must die !

Wed Jan 26 05:20:07 UTC 2022

> On Jan 25, 2022, at 2:39 PM, Remi Forax <forax at univ-mlv.fr> wrote:
> 
> I think we should revisit the idea of having the interfaces IdentityObject/ValueObject.
> 
> They serve two purposes
> 1/ documentation: explain the difference between an identity class and a value class

And, in particular, describe the differences in runtime behavior.

> 2/ type restriction: can be used as type or bound of type parameter for algorithms that only work with identity classes
> 
> Sadly, our design has evolved but those interfaces have not. They do not work well as type restrictions, because
> the type is lost once an interface/j.l.Object is used, and the cost of their introduction is higher than previously thought.

Not sure there's a problem here. For example, if it's important to constrain type parameters to be identity classes, but there's already/also a need for an interface bound, nothing wrong with saying:

<T extends SomeInterface & IdentityObject>

If the *use site* may have lost track of the types it would need to satisfy the bound, then, sure, better to relax the bound. So, not useful in some use cases, useful in others.
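
A minimal sketch of that kind of bound, for illustration only. IdentityObject here is a locally declared stand-in for the proposed interface (it is not in any released JDK), and Shape, Circle, and areaCache are hypothetical names standing in for SomeInterface and its users:

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class BoundSketch {
    // Hypothetical stand-in for the proposed marker interface.
    interface IdentityObject {}

    // Plays the role of SomeInterface.
    interface Shape { double area(); }

    static final class Circle implements Shape, IdentityObject {
        final double r;
        Circle(double r) { this.r = r; }
        public double area() { return Math.PI * r * r; }
    }

    // The intersection bound: T must be a Shape and an identity class.
    // An identity-keyed cache only makes sense for such types.
    static <T extends Shape & IdentityObject> Map<T, Double> areaCache(T first, T second) {
        Map<T, Double> cache = new IdentityHashMap<>();
        cache.put(first, first.area());
        cache.put(second, second.area());
        return cache;
    }

    public static void main(String[] args) {
        Circle a = new Circle(1.0);
        Circle b = new Circle(1.0);
        // Two distinct instances, so two entries in the identity map.
        System.out.println(areaCache(a, b).size());

        // A use site that has lost track of the type cannot satisfy the bound:
        Shape erased = a;
        // areaCache(erased, erased); // does not compile: Shape is not an IdentityObject
    }
}
```

The commented-out call is the "use site lost track of the types" case: there, the bound is an obstacle and relaxing it is the better choice.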

Other purposes for these interfaces:

3/ dynamic tagging: support a runtime test to distinguish between value objects and identity objects

4/ subclass restriction: allow authors of abstract classes and interfaces to restrict implementations to only value classes or identity classes

5/ identity-dependent abstract classes: implicitly identifying, as part of an abstract class's API, that the class requires/assumes identity subclasses
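
Purposes 3 and 4 can be sketched with ordinary interfaces. This is an illustration under assumptions: IdentityObject and ValueObject are declared locally as stand-ins, Point is an ordinary class that merely tags itself (real value classes need Valhalla builds), and Lockable, Counter, and describe are hypothetical names:

```java
public class TaggingSketch {
    // Hypothetical stand-ins for the proposed interfaces.
    interface IdentityObject {}
    interface ValueObject {}

    // Purpose 4: an interface that restricts implementations to identity classes,
    // here because it synchronizes on the receiver.
    interface Lockable extends IdentityObject {
        default void withLock(Runnable action) {
            synchronized (this) { action.run(); }
        }
    }

    static final class Counter implements Lockable {
        int n;
    }

    // Stand-in for a value class: an ordinary final class that tags itself.
    static final class Point implements ValueObject {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Purpose 3: a runtime test that distinguishes the two kinds of object.
    static String describe(Object o) {
        if (o instanceof IdentityObject) return "identity";
        if (o instanceof ValueObject) return "value";
        // Only reachable in this sketch, where the stand-ins are not injected
        // into every class.
        return "untagged";
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        c.withLock(() -> c.n++);
        System.out.println(describe(c));
        System.out.println(describe(new Point(1, 2)));
    }
}
```

Under the proposal, every class instance would fall into one of the first two branches; the third branch exists here only because these stand-ins are opt-in.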

> 1/ documentation
> 
> - Those interfaces split the possible types in two groups
>  but the spec splits the types in 3, B1/B2/B3, thus they are not aligned anymore with the new design.

The identity class vs. value class distinction affects runtime semantics of class instances. Whether the class is declared as a value class or a primitive class, the instances get ValueObject semantics.

The primitive type vs. reference type distinction is a property of *variables* (and other type uses); the runtime semantics of class instances don't change between the two. Being a primitive class is statically interesting, because it means you can use it as a primitive type, but at runtime it's not really important.

In other words: I don't see a use case for distinguishing between primitive and value classes with different interfaces.

> - they will be noise in the future, for Valhalla, the separation between identity object and value object
>  may be important but from a POV of someone learning/discovering the language it's a corner case
>  (like primitives are). This is even more true with the introduction of B2, you can use B2 for a long time without knowing what a value type is. So having such interfaces front and center is too much.

I think it's notable that you get two different equality semantics, which is something you really ought to be aware of when working with a class. But it's a subjective call about how prominent that information should be.

> 2/ as type
> 
> - Being a value class or a primitive class is a runtime behavior not a compile time behavior,
>  so representing them with special types (a type is a compile time construct) will always be an approximation.
>  As a consequence, you can have an identity class (resp value class) typed with a type which is not a subtype
>  of IdentityObject (resp ValueObject).
> 
>  This discrepancy is hard to grasp for beginners (in fact not only for beginners) and makes IdentityObject/ValueObject
>  useless, because if a method takes an IdentityObject as parameter but the type is an interface, users will have
>  to cast it to IdentityObject, which defeats the purpose of having such an interface.
> 
>  (This is the reason why ObjectOutputStream.writeObject() takes an Object as parameter and not Serializable)

It is sometimes possible/useful to statically identify this runtime property. At other times, it's not. That's not an argument for never representing the property with static types.

And even if you never pay attention to the static type, it turns out that the dynamic tagging and inheritance capabilities of interfaces are useful features for explaining/validating runtime behavior.

> And the cost of introduction is high
> 
> - they are not source backward compatible
>  class A {}
>  class B {}
>  var list = List.of(new A(), new B());
>  List<Object> list2 = list;

How about List<?>?

Yes, it's possible to disrupt inference, and touching *every class in existence* has the potential to be really disruptive. But we should validate that with some real-world code.
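
To make the inference concern concrete, here is a sketch that simulates the effect today by having A and B explicitly implement a locally declared stand-in IdentityObject (under the proposal, the interface would be added implicitly). The class and method names are hypothetical:

```java
import java.util.List;

public class InferenceSketch {
    // Hypothetical stand-in; under the proposal A and B would get this implicitly.
    interface IdentityObject {}

    static class A implements IdentityObject {}
    static class B implements IdentityObject {}

    static int demo() {
        // Without the shared interface, the element type is inferred as Object.
        // With it, the inferred element type is narrower than Object.
        var list = List.of(new A(), new B());

        // List<Object> list2 = list; // no longer compiles: generics are invariant

        // A wildcard type accepts either inference result.
        List<?> list2 = list;
        List<? extends Object> list3 = list;
        return list2.size() + list3.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Code that already uses a wildcard is unaffected; only code that pins the inferred type to exactly List<Object> breaks.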

> - they are not binary backward compatible
>  new Object().getClass() != Object.class

This has nothing to do with the interfaces. This is based on the more general property that every class instance must belong to a class that is either an identity class or a value class. Interfaces are just the way we've chosen to encode that property. Abandoning the property entirely would be a bigger deal.

> - at least IdentityObject needs to be injected at runtime, which as a side effect requires patching several
>  VM components: reflection/j.l.invoke, JVMTI, etc., making the VM spec more complex than it should be.

Yes, there are some nice things about re-using an existing class feature rather than inventing a new one (we could have a new "class mode" or something); but it's a disadvantage that we then have to disrupt existing properties/behaviors.

Summarizing, yes, there are some areas of concern or caution. But this remains the best way we've identified so far to achieve a lot of different goals.