Alternative to IdentityObject & ValueObject interfaces

Wed Mar 23 12:01:20 UTC 2022

Thanks Dan for putting the work in to provide a credible alternative.

Let me add some background for how we came up with these things.  At 
some point we asked ourselves, what if we had identity and value classes 
from day 1?  How would that affect the object model?  And we concluded 
at the time that we probably wouldn't want the identity-indeterminacy of 
Object, but instead would want something like

     abstract class Object
     class IdentityObject extends Object { }
     abstract class ValueObject extends Object { }

So the {Identity,Value}Object interfaces seemed valuable pedagogically, 
in that they make the object hierarchy reflect the language division.  
At the time, we imagined there might be methods that apply to all value 
objects, that could live in ValueObject.

A separate factor is that we were taking operations that were previously 
total (locking, weak refs) and making them partial. This is scary!  So 
we wanted a way to make these expressible in the static type system.
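As a rough illustration of what "total" means here: in Java today, `synchronized` and `WeakReference` accept any non-null reference typed as plain `Object`, and nothing in the signature says the argument must have identity. The class and method names below are made up for this sketch.

```java
import java.lang.ref.WeakReference;

// Illustrative only: class and method names are hypothetical.
final class TotalOperationsSketch {

    // Accepts any non-null reference. The static type Object carries no
    // information about whether the argument has identity.
    static boolean lockAndWeaklyReference(Object o) {
        synchronized (o) {                                   // locking
            WeakReference<Object> ref = new WeakReference<>(o); // weak ref
            // o is still strongly reachable here, so the referent is intact.
            return ref.get() == o;
        }
    }

    public static void main(String[] args) {
        System.out.println(lockAndWeaklyReference(new Object()));
        System.out.println(lockAndWeaklyReference("a string"));
        System.out.println(lockAndWeaklyReference(new int[0]));
    }
}
```

Once value objects exist, a method shaped like this one compiles unchanged but can no longer promise to work for every argument, which is the partiality being described.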

Unfortunately, the interfaces do not really deliver on either goal, 
because we can't turn back time.  We still have to deal with `new 
Object()`, so we can't (yet) make Object abstract. Many signatures will 
not be changeable from "Object" to "IdentityObject" for reasons of 
compatibility, unless we make IdentityObject erase to Object (which has 
its own problems).  If people use it at all for type bounds, we'll see 
lots of uses of `Foo<? extends Bar&IdentityObject>`, which will put more 
pressure on our weak support for intersection types.  And dynamic errors 
will still happen, because too much of the world was built using 
signatures that don't express identity-ness. (Kevin will see a parallel 
to introducing nullness annotations; it might be fine if you build the 
world that way from scratch, but the transition is painful when you have 
to interpret an unadorned type as "of unspecified identity-ness.")
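To make the type-bound concern concrete, here is a minimal sketch using a stand-in marker interface declared locally (the `IdentityObject` below is hypothetical, not a JDK type, and the other names are invented for illustration). Java today allows an intersection bound on a declared type variable but not on a wildcard, so each method that wants the bound has to introduce its own `T`. A legacy-style signature falls back to a dynamic check.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the proposed marker interface; not a JDK type.
interface IdentityObject { }

// A task whose author opted in to the marker.
final class CountingTask implements Runnable, IdentityObject {
    int runs = 0;
    @Override public void run() { runs++; }
}

final class IntersectionBoundSketch {

    // An intersection bound is legal on a declared type variable...
    static <T extends Runnable & IdentityObject> void runLocked(T task) {
        synchronized (task) { task.run(); }
    }

    // ...but not on a wildcard, so a list-taking method needs its own T.
    static <T extends Runnable & IdentityObject> int runAllLocked(List<T> tasks) {
        for (T task : tasks) runLocked(task);
        return tasks.size();
    }

    // Legacy-style signature: identity-ness is unspecified statically,
    // so the only option is to check at run time.
    static boolean tryRunLocked(Runnable task) {
        if (!(task instanceof IdentityObject)) return false;
        synchronized (task) { task.run(); }
        return true;
    }

    public static void main(String[] args) {
        CountingTask t = new CountingTask();
        runLocked(t);
        runAllLocked(Arrays.asList(t, t));
        Runnable unadorned = () -> { };  // does not implement the marker
        System.out.println(tryRunLocked(t) + " " + tryRunLocked(unadorned));
    }
}
```

The `tryRunLocked` path is where the dynamic failures live: any `Runnable` that was never declared with the marker is rejected, however it would actually behave.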

Several years on, we're still leaning on the same few motivating 
examples -- capturing things like "I might lock this" in the type 
system.  That we haven't come up with more killer examples is notable.  
And I grow increasingly skeptical of the value of the locking example, 
both because this is not how concurrent code is written, and because we 
*still* have to deal with the risk of dynamic errors because most of the 
world's code has not been (and will not be) written to use 
IdentityObject throughout.
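For comparison, a sketch of two common locking idioms, on my reading of "not how concurrent code is written": the lock is usually a private object the class creates itself (`new Object()`) or an explicit `java.util.concurrent` lock, not an arbitrary object received as a parameter. In neither case would an `IdentityObject` type in a signature come into play. The class and method names are made up for this sketch.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: class and method names are hypothetical.
final class LockingIdiomsSketch {
    // Private monitor created with `new Object()`; never handed to callers.
    private final Object monitor = new Object();
    // Explicit lock from java.util.concurrent.
    private final ReentrantLock lock = new ReentrantLock();
    private int monitorCount;
    private int lockCount;

    void incrementUnderMonitor() {
        synchronized (monitor) { monitorCount++; }
    }

    void incrementUnderLock() {
        lock.lock();
        try { lockCount++; } finally { lock.unlock(); }
    }

    int monitorCount() { synchronized (monitor) { return monitorCount; } }

    int lockCount() {
        lock.lock();
        try { return lockCount; } finally { lock.unlock(); }
    }

    // Increments both counters from several threads, then returns the instance.
    static LockingIdiomsSketch runDemo(int threads, int perThread) {
        LockingIdiomsSketch s = new LockingIdiomsSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    s.incrementUnderMonitor();
                    s.incrementUnderLock();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s;
    }

    public static void main(String[] args) {
        LockingIdiomsSketch s = runDemo(4, 1000);
        System.out.println(s.monitorCount() + " " + s.lockCount());
    }
}
```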

As Dan points out, the main thing we give up by backing off from these 
interfaces is the static typing; we don't get to use `IdentityObject` as 
a parameter type, return type, or type bound.  And the only reason we've 
come up with so far to want that is a pretty lame one -- locking.

From a language design perspective, I find it pretty leaky that you 
declare a class with `value class`, but express the subclassing 
constraint with `extends IdentityObject`.

On 3/22/2022 7:56 PM, Dan Smith wrote:
> In response to some encouragement from Remi, John, and others, I've decided to take a closer look at how we might approach the categorization of value and identity classes without relying on the IdentityObject and ValueObject interfaces.
>
> (For background, see the thread "The interfaces IdentityObject and ValueObject must die" in January.)
>
> These interfaces have found a number of different uses (enumerated below), while mostly leaning on the existing functionality of interfaces, so there's a pretty good complexity vs. benefit trade-off. But their use has some rough edges, and inserting them everywhere has a nontrivial compatibility impact. Can we do better?
>
> Language proposal:
>
> - A "value class" is any class whose instances are all value objects. An "identity class" is any class whose instances are all identity objects. Abstract classes can be value classes or identity classes, or neither. Interfaces can be "value interfaces" or "identity interfaces", or neither.
>
> - A class/interface can be designated a value class with the 'value' modifier.
>
> value class Foo {}
> abstract value class Bar {}
> value interface Baz {}
> value record Rec(int x) {}
>
> A class/interface can be designated an identity class with the 'identity' modifier.
>
> identity class Foo {}
> abstract identity class Bar {}
> identity interface Baz {}
> identity record Rec(int x) {}
>
> - Concrete classes with neither modifier are implicitly 'identity'; abstract classes with neither modifier, but with certain identity-dependent features (instance fields, initializers, synchronized methods, ...) are implicitly 'identity' (possibly with a warning). Other abstract classes and interfaces are fine being neither (thus supporting both kinds of subclasses).
>
> - The properties are inherited: if you extend a value class/interface, you are a value class/interface. (Same for identity classes/interfaces.) It's an error to be both.
>
> - The usual restrictions apply to value classes, both concrete and abstract; and also to "neither" abstract classes, if they haven't been implicitly made 'identity'.
>
> - An API ('Object.isValueObject()'?) allows for dynamically distinguishing between value objects and identity objects. The reflection API (in java.lang.Class) allows for detection of value classes/interfaces, identity classes/interfaces, and "neither" classes/interfaces.
>
> - TBD whether/how we track these properties statically so that the type system can catch mismatches between non-identity class types and uses that assume identity.
>
> JVM proposal:
>
> - Same conceptual framework.
>
> - Classes can be ACC_VALUE, ACC_IDENTITY, or neither.
>
> - Legacy-version classes are implicitly ACC_IDENTITY. Legacy interfaces are not. Optionally, modern-version concrete classes are also implicitly ACC_IDENTITY.
>
> (Trying out this alternative approach to abstract classes: there's no more ACC_PERMITS_VALUE; instead, legacy-version abstract classes are automatically ACC_IDENTITY, and modern-version abstract classes permit value subclasses unless they opt out with ACC_IDENTITY. It's the bytecode generator's responsibility to set these flags appropriately. Conceptually cleaner, maybe too risky...)
>
> - At class load time, we inherit value/identity-ness and check for conflicts. It's okay to have neither flag set but inherit the property from one of your supers. We also enforce constraints on value classes and "neither" abstract classes.
>
> ---
>
> So how does this score as a replacement for the list of features enabled by the interfaces?
>
> - Dynamic detection: 'obj instanceof ValueObject' is quite straightforward; if we can replace that with 'obj.isValueObject()', that feels about equally useful. (I'd be more pessimistic about something like 'Objects.isValueObject(obj)'.)
>
> - Subclass restriction: 'implements IdentityObject' has been replaced with the 'identity' modifier. Complexity cost of special modifiers seems on par with the complexity of special rules for inferring and checking the superinterfaces. I think it's a win that we use the 'value' modifier and "value" terminology for all kinds of classes/interfaces, not just concrete classes.
>
> - Variable types: I don't see a good way to get the equivalent of an 'IdentityObject' type. It would involve tracking the 'identity' property through the whole type system, which seems like a huge burden for the occasional "I'm not sure you can lock on that" error message. So we'd probably need to be okay letting that go. Fortunately, I'm not sure it's a great loss—lots of code today seems happy using 'Object' when it means, informally, "object that I've created for the sole purpose of locking".
>
> - Type variable bounds: this one seems more achievable, by using the 'value' and 'identity' keywords to indicate a new kind of bounds check ('<identity T extends Runnable>'). Again, it's added complexity, but it's more localized. We should think more about the use cases, and decide if it passes the cost/benefit analysis. If not, nothing else depends on this, so it could be dropped. (Or left to a future, more general feature?)
>
> - Documentation: we've lost the handy javadoc location to put some explanations about identity & value objects in a place that curious programmers can easily stumble on. Anything we want to say needs to go in JLS/JVMS (or perhaps the java.lang.Object javadoc).
>
> - Compatibility: pretty clear win here. No interface injection means tools that depend on reflection results won't be broken. (We've found a significant number of these problems in our own code/tests, FWIW.) No new static types means inference results won't change. There's less risk of incompatibilities when adding/removing the 'identity' and 'value' keywords (although there can still be source, binary, and behavioral incompatibilities).
>