Alternative to IdentityObject & ValueObject interfaces
Dan Smith
daniel.smith at oracle.com
Thu Mar 24 02:51:15 UTC 2022
On Mar 22, 2022, at 5:56 PM, Dan Smith <daniel.smith at oracle.com> wrote:
- Variable types: I don't see a good way to get the equivalent of an 'IdentityObject' type. It would involve tracking the 'identity' property through the whole type system, which seems like a huge burden for the occasional "I'm not sure you can lock on that" error message. So we'd probably need to be okay letting that go. Fortunately, I'm not sure it's a great loss—lots of code today seems happy using 'Object' when it means, informally, "object that I've created for the sole purpose of locking".
- Type variable bounds: this one seems more achievable, by using the 'value' and 'identity' keywords to indicate a new kind of bounds check ('<identity T extends Runnable>'). Again, it's added complexity, but it's more localized. We should think more about the use cases, and decide if it passes the cost/benefit analysis. If not, nothing else depends on this, so it could be dropped. (Or left to a future, more general feature?)
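To ground both quoted points, a sketch. The first half is ordinary Java that compiles today; the second half uses the hypothetical 'identity' bound syntax from the proposal quoted above (not valid Java today), and the class names in both are invented for illustration.

    // Today's idiom: an Object allocated purely to be locked on. Its static type
    // says nothing about identity, and in practice nobody seems to miss that.
    class Counter {
        private final Object lock = new Object();
        private int count;

        void increment() {
            synchronized (lock) {   // safe: we constructed the lock ourselves
                count++;
            }
        }
    }

    // Hypothetical bound syntax (not valid Java today): 'identity' would constrain
    // T to identity classes, so instances of T can safely be synchronized on.
    class TaskRunner<identity T extends Runnable> {
        void runLocked(T task) {
            synchronized (task) {   // OK under the bound: T is an identity type
                task.run();
            }
        }
    }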
Per today's discussion, this part seems to be the central question: how much value can we expect to get out of compile-time checking?
Stepping back from the type system details (that is, the below discussion applies whether we're using interfaces, modifiers on types, or something else), it's worth asking what errors we hope these features will help detect. We identified a couple of them today (and earlier in this thread):
- 'synchronized' on a value object
- storing a value object in a weak reference (in a world in which weak references don't support value objects)
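To make those two cases concrete, here's a sketch using the proposed 'value class' syntax (a Valhalla preview feature, not standard Java today; the class and method names are invented):

    // Proposed Valhalla syntax, not standard Java today.
    value class Point {
        private final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    class Client {
        void useIt(Point p) {
            synchronized (p) { }    // (1) can never succeed: a value object has no monitor
            var r = new java.lang.ref.WeakReference<>(p);
                                    // (2) disallowed, in a world where weak references
                                    //     don't support value objects
        }
    }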
Two questions:
1) What are the requirements for the analysis? How effective can it be?
The type system is going to have three kinds of types:
- types that guarantee identity objects
- types that guarantee value objects
- types that include both kinds of objects
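Concretely, assuming the proposed value-class syntax (Point is the value class from the earlier sketch; Box is invented for contrast):

    class Box { }        // an ordinary (identity) class, for contrast

    // Given Point (the value class sketched above) and Box:
    Point p;             // value class type: guarantees value objects
    Box b;               // identity class type: guarantees identity objects
    Object o;            // includes both kinds of objects
    Runnable task;       // interface types also include both kinds, since value and
                         // identity classes can implement the same interfaces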
That third kind is a problem: for those types, any check we specify will have either false positives (the programmer knows the operation is safe, but the compiler complains anyway) or false negatives (the operation isn't safe, but the compiler lets it go).
For example, for the 'synchronized' operation, it's straightforward for the compiler to complain on a value class type. But what do we do with 'synchronized' on some interface type? Say we go the false positive route; the check probably looks like a warning ("you might be synchronizing on a value object"). In this case:
- We've just created a bunch of warnings in existing code that people will probably just @SuppressWarnings rather than try to address them through the types, because changing the types (throughout the flow of data) is a lot of work and comes with compatibility risks.
- Even in totally new code, if I'm not working with a specific identity class, I'm not sure I would bother fiddling with the types to get better checking. It seems really tedious. (For example, changing an interface-typed parameter 'Foo' to intersection type 'Foo & IdentityObject'.)
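As a sketch of what that type fiddling looks like under the interface approach (assuming an IdentityObject marker interface; the methods are invented, and Java has to encode the intersection through a type variable):

    // Today: the compiler can't tell whether 'task' is an identity object, so the
    // false-positive route would put a "might be synchronizing on a value object"
    // warning here.
    void lockAndRun(Runnable task) {
        synchronized (task) { task.run(); }
    }

    // The "fix": thread an intersection bound through the signature. Tedious, and
    // every caller's types have to cooperate for the extra checking to pay off.
    <T extends Runnable & IdentityObject> void lockAndRunChecked(T task) {
        synchronized (task) { task.run(); }
    }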
If we prefer to allow false negatives, then it's straightforward: value class types get errors, other types do not. There's no need for extra type system features. (E.g., 'IdentityObject' and 'Object' get treated exactly the same by 'synchronized'.)
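A sketch of that route, again using Point (the value class from the earlier sketch):

    void demo(Point p, Object o, Runnable task) {
        synchronized (p) { }       // error: the static type is a value class
        synchronized (o) { }       // no complaint: Object includes both kinds
        synchronized (task) { }    // no complaint either, even if 'task' happens to
                                   // be a value object at run time (a false negative)
    }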
For weak references, it definitely doesn't make sense to reject types like WeakReference<Object>—that would be a compatibility mess. We could warn, but again, lots of false positive risk; and warnings of that sort don't generalize to arbitrary uses of generics. I think again the best we could hope to do is to reject value class types. But something like 'T extends IdentityObject' doesn't accomplish this, because it excludes the "both kinds" types. Instead, we'd need something like 'T !extends ValueObject'.
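A sketch of that hypothetical bound; the '!extends' syntax is not real Java, and the declaration below just illustrates how java.lang.ref.WeakReference might be constrained:

    // Hypothetical: 'T !extends ValueObject' would exclude value class types while
    // still allowing the "both kinds" types.
    public class WeakReference<T !extends ValueObject> extends Reference<T> {
        public WeakReference(T referent) { super(referent); }
    }

    // Under such a bound:
    //   new WeakReference<Object>(x)      still legal (x might be a value object at run time)
    //   new WeakReference<Runnable>(task) still legal
    //   new WeakReference<Point>(p)       rejected: Point is a value class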
2) Are these the best use cases we have? And are they really all that important?
These are the ones we've focused on, but maybe we can think of other applications. Other use cases would similarly have to involve the differences in runtime semantics.
Our two use cases share the property that they detect a runtime error (either an expression that we know will always throw or, with more aggressive checking, an expression that *could* throw). That's helpful, but I do wonder how common such errors will be. We could do a bunch of type system work to detect division by zero, but nobody's asking for that because programmers already tend to avoid making that mistake on their own.
Synchronization: best practice is already to "own" the object being locked on, and that kind of knowledge isn't tracked by the type system. It doesn't seem that different for programmers to also be aware of whether their lock objects are identity objects, without type system help.
Weak references: a WeakReference<ValueClass> seems like an unlikely scenario—why are you trying to manage GC for a value object? (Assuming we've provided an alternative API to manage references *within* value objects, do caching, etc.) So most runtime errors will fall into the WeakReference<Object> or WeakReference<Interface> category, and again there's a trade-off here between detecting real errors and reporting a bunch of false positives.