From kevinb at google.com Tue Mar 1 13:56:31 2022 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 1 Mar 2022 05:56:31 -0800 Subject: Evolving instance creation In-Reply-To: <1DB20A74-B22C-4EB0-AB70-0949A51C365E@oracle.com> References: <1DB20A74-B22C-4EB0-AB70-0949A51C365E@oracle.com> Message-ID: Seems like this decision is trending in the direction I'd prefer already, but here's some argumentation that *might* be helpful from the programming-model perspective. On Tue, Feb 22, 2022 at 1:17 PM Dan Smith wrote: One of the longstanding properties of class instance creation expressions > ('new Foo()') is that the instance being produced is unique -- that is, not > '==' to any previously-created instance. > I'll argue that this is an incidental association only. Note that `new` has simply never been *needed* for identityless types before; for those (all 8 of them), literals and binary expressions and things had us covered. So `new` has so far seemed associated with identityful types. But I think the expectation quoted here clearly comes from the identity-type-ness, not from the `new` keyword. If we use `new` with identityless objects or values, a distinct-identity expectation simply doesn't apply. Plus as Remi says, changing to that type changes `==` into a different operator with the same name. So I think that this: new Point(1, 2) == new Point(1, 2) // always true > .... is entirely *un*problematic! Dan H. says, "`new` carries the mental model of allocating space" -- again, I think it's incidental. Because the *point of introducing* identityless types is that the distinction between creating and reusing (summoning from the ether somehow) vanishes. We shouldn't be able to distinguish those cases. As Dan S. says later, I'd rather have programmers think in these terms: when you instantiate a > value class, you might get an object that already exists. 
Whether there are > copies of that object at different memory locations or not is > irrelevant -- it's still *the same object*. But my reaction is: then to the programming model it *might as well just look like creation*. It can't really "look like reusing" without forcing the question of "reusing what from where?". It can only just look *different*. But I don't think it needs to. (I think this is what Dan H ends up supporting too.) The main thing I think CICEs/`new` accomplish is simply to "cross the bridge". Constructors are void and non-static; yet somehow we need to *call* them as if they're static and non-void! `new` gets us across that gap. This seems to me like a special-snowflake problem that `new` is custom built to address, and I would hope we keep it. Seen this way, it's essential that this bridge-crossing happens *somewhere*, but it doesn't necessarily mean constructors need to be spiffy public API. It could be a dirty secret we hide within our static factory methods. And we often do this, because public constructors *aren't* spiffy; they can't have names, relying purely on argument types and order to disambiguate, and they weirdly promise the caller never to return a subtype even though a caller should never even care about that (because substitutability principle). Here are three approaches that I could imagine pursuing: > > (1) Value classes are a special case for 'new Foo()' > > This is the plan of record: the unique instance invariant continues to > hold for 'new Foo()' where Foo is an identity class, but if Foo is a value > class, you might get an existing instance. > (I've argued above it's not even a special case.) Biggest concerns: for now, it can be surprising that 'new' doesn't always > give you a unique instance. This is the best kind of surprise! Because grappling with it points you directly toward understanding what identityless classes are all about. 
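[Editorial sketch, not part of the original message.] The "dirty secret" arrangement described above, where `new` appears only inside a static factory that is free to create or reuse an instance, might look roughly like this in today's Java. The class name `InternedPoint`, the factory name `of`, and the cache are all assumptions made for illustration; this is an ordinary identity class simulating the behavior, not a value class.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example class; not taken from any proposal in this thread.
final class InternedPoint {
    // Cache keyed by the packed coordinates, so equal arguments map to one instance.
    private static final Map<Long, InternedPoint> CACHE = new ConcurrentHashMap<>();

    private final int x;
    private final int y;

    // The constructor stays private: 'new' is used only inside this class.
    private InternedPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // A named static factory: it may create an instance or return an existing one.
    static InternedPoint of(int x, int y) {
        long key = ((long) x << 32) | (y & 0xFFFFFFFFL);
        return CACHE.computeIfAbsent(key, k -> new InternedPoint(x, y));
    }

    int x() { return x; }
    int y() { return y; }
}

final class InternedPointDemo {
    public static void main(String[] args) {
        InternedPoint a = InternedPoint.of(1, 2);
        InternedPoint b = InternedPoint.of(1, 2);
        System.out.println(a == b); // prints "true": the factory reused the instance
    }
}
```

Callers of `InternedPoint.of` cannot tell whether an object was created or reused, which is the indistinguishability the message argues for; the difference from a value class is that here the class author has to arrange it by hand.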
A couple more minor points about the factories idea: > A related, possibly-overlapping new Java feature idea (not concretely > proposed, but something the language might want in the future) is the > declaration of canonical factory methods in a class, which intentionally > *don't* promise unique instances (for example, they might implement > interning). These factories would be like constructors in that they > wouldn't have a unique method name, but otherwise would behave like ad hoc > static factory methods -- take some arguments, use them to create/locate an > appropriate instance, return it. > Can you clarify what these offer that static methods don't already provide? The two weaknesses I'm aware of with static factory methods are (1) subclasses still need a constructor to call and (2) often you don't really want the burden of naming them, you just want them to look like the obvious standard creation path. It sounds like this addresses (2) but not (1), and I assume also addresses some (3). > (2) 'new Foo()' as a general-purpose creation tool > > In this approach, 'new Foo()' is the use-site syntax for *both* factory > and constructor invocation. Factories and constructors live in the same > overload resolution "namespace", and all will be considered by the use site. > It sounds to me like these factories would be static, so `new` would not be required by the "cross the bridge" interpretation given above. On Thu, Feb 24, 2022 at 7:40 AM Dan Heidinga wrote: The rest of this is more of a language design question than a VM one. > The `Foo()` (without a new) is a good starting point for a canonical > factory model. It's been mentioned somewhere in all this that there *can* be a static method with that signature in the class. Evil to actually do that, yes. But the mere fact it's possible makes this syntax a bit confusing imho. -- Kevin Bourrillion | Java Librarian | Google, Inc. 
| kevinb at google.com From daniel.smith at oracle.com Tue Mar 1 20:34:02 2022 From: daniel.smith at oracle.com (Dan Smith) Date: Tue, 1 Mar 2022 20:34:02 +0000 Subject: Evolving instance creation In-Reply-To: References: <1DB20A74-B22C-4EB0-AB70-0949A51C365E@oracle.com> Message-ID: <3B05D93E-9B07-4679-B281-D4CFFF688F13@oracle.com> On Mar 1, 2022, at 6:56 AM, Kevin Bourrillion > wrote: The main thing I think CICEs/`new` accomplish is simply to "cross the bridge". Constructors are void and non-static; yet somehow we need to call them as if they're static and non-void! `new` gets us across that gap. This seems to me like a special-snowflake problem that `new` is custom built to address, and I would hope we keep it. Okay. So support for 'new Point()' (1) over just 'Point()' (3) on the basis that constructor declarations need special magic to enter the context where the constructor body lives. So as long as we're declaring value class constructors in the same way as identity class constructors, it makes sense for both to have the same invocation syntax, and for that syntax to be somehow different from a method invocation. I suppose (3) envisions this magic happening invisibly, as part of the instantiation API provided by the class -- there's some magic under the covers where a bridge/factory-like entity gets invoked and sets up the context for the constructor body. But I agree that it's probably better not to have to appeal to something invisible when people are already used to the magic being explicit. A couple more minor points about the factories idea: A related, possibly-overlapping new Java feature idea (not concretely proposed, but something the language might want in the future) is the declaration of canonical factory methods in a class, which intentionally *don't* promise unique instances (for example, they might implement interning). 
These factories would be like constructors in that they wouldn't have a unique method name, but otherwise would behave like ad hoc static factory methods -- take some arguments, use them to create/locate an appropriate instance, return it. Can you clarify what these offer that static methods don't already provide? The two weaknesses I'm aware of with static factory methods are (1) subclasses still need a constructor to call and (2) often you don't really want the burden of naming them, you just want them to look like the obvious standard creation path. It sounds like this addresses (2) but not (1), and I assume also addresses some (3). A couple of things: - If it's canonical, everybody knows where to find it. APIs like reflection and tools like serialization can create instances through a universally-recognized mechanism (but one that is more flexible than constructors). - In a similar vein, if JVMS can count on instantiation being supported by a canonical method name, then this approach can subsume existing uses of 'new/dup/', which are a major source of complexity. This is a very long game, but the idea is that eventually the old mechanism (specifically, use of the 'new' bytecode outside of the class being instantiated) could be deprecated. (2) 'new Foo()' as a general-purpose creation tool In this approach, 'new Foo()' is the use-site syntax for *both* factory and constructor invocation. Factories and constructors live in the same overload resolution "namespace", and all will be considered by the use site. It sounds to me like these factories would be static, so `new` would not be required by the "cross the bridge" interpretation given above. Right. This approach gives up the use-site/declaration-site alignment, instead interpreting 'new' as "make me one of these, using whatever mechanism the class provides". 
From daniel.smith at oracle.com Tue Mar 8 23:19:16 2022 From: daniel.smith at oracle.com (Dan Smith) Date: Tue, 8 Mar 2022 23:19:16 +0000 Subject: EG meeting *canceled*, 2022-03-09 Message-ID: <587E58E1-F451-4807-95C1-8A102BB6A694@oracle.com> Only list traffic since last meeting is a couple of followups to that discussion, so I think we can skip this time. Next meeting March 23. From daniel.smith at oracle.com Tue Mar 22 23:56:02 2022 From: daniel.smith at oracle.com (Dan Smith) Date: Tue, 22 Mar 2022 23:56:02 +0000 Subject: Alternative to IdentityObject & ValueObject interfaces Message-ID: In response to some encouragement from Remi, John, and others, I've decided to take a closer look at how we might approach the categorization of value and identity classes without relying on the IdentityObject and ValueObject interfaces. (For background, see the thread "The interfaces IdentityObject and ValueObject must die" in January.) These interfaces have found a number of different uses (enumerated below), while mostly leaning on the existing functionality of interfaces, so there's a pretty good complexity vs. benefit trade-off. But their use has some rough edges, and inserting them everywhere has a nontrivial compatibility impact. Can we do better? Language proposal: - A "value class" is any class whose instances are all value objects. An "identity class" is any class whose instances are all identity objects. Abstract classes can be value classes or identity classes, or neither. Interfaces can be "value interfaces" or "identity interfaces", or neither. - A class/interface can be designated a value class with the 'value' modifier. value class Foo {} abstract value class Bar {} value interface Baz {} value record Rec(int x) {} A class/interface can be designated an identity class with the 'identity' modifier. 
identity class Foo {} abstract identity class Bar {} identity interface Baz {} identity record Rec(int x) {} - Concrete classes with neither modifier are implicitly 'identity'; abstract classes with neither modifier, but with certain identity-dependent features (instance fields, initializers, synchronized methods, ...) are implicitly 'identity' (possibly with a warning). Other abstract classes and interfaces are fine being neither (thus supporting both kinds of subclasses). - The properties are inherited: if you extend a value class/interface, you are a value class/interface. (Same for identity classes/interfaces.) It's an error to be both. - The usual restrictions apply to value classes, both concrete and abstract; and also to "neither" abstract classes, if they haven't been implicitly made 'identity'. - An API ('Object.isValueObject()'?) allows for dynamically distinguishing between value objects and identity objects. The reflection API (in java.lang.Class) allows for detection of value classes/interfaces, identity classes/interfaces, and "neither" classes/interfaces. - TBD whether/how we track these properties statically so that the type system can catch mismatches between non-identity class types and uses that assume identity. JVM proposal: - Same conceptual framework. - Classes can be ACC_VALUE, ACC_IDENTITY, or neither. - Legacy-version classes are implicitly ACC_IDENTITY. Legacy interfaces are not. Optionally, modern-version concrete classes are also implicitly ACC_IDENTITY. (Trying out this alternative approach to abstract classes: there's no more ACC_PERMITS_VALUE; instead, legacy-version abstract classes are automatically ACC_IDENTITY, and modern-version abstract classes permit value subclasses unless they opt out with ACC_IDENTITY. It's the bytecode generator's responsibility to set these flags appropriately. Conceptually cleaner, maybe too risky...) - At class load time, we inherit value/identity-ness and check for conflicts. 
It's okay to have neither flag set but inherit the property from one of your supers. We also enforce constraints on value classes and "neither" abstract classes. --- So how does this score as a replacement for the list of features enabled by the interfaces? - Dynamic detection: 'obj instanceof ValueObject' is quite straightforward; if we can replace that with 'obj.isValueObject()', that feels about equally useful. (I'd be more pessimistic about something like 'Objects.isValueObject(obj)'.) - Subclass restriction: 'implements IdentityObject' has been replaced with the 'identity' modifier. Complexity cost of special modifiers seems on par with the complexity of special rules for inferring and checking the superinterfaces. I think it's a win that we use the 'value' modifier and "value" terminology for all kinds of classes/interfaces, not just concrete classes. - Variable types: I don't see a good way to get the equivalent of an 'IdentityObject' type. It would involve tracking the 'identity' property through the whole type system, which seems like a huge burden for the occasional "I'm not sure you can lock on that" error message. So we'd probably need to be okay letting that go. Fortunately, I'm not sure it's a great loss -- lots of code today seems happy using 'Object' when it means, informally, "object that I've created for the sole purpose of locking". - Type variable bounds: this one seems more achievable, by using the 'value' and 'identity' keywords to indicate a new kind of bounds check (''). Again, it's added complexity, but it's more localized. We should think more about the use cases, and decide if it passes the cost/benefit analysis. If not, nothing else depends on this, so it could be dropped. (Or left to a future, more general feature?) - Documentation: we've lost the handy javadoc location to put some explanations about identity & value objects in a place that curious programmers can easily stumble on. 
Anything we want to say needs to go in JLS/JVMS (or perhaps the java.lang.Object javadoc). - Compatibility: pretty clear win here. No interface injection means tools that depend on reflection results won't be broken. (We've found a significant number of these problems in our own code/tests, FWIW.) No new static types means inference results won't change. There's less risk of incompatibilities when adding/removing the 'identity' and 'value' keywords (although there can still be source, binary, and behavioral incompatibilities). From kevinb at google.com Wed Mar 23 01:44:40 2022 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 22 Mar 2022 18:44:40 -0700 Subject: Alternative to IdentityObject & ValueObject interfaces In-Reply-To: References: Message-ID: On Tue, Mar 22, 2022 at 4:56 PM Dan Smith wrote: In response to some encouragement from Remi, John, and others, I've decided > to take a closer look at how we might approach the categorization of value > and identity classes without relying on the IdentityObject and ValueObject > interfaces. > > (For background, see the thread "The interfaces IdentityObject and > ValueObject must die" in January.) > Could anyone summarize the strongest version of the argument against them? The thread is not too easy to follow. - A "value class" is any class whose instances are all value objects. An > "identity class" is any class whose instances are all identity objects. I assume you are contrasting "bucket 1" vs. "buckets 2+3" here. (My own chosen nomenclature would only alter it slightly to say that value classes *also* have instances that are pure values, no object in sight.) - Subclass restriction: 'implements IdentityObject' has been replaced with > the 'identity' modifier. Complexity cost of special modifiers seems on par > with the complexity of special rules for inferring and checking the > superinterfaces. The rules for the modifiers are okay. But here's my observation. 
The simplest way to explain those rules would be if the `value` keyword is literally shorthand for `extends/implements ValueObject`. I think the rules fall out from that, plus: - IO and VO are disjoint. (As interfaces can *already* be, like `interface Foo { int x(); }` and `interface Bar { boolean x(); }`, and if it really came down to it, you could literally put an incompatible method into each type and blame their noncohabitation on that :-)) - A class that breaks the value class rules has committed to being an identity class. - We wouldn't know how to make an *instance* that is "neither", so *instantiating* a "neither" class has to have default behavior, and that has to be to give you what it always has. In each case I've explained why the rule seems very easy to understand to me. So from my POV, this *still* pulls me back to the types anyway. I would say that your rules for the modifiers are largely *simulating* those types. I think it's a win that we use the 'value' modifier and "value" terminology > for all kinds of classes/interfaces, not just concrete classes. > I think I've probably come around to that terminology over the long course of reediting this email. - Variable types: I don't see a good way to get the equivalent of an > 'IdentityObject' type. It would involve tracking the 'identity' property > through the whole type system, which seems like a huge burden for the > occasional "I'm not sure you can lock on that" error message. So we'd > probably need to be okay letting that go. Fortunately, I'm not sure it's a > great loss -- lots of code today seems happy using 'Object' when it means, > informally, "object that I've created for the sole purpose of locking". > I'm confused, because it seems like we'd be throwing out an awful lot here. If I pass a value object to `identityHashCode` we'd rather that didn't compile. Seems like this list goes on a long way. 
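[Editorial sketch, not part of the original message.] The compile-time rejection asked for above can be approximated in today's Java by declaring a marker interface locally. Everything here is an assumption for illustration: the JDK has no `IdentityObject` type, `Account` and `Money` are invented, and the record merely plays the role of an identityless type (it is still an identity object at run time).

```java
// Stand-in marker interface declared locally; the JDK has no such type.
interface IdentityObject {}

// A class that statically declares it has identity.
final class Account implements IdentityObject {}

// Plays the role of a value-like type: it does not implement IdentityObject.
record Money(long cents) {}

final class TypedIdentityDemo {
    // Accepts only arguments whose static type declares identity.
    static int identityHash(IdentityObject o) {
        return System.identityHashCode(o);
    }

    public static void main(String[] args) {
        Account account = new Account();
        // Same object, so both calls report the same identity hash.
        System.out.println(identityHash(account) == System.identityHashCode(account)); // prints "true"
        // identityHash(new Money(5)); // would not compile: Money is not an IdentityObject
    }
}
```

The parameter type does the work: no run-time check is needed, which is the kind of guarantee that is lost if identity-ness is tracked only by modifiers and not by types.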
- Type variable bounds: this one seems more achievable, by using the > 'value' and 'identity' keywords to indicate a new kind of bounds check > (''). Again, it's added complexity, but it's > more localized. We should think more about the use cases, and decide if it > passes the cost/benefit analysis. If not, nothing else depends on this, so > it could be dropped. (Or left to a future, more general feature?) > Don't we already need `Foo` though? Adding this too seems *super* confusing to me. Let types do what types already do. > - Documentation: we've lost the handy javadoc location to put some > explanations about identity & value objects in a place that curious > programmers can easily stumble on. Anything we want to say needs to go in > JLS/JVMS (or perhaps the java.lang.Object javadoc). > Going beyond "mere" documentation: capturing capabilities and constraints is precisely what types are *for*. Isn't being able to determine those behaviors from types the reason people choose a strongly typed language in the first place? - Compatibility: pretty clear win here. No interface injection means tools > that depend on reflection results won't be broken. (We've found a > significant number of these problems in our own code/tests, FWIW.) No new > static types means inference results won't change. > Seems less "breaking compatibility" than just "Hyrum's Law". But I lack understanding of how widespread or hard to fix these problems are. We could maybe do an experiment over here if necessary. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com From kevinb at google.com Wed Mar 23 04:42:53 2022 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 22 Mar 2022 21:42:53 -0700 Subject: The interfaces IdentityObject and ValueObject must die ! 
In-Reply-To: References: <1984046756.4249884.1643146765320.JavaMail.zimbra@u-pem.fr> <94716CB6-B117-42DE-91F5-77B33485C514@oracle.com> <547491109.4463966.1643188734104.JavaMail.zimbra@u-pem.fr> <059B1AC6-683E-424E-8CCB-E29AB0BCBB6B@oracle.com> <5079388C-93D2-42EA-BA99-F8312ABF26B9@oracle.com> Message-ID: On Wed, Jan 26, 2022 at 5:14 PM John Rose wrote: > A. I am aiming for new Object().getClass() == Object.class. > That you can do `new Object()` at all looks like the proverbial bathwater here. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com From daniel.smith at oracle.com Wed Mar 23 04:52:25 2022 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 23 Mar 2022 04:52:25 +0000 Subject: Alternative to IdentityObject & ValueObject interfaces In-Reply-To: References: Message-ID: On Mar 22, 2022, at 7:21 PM, Dan Heidinga > wrote: A couple of comments on the encoding and questions related to descriptors. JVM proposal: - Same conceptual framework. - Classes can be ACC_VALUE, ACC_IDENTITY, or neither. - Legacy-version classes are implicitly ACC_IDENTITY. Legacy interfaces are not. Optionally, modern-version concrete classes are also implicitly ACC_IDENTITY. Maybe this is too clever, but if we added ACC_VALUE and ACC_NEITHER bits, then any class without one of the bits set (including all the legacy classes) are identity classes. (Trying out this alternative approach to abstract classes: there's no more ACC_PERMITS_VALUE; instead, legacy-version abstract classes are automatically ACC_IDENTITY, and modern-version abstract classes permit value subclasses unless they opt out with ACC_IDENTITY. It's the bytecode generator's responsibility to set these flags appropriately. Conceptually cleaner, maybe too risky...) With the "clever" encoding, every class is implicitly identity unless it sets ACC_VALUE or ACC_NEITHER and bytecode generators have to explicitly flag modern abstract classes. This is kind of growing on me. 
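[Editorial sketch, not part of the original messages.] The "clever" encoding quoted above, in which a class with neither bit set is an identity class, can be written out as a small decoder. The numeric bit values are assumptions (the thread assigns none), and treating both bits together as an error is also an assumption.

```java
// Hypothetical decoder for the quoted "clever" encoding.
final class CleverEncoding {
    static final int ACC_VALUE = 0x1;   // assumed bit value, not from the thread
    static final int ACC_NEITHER = 0x2; // assumed bit value, not from the thread

    enum Kind { IDENTITY, VALUE, NEITHER }

    static Kind classify(int flags) {
        boolean value = (flags & ACC_VALUE) != 0;
        boolean neither = (flags & ACC_NEITHER) != 0;
        if (value && neither) {
            // Assumption: setting both bits is rejected as contradictory.
            throw new IllegalArgumentException("ACC_VALUE and ACC_NEITHER both set");
        }
        if (value) return Kind.VALUE;
        if (neither) return Kind.NEITHER;
        return Kind.IDENTITY; // no bits set: every legacy class lands here
    }

    public static void main(String[] args) {
        System.out.println(classify(0));           // prints "IDENTITY"
        System.out.println(classify(ACC_VALUE));   // prints "VALUE"
        System.out.println(classify(ACC_NEITHER)); // prints "NEITHER"
    }
}
```

Because the decoder looks only at the flag bits, a legacy interface with no bits set would also decode as `IDENTITY`, so the default would have to depend on whether the flags belong to a class or an interface.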
A problem is that interfaces are ACC_NEITHER by default, not ACC_IDENTITY. Abstract classes and interfaces have to get two different behaviors based on the same 0 bits. Here's another more stable encoding, though, that feels less fiddly to me than what I originally wrote: ACC_VALUE means "allows value object instances" ACC_IDENTITY means "allows identity object instances" If you set *both*, you're a "neither" class/interface. (That is, you allow both kinds of instances.) If you set *none*, you get the default/legacy behavior implicitly: classes are ACC_IDENTITY only, interfaces are ACC_IDENTITY & ACC_VALUE. From daniel.smith at oracle.com Wed Mar 23 05:34:11 2022 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 23 Mar 2022 05:34:11 +0000 Subject: Alternative to IdentityObject & ValueObject interfaces In-Reply-To: References: Message-ID: <67EC9E3F-5399-47B6-B297-5F413823F5E3@oracle.com> On Mar 22, 2022, at 7:44 PM, Kevin Bourrillion > wrote: On Tue, Mar 22, 2022 at 4:56 PM Dan Smith > wrote: In response to some encouragement from Remi, John, and others, I've decided to take a closer look at how we might approach the categorization of value and identity classes without relying on the IdentityObject and ValueObject interfaces. (For background, see the thread "The interfaces IdentityObject and ValueObject must die" in January.) Could anyone summarize the strongest version of the argument against them? The thread is not too easy to follow. I'm sure there's more, but here's my sense of the notable problems with the status quo approach: - We're adding a marker interface to every concrete class in the Java universe. Generally, an extra marker interface shouldn't affect anything, but the Java universe is big, and we're bound to break some things (specifically by changing reflection behavior and by producing more compile-time intersection types). 
We can ask people to fix their code and make fewer assumptions, but it adds upgrade friction, and the budget for breaking stuff is not unlimited. - Injecting superinterfaces is something entirely new that I think JVMs would really rather not be involved with. But it's necessary for compatibly evolving class files. We've spent a surprising amount of time working out exactly when the interfaces should be injected; separate compilation leads to tricky corner cases. - There's a tension between our use of modifiers and our use of interfaces. We've made some ad hoc choices about which are used in which places (e.g., you can't declare a concrete value class by saying 'class Foo implements ValueObject'). In the JVM, we need modifiers for format checking and interfaces for types. This is all okay, but the arbitrariness and redundancy of it is unsatisfying and suggests there might be some accidental complexity. - Subclass restriction: 'implements IdentityObject' has been replaced with the 'identity' modifier. Complexity cost of special modifiers seems on par with the complexity of special rules for inferring and checking the superinterfaces. The rules for the modifiers are okay. But here's my observation. The simplest way to explain those rules would be if the `value` keyword is literally shorthand for `extends/implements ValueObject`. I think the rules fall out from that, plus: * IO and VO are disjoint. (As interfaces can already be, like `interface Foo { int x(); }` and `interface Bar { boolean x(); }`, and if it really came down to it, you could literally put an incompatible method into each type and blame their noncohabitation on that :-)) * A class that breaks the value class rules has committed to being an identity class. * We wouldn't know how to make an instance that is "neither", so instantiating a "neither" class has to have default behavior, and that has to be to give you what it always has. 
In each case I've explained why the rule seems very easy to understand to me. So from my POV, this still pulls me back to the types anyway. I would say that your rules for the modifiers are largely simulating those types. Yes, it is nice how we get inheritance for free from interfaces. But when you compare that with the "plus" list (which I'd summarize as: disjointedness, declaration restrictions, and inference), it's not like getting inheritance "for free" is such a huge win. It's maybe 20% less complexity or something to explain the feature. Of course the big win is that interfaces are types, so we already know how to use them in the static type system. As your later comments suggest, I think our expectations for static typing are probably the most important factor in deciding which strategy best meets our needs. From maurizio.cimadamore at oracle.com Wed Mar 23 10:23:26 2022 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 23 Mar 2022 10:23:26 +0000 Subject: Alternative to IdentityObject & ValueObject interfaces In-Reply-To: References: Message-ID: <58d81e08-5fc1-bb3e-3710-adac2c14dc2d@oracle.com> On 22/03/2022 23:56, Dan Smith wrote: > Other abstract classes and interfaces are fine being neither (thus supporting both kinds of subclasses). I feel that for such a proposal to be really useful (but that's true for the interface-based approach as well IMHO), you need a way for the _use site_ to attach an identity vs. value annotation to types that can feature both polarities (Object, interfaces, value-compatible abstract classes). It's perfectly fine to have identity vs. non-identity as a declaration property, for the cases where that works. E.g. an ArrayList instance will always have identity. An instance of a `value class Point` will always be identity-less. Using modifiers vs. marker interfaces here is mostly an isomorphic move (and I agree that adding modifiers has less impact w.r.t. compatibility). 
But it feels like both interfaces and decl-site modifiers fall short of having a consistent story for the "neither" case. I feel we'd like programmers to be able to say things like: ``` class Foo { identity Object lock; void runAction(identity Runnable action) { ... } } ``` So, I believe the modifier idea has better potential than marker interfaces, because it scales at the use site in ways that marker interfaces can't (unless we allow intersection types in declarations). But of course I get that adding a new use-site modifier (like `final`) is also not to be taken lightly; aside from grammar conundrums, as you say it will have to be tracked by the type system. Stepping back, you list 4 use cases: > - Dynamic detection > > - Subclass restriction > > - Variable types > > - Type variable bounds IMHO, they are not equally important. And once you give up on "variable types" (as explained above, I believe this use case is not adequately covered by any proposal I've seen), then there's a question of how much incremental value the other use cases add. Dynamic detection can be added cheaply, fine. I also think that, especially in the context of universal generics, we do need a way to say: "this type variable is legacy/identity only" - but that can also be done quite cheaply. IMHO, restricting subclasses doesn't buy much, if you then don't have an adequate way to restrict type declarations at the use sites (for those things that cannot be restricted at the declaration), so I'd be also tempted to leave that use case alone as YAGNI (by teaching developers that synchronizing on Object and interface types is wrong, as we've been already trying to do). P.S. While writing this, a question came up: let's say I have a generic class like this: ``` class IdentityBox { ... } ``` Is IdentityBox a well-formed parameterized type? Based on your description I'm not sure: Runnable has the "neither" polarity, but T expects "identity". With marker interfaces this will just not work. 
With modifiers we could perhaps allow it with an unchecked warning? I think it's important that the above type remains legal: I'd expect users to mark their type-variables as "identity" in cases where they just can't migrate the class implementation to support universal type variables. But if that decision results in a source incompatible change (w.r.t. existing parameterizations such as IdentityBox), then it doesn't look like a great story migration-wise. Maurizio From brian.goetz at oracle.com Wed Mar 23 12:01:20 2022 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 23 Mar 2022 08:01:20 -0400 Subject: Alternative to IdentityObject & ValueObject interfaces In-Reply-To: References: Message-ID: <876539ea-6ded-83a6-a3c1-22cc4d3f5d0b@oracle.com> Thanks Dan for putting the work in to provide a credible alternative. Let me add some background for how we came up with these things. At some point we asked ourselves, what if we had identity and value classes from day 1? How would that affect the object model? And we concluded at the time that we probably wouldn't want the identity-indeterminacy of Object, but instead would want something like abstract class Object class IdentityObject extends Object { } abstract class ValueObject extends Object { } So the {Identity,Value}Object interfaces seemed valuable pedagogically, in that they make the object hierarchy reflect the language division. At the time, we imagined there might be methods that apply to all value objects, that could live in ValueObject. A separate factor is that we were taking operations that were previously total (locking, weak refs) and making them partial. This is scary! So we wanted a way to make these expressible in the static type system. Unfortunately, the interfaces do not really deliver on either goal, because we can't turn back time. We still have to deal with `new Object()`, so we can't (yet) make Object abstract. 
Many signatures will not be changeable from "Object" to "IdentityObject" for reasons of compatibility, unless we make IdentityObject erase to Object (which has its own problems). If people use it at all for type bounds, we'll see lots of uses of `Foo<T extends Bar&IdentityObject>`, which will put more pressure on our weak support for intersection types. And dynamic errors will still happen, because too much of the world was built using signatures that don't express identity-ness. (Kevin will see a parallel to introducing nullness annotations; it might be fine if you build the world that way from scratch, but the transition is painful when you have to interpret an unadorned type as "of unspecified identity-ness.")

Several years on, we're still leaning on the same few motivating examples -- capturing things like "I might lock this" in the type system. That we haven't come up with more killer examples is notable. And I grow increasingly skeptical of the value of the locking example, both because this is not how concurrent code is written, and because we *still* have to deal with the risk of dynamic errors because most of the world's code has not been (and will not be) written to use IdentityObject throughout.

As Dan points out, the main thing we give up by backing off from these interfaces is the static typing; we don't get to use `IdentityObject` as a parameter type, return type, or type bound. And the only reason we've come up with so far to want that is a pretty lame one -- locking.

From a language design perspective, I find it pretty leaky that you declare a class with `value class`, but you express the subclassing constraint with `extends IdentityObject`.

On 3/22/2022 7:56 PM, Dan Smith wrote:
> In response to some encouragement from Remi, John, and others, I've decided to take a closer look at how we might approach the categorization of value and identity classes without relying on the IdentityObject and ValueObject interfaces.
>
> (For background, see the thread "The interfaces IdentityObject and ValueObject must die" in January.)
>
> These interfaces have found a number of different uses (enumerated below), while mostly leaning on the existing functionality of interfaces, so there's a pretty good complexity vs. benefit trade-off. But their use has some rough edges, and inserting them everywhere has a nontrivial compatibility impact. Can we do better?
>
> Language proposal:
>
> - A "value class" is any class whose instances are all value objects. An "identity class" is any class whose instances are all identity objects. Abstract classes can be value classes or identity classes, or neither. Interfaces can be "value interfaces" or "identity interfaces", or neither.
>
> - A class/interface can be designated a value class with the 'value' modifier.
>
> value class Foo {}
> abstract value class Bar {}
> value interface Baz {}
> value record Rec(int x) {}
>
> A class/interface can be designated an identity class with the 'identity' modifier.
>
> identity class Foo {}
> abstract identity class Bar {}
> identity interface Baz {}
> identity record Rec(int x) {}
>
> - Concrete classes with neither modifier are implicitly 'identity'; abstract classes with neither modifier, but with certain identity-dependent features (instance fields, initializers, synchronized methods, ...) are implicitly 'identity' (possibly with a warning). Other abstract classes and interfaces are fine being neither (thus supporting both kinds of subclasses).
>
> - The properties are inherited: if you extend a value class/interface, you are a value class/interface. (Same for identity classes/interfaces.) It's an error to be both.
>
> - The usual restrictions apply to value classes, both concrete and abstract; and also to "neither" abstract classes, if they haven't been implicitly made 'identity'.
>
> - An API ('Object.isValueObject()'?) allows for dynamically distinguishing between value objects and identity objects.
> The reflection API (in java.lang.Class) allows for detection of value classes/interfaces, identity classes/interfaces, and "neither" classes/interfaces.
>
> - TBD whether/how we track these properties statically so that the type system can catch mismatches between non-identity class types and uses that assume identity.
>
> JVM proposal:
>
> - Same conceptual framework.
>
> - Classes can be ACC_VALUE, ACC_IDENTITY, or neither.
>
> - Legacy-version classes are implicitly ACC_IDENTITY. Legacy interfaces are not. Optionally, modern-version concrete classes are also implicitly ACC_IDENTITY.
>
> (Trying out this alternative approach to abstract classes: there's no more ACC_PERMITS_VALUE; instead, legacy-version abstract classes are automatically ACC_IDENTITY, and modern-version abstract classes permit value subclasses unless they opt out with ACC_IDENTITY. It's the bytecode generator's responsibility to set these flags appropriately. Conceptually cleaner, maybe too risky...)
>
> - At class load time, we inherit value/identity-ness and check for conflicts. It's okay to have neither flag set but inherit the property from one of your supers. We also enforce constraints on value classes and "neither" abstract classes.
>
> ---
>
> So how does this score as a replacement for the list of features enabled by the interfaces?
>
> - Dynamic detection: 'obj instanceof ValueObject' is quite straightforward; if we can replace that with 'obj.isValueObject()', that feels about equally useful. (I'd be more pessimistic about something like 'Objects.isValueObject(obj)'.)
>
> - Subclass restriction: 'implements IdentityObject' has been replaced with the 'identity' modifier. Complexity cost of special modifiers seems on par with the complexity of special rules for inferring and checking the superinterfaces. I think it's a win that we use the 'value' modifier and "value" terminology for all kinds of classes/interfaces, not just concrete classes.
>
> - Variable types: I don't see a good way to get the equivalent of an 'IdentityObject' type. It would involve tracking the 'identity' property through the whole type system, which seems like a huge burden for the occasional "I'm not sure you can lock on that" error message. So we'd probably need to be okay letting that go. Fortunately, I'm not sure it's a great loss -- lots of code today seems happy using 'Object' when it means, informally, "object that I've created for the sole purpose of locking".
>
> - Type variable bounds: this one seems more achievable, by using the 'value' and 'identity' keywords to indicate a new kind of bounds check ('<identity T extends Runnable>'). Again, it's added complexity, but it's more localized. We should think more about the use cases, and decide if it passes the cost/benefit analysis. If not, nothing else depends on this, so it could be dropped. (Or left to a future, more general feature?)
>
> - Documentation: we've lost the handy javadoc location to put some explanations about identity & value objects in a place that curious programmers can easily stumble on. Anything we want to say needs to go in JLS/JVMS (or perhaps the java.lang.Object javadoc).
>
> - Compatibility: pretty clear win here. No interface injection means tools that depend on reflection results won't be broken. (We've found a significant number of these problems in our own code/tests, FWIW.) No new static types means inference results won't change. There's less risk of incompatibilities when adding/removing the 'identity' and 'value' keywords (although there can still be source, binary, and behavioral incompatibilities).
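
As a rough illustration of the class-load-time rule in the proposal quoted above (inherit value/identity-ness from supers, and reject a class that ends up as both), here is a minimal Java sketch. `Kind`, `ClassModel`, and `effectiveKind` are hypothetical names invented for this illustration; they are not part of the JDK or of any Valhalla prototype, and the sketch deliberately ignores the implicit-identity rules for legacy and concrete classes.

```java
import java.util.List;

public class KindInheritanceSketch {
    // Hypothetical three-way classification; not a real JDK type.
    enum Kind { IDENTITY, VALUE, NEITHER }

    // Hypothetical stand-in for a class or interface: its declared flag plus its direct supers.
    record ClassModel(String name, Kind declared, List<ClassModel> supers) {}

    // Merge the declared flag with whatever is inherited from the supers.
    // A class that reaches both IDENTITY and VALUE is rejected ("it's an error to be both").
    static Kind effectiveKind(ClassModel c) {
        Kind result = c.declared();
        for (ClassModel s : c.supers()) {
            Kind inherited = effectiveKind(s);
            if (inherited == Kind.NEITHER) {
                continue; // a "neither" super contributes nothing
            }
            if (result == Kind.NEITHER) {
                result = inherited; // no flag of our own: inherit the property
            } else if (result != inherited) {
                throw new IllegalStateException(
                        c.name() + " would be both " + result + " and " + inherited);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ClassModel object = new ClassModel("Object", Kind.NEITHER, List.of());
        ClassModel baz = new ClassModel("Baz", Kind.VALUE, List.of());
        ClassModel bar = new ClassModel("Bar", Kind.IDENTITY, List.of(object));

        // No flag of its own, but inherits VALUE from the value interface Baz.
        ClassModel point = new ClassModel("Point", Kind.NEITHER, List.of(object, baz));
        System.out.println(point.name() + ": " + effectiveKind(point));

        // Extends an identity class and implements a value interface: conflict.
        ClassModel broken = new ClassModel("Broken", Kind.NEITHER, List.of(bar, baz));
        try {
            effectiveKind(broken);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The only design point the sketch tries to capture is that "neither" is not a third property to be inherited; it is the absence of one, which is why a "neither" super never causes a conflict.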
From forax at univ-mlv.fr Wed Mar 23 12:43:28 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 23 Mar 2022 13:43:28 +0100 (CET)
Subject: Alternative to IdentityObject & ValueObject interfaces
In-Reply-To: <876539ea-6ded-83a6-a3c1-22cc4d3f5d0b@oracle.com>
References: <876539ea-6ded-83a6-a3c1-22cc4d3f5d0b@oracle.com>
Message-ID: <629162988.345586.1648039408725.JavaMail.zimbra@u-pem.fr>

Hi Brian,
I may have a twisted mind, but I read your email as a rebuttal of both IdentityObject/ValueObject and identity/value modifiers.

As you said, an identity object and a value object are less dissimilar now than they were in the past: a value class now reuses the equals and hashCode methods of j.l.Object instead of coming with its own definition, and a value class is now nullable.

I agree with you that synchronized is not a real issue, so as Dan H. said, the real remaining issue is weak refs.

Now, if there are such small differences between an identity object and a value object, do we really need to introduce a mechanism to separate them in terms of typing?

Rémi

From forax at univ-mlv.fr Wed Mar 23 12:54:38 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 23 Mar 2022 13:54:38 +0100 (CET)
Subject: Alternative to IdentityObject & ValueObject interfaces
In-Reply-To: <58d81e08-5fc1-bb3e-3710-adac2c14dc2d@oracle.com>
References: <58d81e08-5fc1-bb3e-3710-adac2c14dc2d@oracle.com>
Message-ID: <713855993.351192.1648040078874.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Maurizio Cimadamore"
> To: "daniel smith", "valhalla-spec-experts"
> Sent: Wednesday, March 23, 2022 11:23:26 AM
> Subject: Re: Alternative to IdentityObject & ValueObject interfaces

> On 22/03/2022 23:56, Dan Smith wrote:
>> Other abstract classes and interfaces are fine being neither (thus supporting both kinds of subclasses).
>
> I feel that for such a proposal to be really useful (but that's true for the interface-based approach as well IMHO), you need a way for the _use site_ to attach an identity vs. value annotation to types that can feature both polarities (Object, interfaces, value-compatible abstract classes).
>
> It's perfectly fine to have identity vs. non-identity as a declaration property, for the cases where that works. E.g. an ArrayList instance will always have identity. An instance of a `value class Point` will always be identity-less. Using modifiers vs. marker interfaces here is mostly an isomorphic move (and I agree that adding modifiers has less impact w.r.t. compatibility).
>
> But it feels like both interfaces and decl-site modifiers fall short of having a consistent story for the "neither" case. I feel we'd like programmers to be able to say things like:
>
> ```
> class Foo {
>     identity Object lock;
>
>     void runAction(identity Runnable action) { ... }
> }
> ```
>
> So, I believe the modifier idea has better potential than marker interfaces, because it scales at the use site in ways that marker interfaces can't (unless we allow intersection types in declarations). But of course I get that adding a new use-site modifier (like `final`) is also not to be taken lightly; aside from grammar conundrums, as you say it will have to be tracked by the type system.
>
> Stepping back, you list 4 use cases:
>
>> - Dynamic detection
>>
>> - Subclass restriction
>>
>> - Variable types
>>
>> - Type variable bounds
> IMHO, they are not equally important. And once you give up on "variable types" (as explained above, I believe this use case is not adequately covered by any proposal I've seen), then there's a question of how much incremental value the other use cases add. Dynamic detection can be added cheaply, fine. I also think that, especially in the context of universal generics, we do need a way to say: "this type variable is legacy/identity only" - but that can also be done quite cheaply. IMHO, restricting subclasses doesn't buy much, if you then don't have an adequate way to restrict type declarations at the use sites (for those things that cannot be restricted at the declaration), so I'd also be tempted to leave that use case alone as YAGNI (by teaching developers that synchronizing on Object and interface types is wrong, as we've already been trying to do).
>
> P.S.
>
> While writing this, a question came up: let's say I have a generic class like this:
>
> ```
> class IdentityBox<identity T> { ... }
> ```
>
> Is IdentityBox<Runnable> a well-formed parameterized type?
> Based on your description I'm not sure: Runnable has the "neither" polarity, but T expects "identity". With marker interfaces this will just not work. With modifiers we could perhaps allow it with an unchecked warning?
>
> I think it's important that the above type remains legal: I'd expect users to mark their type-variables as "identity" in cases where they just can't migrate the class implementation to support universal type variables. But if that decision results in a source incompatible change (w.r.t. existing parameterizations such as IdentityBox<Runnable>), then it doesn't look like a great story migration-wise.

Yes! The neither types (Object, interfaces, abstract classes) act as an eraser of the identity|value bit if we do not support a use-site identity|value modifier, something like IdentityBox<identity Runnable>.

And given that there is already existing code in the wild that does not specify "identity" or "value", we need a kind of unsafe conversion/unsafe warning between the new world with use-site type annotations and the old world with no type annotations.

As Brian said to Kevin, it's a problem very similar to the introduction of a null type annotation; it will be painful.

>
> Maurizio

Rémi

From brian.goetz at oracle.com Wed Mar 23 13:58:02 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 23 Mar 2022 09:58:02 -0400
Subject: [External] : Re: Alternative to IdentityObject & ValueObject interfaces
In-Reply-To:
References: <876539ea-6ded-83a6-a3c1-22cc4d3f5d0b@oracle.com>
Message-ID: <69d6ef5b-110d-f74b-c68c-c82f8e0cbc55@oracle.com>

My thinking on this topic has evolved a bit. At first, we thought about conditional methods as being completely ad-hoc, such as:

    interface List {
        long sum();
    }

Here, the sum() method exists as an island in various specializations. This was practical in the first iteration of the template classfile mechanism (the "standing on segments" one), but adds a lot of complexity.
Our current thinking is that conditionality comes from constraint. That is, the rules about overloads are performed without respect to conditionality, and then the conditionality constraints restrict the members of a particular specialization. So the int-sums-to-long example is out, and instead we'd end up with the more parametric

    > T sum();

We likely need a better way of describing constraints. The obvious kinds of constraints are `T=int` and `T <: Comparable`, but these don't go far enough, nor are they likely to be flexible enough. The general way to describe a predicate on types is type classes. And "has identity" or "has no identity" are easily described by a built-in type class, without having to pollute the hierarchy.

The key here for purposes of specialization is that we be able to con(dy)jure a constant witness to the type class at the point of describing the specialized List, so that witness-provided behavior can be folded into the specialized implementation.

> During our various discussions, we've also used `IdentityObject` and `ValueObject` as constraints in the t-vars / parametric VM proposal to address methods that are only partially applicable. We've also talked about using that as a signal to allow locking and other identity-operations to compile inside generic code that we can statically know won't have to deal with values.
>
> Does giving up on having VO/IO in the type system change our approaches to either the parametric vm future or identity operations in generic code? It sounds like we're willing to give up on the second but I don't have a good sense of what this does to the parametric VM.
>
> --Dan
>

From daniel.smith at oracle.com Wed Mar 23 14:38:37 2022
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 23 Mar 2022 14:38:37 +0000
Subject: EG meeting, 2022-03-23
Message-ID: <26ACFAD2-83E2-44D4-AC51-ECD0FA05EB1F@oracle.com>

EG Zoom meeting today at *4pm* UTC (9am PDT, 12pm EDT).
Thanks for the feedback in the "Alternative to IdentityObject & ValueObject interfaces" thread. We can continue that discussion.

From daniel.smith at oracle.com Thu Mar 24 02:51:15 2022
From: daniel.smith at oracle.com (Dan Smith)
Date: Thu, 24 Mar 2022 02:51:15 +0000
Subject: Alternative to IdentityObject & ValueObject interfaces
In-Reply-To:
References:
Message-ID:

On Mar 22, 2022, at 5:56 PM, Dan Smith wrote:

- Variable types: I don't see a good way to get the equivalent of an 'IdentityObject' type. It would involve tracking the 'identity' property through the whole type system, which seems like a huge burden for the occasional "I'm not sure you can lock on that" error message. So we'd probably need to be okay letting that go. Fortunately, I'm not sure it's a great loss -- lots of code today seems happy using 'Object' when it means, informally, "object that I've created for the sole purpose of locking".

- Type variable bounds: this one seems more achievable, by using the 'value' and 'identity' keywords to indicate a new kind of bounds check ('<identity T extends Runnable>'). Again, it's added complexity, but it's more localized. We should think more about the use cases, and decide if it passes the cost/benefit analysis. If not, nothing else depends on this, so it could be dropped. (Or left to a future, more general feature?)

Per today's discussion, this part seems to be the central question: how much value can we expect to get out of compile-time checking?

Stepping back from the type system details (that is, the below discussion applies whether we're using interfaces, modifiers on types, or something else), it's worth asking what errors we hope these features will help detect. We identified a couple of them today (and earlier in this thread):

- 'synchronized' on a value object

- storing a value object in a weak reference (in a world in which weak references don't support value objects)

Two questions:

1) What are the requirements for the analysis? How effective can it be?
The type system is going to have three kinds of types:
- types that guarantee identity objects
- types that guarantee value objects
- types that include both kinds of objects

That third kind is a problem: we can specify checks with false positives (programmer knows the operation is safe, compiler complains anyway) or false negatives (operation isn't safe, but the compiler lets it go).

For example, for the 'synchronized' operation, it's straightforward for the compiler to complain on a value class type. But what do we do with 'synchronized' on some interface type?

Say we go the false positive route; the check probably looks like a warning ("you might be synchronizing on a value object"). In this case:

- We've just created a bunch of warnings in existing code that people will probably just @SuppressWarnings rather than try to address through the types, because changing the types (throughout the flow of data) is a lot of work and comes with compatibility risks.

- Even in totally new code, if I'm not working with a specific identity class, I'm not sure I would bother fiddling with the types to get better checking. It seems really tedious. (For example, changing an interface-typed parameter 'Foo' to intersection type 'Foo & IdentityObject'.)

If we prefer to allow false negatives, then it's straightforward: value class types get errors, other types do not. There's no need for extra type system features. (E.g., 'IdentityObject' and 'Object' get treated exactly the same by 'synchronized'.)

For weak references, it definitely doesn't make sense to reject types like WeakReference -- that would be a compatibility mess. We could warn, but again, lots of false positive risk; and warnings don't generalize to general-purpose use of generics. I think again the best we could hope to do is to reject value class types. But something like 'T extends IdentityObject' doesn't accomplish this, because it excludes the "both kinds" types.
Instead, we'd need something like 'T !extends ValueObject'.

2) Are these the best use cases we have? And are they really all that important?

These are the ones we've focused on, but maybe we can think of other applications. Other use cases would similarly have to involve the differences in runtime semantics.

Our two use cases share the property that they detect a runtime error (either an expression that we know will always throw, or with more aggressive checking an expression that *could* throw). That's helpful, but I do wonder how common such errors will be. We could do a bunch of type system work to detect division by zero, but nobody's asking for that because programmers just tend to avoid making that mistake already.

Synchronization: best practice is already to "own" the object being locked on, and that kind of knowledge isn't tracked by the type system. Doesn't seem that different for programmers to also be aware of whether their locking objects are identity objects without type system help.

Weak references: a WeakReference seems like an unlikely scenario -- why are you trying to manage GC for a value object? (Assuming we've provided an alternative API to manage references *within* value objects, do caching, etc.) So most runtime errors will fall into the WeakReference or WeakReference category, and again there's a trade-off here between detecting real errors and reporting a bunch of false positives.

From brian.goetz at oracle.com Thu Mar 24 12:46:44 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 24 Mar 2022 08:46:44 -0400
Subject: Alternative to IdentityObject & ValueObject interfaces
In-Reply-To:
References:
Message-ID: <0211ea4b-64ec-524d-5827-27131c46beeb@oracle.com>

On 3/23/2022 10:51 PM, Dan Smith wrote:
>> On Mar 22, 2022, at 5:56 PM, Dan Smith wrote:
>>
>> - Variable types: I don't see a good way to get the equivalent of an 'IdentityObject' type.
>> It would involve tracking the 'identity' property through the whole type system, which seems like a huge burden for the occasional "I'm not sure you can lock on that" error message. So we'd probably need to be okay letting that go. Fortunately, I'm not sure it's a great loss -- lots of code today seems happy using 'Object' when it means, informally, "object that I've created for the sole purpose of locking".
>>
>> - Type variable bounds: this one seems more achievable, by using the 'value' and 'identity' keywords to indicate a new kind of bounds check ('<identity T extends Runnable>'). Again, it's added complexity, but it's more localized. We should think more about the use cases, and decide if it passes the cost/benefit analysis. If not, nothing else depends on this, so it could be dropped. (Or left to a future, more general feature?)
>
> Per today's discussion, this part seems to be the central question: how much value can we expect to get out of compile-time checking?

This is indeed the question. There's both a "theory" and a "practice" aspect, too.

> The type system is going to have three kinds of types:
> - types that guarantee identity objects
> - types that guarantee value objects
> - types that include both kinds of objects
>
> That third kind is a problem: we can specify checks with false positives (programmer knows the operation is safe, compiler complains anyway) or false negatives (operation isn't safe, but the compiler lets it go).

Flowing the {Value,Identity}Object property is likely to require shoring up intersection types too, since we can express Runnable&IdentityObject as a type bound, but not as a denotable type. Var helps a little here but ultimately this is a hole through which information will drain.
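
To make the denotability gap concrete: released Java already accepts an intersection type as a type-variable bound or as a cast target, but not as the declared type of a field, parameter, or return value. The sketch below uses `Serializable` purely as a stand-in for the proposed `IdentityObject`, which does not exist in released Java; `IntersectionBoundSketch`, `describe`, and `counter` are names made up for this illustration.

```java
import java.io.Serializable;

public class IntersectionBoundSketch {
    static int counter = 0;

    // Legal: an intersection type used as a type-variable bound.
    static <T extends Runnable & Serializable> String describe(T task) {
        task.run();
        return "ran a Runnable & Serializable";
    }

    // Not legal Java: an intersection type as a declared field or parameter type.
    // Runnable & Serializable pending;   // does not compile

    public static void main(String[] args) {
        // A cast may name the intersection, and 'var' keeps it for this one local variable.
        var task = (Runnable & Serializable) () -> counter++;
        System.out.println(describe(task));

        // Once the value flows into a denotable type, half of the information is gone.
        Runnable widened = task;
        widened.run();
        // describe(widened);   // does not compile: a plain Runnable is not known to be Serializable
        System.out.println("counter = " + counter);
    }
}
```

The commented-out lines are where the information "drains": the property survives only as long as every variable on the path is either generic with the right bound or declared with `var`.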
The arguments you make here are compelling to me, that while it might work in theory, in practice there are too many holes:

 - Legacy code that already deals in Object / interfaces and is not going to change
 - Even in new code, I suspect people will continue to do so, because as you say, it is tedious for marginal value
 - The lack of intersection types will make it worse
 - Because of the above, many of the errors would be more like warnings, making it even weaker

All of this sounds like a recipe for "new complexity that almost no one will actually use."

From forax at univ-mlv.fr Thu Mar 24 14:00:29 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 24 Mar 2022 15:00:29 +0100 (CET)
Subject: Alternative to IdentityObject & ValueObject interfaces
In-Reply-To: <0211ea4b-64ec-524d-5827-27131c46beeb@oracle.com>
References: <0211ea4b-64ec-524d-5827-27131c46beeb@oracle.com>
Message-ID: <1224883993.1109132.1648130429286.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "daniel smith", "valhalla-spec-experts"
> Sent: Thursday, March 24, 2022 1:46:44 PM
> Subject: Re: Alternative to IdentityObject & ValueObject interfaces

> On 3/23/2022 10:51 PM, Dan Smith wrote:

>>> On Mar 22, 2022, at 5:56 PM, Dan Smith <daniel.smith at oracle.com> wrote:

>>> - Variable types: I don't see a good way to get the equivalent of an
>>> 'IdentityObject' type. It would involve tracking the 'identity' property
>>> through the whole type system, which seems like a huge burden for the
>>> occasional "I'm not sure you can lock on that" error message. So we'd probably
>>> need to be okay letting that go. Fortunately, I'm not sure it's a great
>>> loss -- lots of code today seems happy using 'Object' when it means, informally,
>>> "object that I've created for the sole purpose of locking".
>>> - Type variable bounds: this one seems more achievable, by using the 'value' and
>>> 'identity' keywords to indicate a new kind of bounds check ('<identity T
>>> extends Runnable>'). Again, it's added complexity, but it's more localized. We
>>> should think more about the use cases, and decide if it passes the cost/benefit
>>> analysis. If not, nothing else depends on this, so it could be dropped. (Or
>>> left to a future, more general feature?)

>> Per today's discussion, this part seems to be the central question: how much
>> value can we expect to get out of compile-time checking?

> This is indeed the question. There's both a "theory" and a "practice" aspect,
> too.

>> The type system is going to have three kinds of types:
>> - types that guarantee identity objects
>> - types that guarantee value objects
>> - types that include both kinds of objects

>> That third kind are a problem: we can specify checks with false positives
>> (programmer knows the operation is safe, compiler complains anyway) or false
>> negatives (operation isn't safe, but the compiler lets it go).

> Flowing {Value,Identity}Object property is likely to require shoring up
> intersection types too, since we can express Runnable&IdentityObject as a type
> bound, but not as a denotable type. Var helps a little here but ultimately this
> is a hole through which information will drain.

> The arguments you make here are compelling to me, that while it might work in
> theory, in practice there are too many holes:

> - Legacy code that already deals in Object / interfaces and is not going to
> change
> - Even in new code, I suspect people will continue to do so, because as you say,
> it is tedious for marginal value
> - The lack of intersection types will make it worse
> - Because of the above, many of the errors would be more like warnings, making
> it even weaker

> All of this sounds like a recipe for "new complexity that almost no one will
> actually use."
I agree, so if we drop the idea of having identity vs value info in the type system, the follow-up question is "should we restrict inheritance or not?"

Classes are tagged with value or not, and an abstract class or an interface by default allows both value types and identity types as subtypes. Do we need more, i.e. be able to restrict subtypes of an abstract class/interface to be value types (or identity types) only?

Yesterday, Dan S. talked about a user being able to restrict a hierarchy to be identity classes only. This will not help already existing code but may help new code: instead of having IdentityObject in the JDK, let a user define his own interface that plays the same role as IdentityObject but is tailored to his problem.

Or do we consider that even that use case is not worth its own weight?

Rémi

From daniel.smith at oracle.com Thu Mar 31 21:48:29 2022
From: daniel.smith at oracle.com (Dan Smith)
Date: Thu, 31 Mar 2022 21:48:29 +0000
Subject: Object as a concrete class
Message-ID:

One of our requirements has been that 'new Object()' must be re-interpreted (both at compile time and run time) to instantiate some other class -- Object is effectively abstract. The motivation here is that every class instance must be identified as an identity object or a value object, and the mechanism for that is the corresponding class declaration. But if 'Object' were an identity class, then no value class could extend it.

That is, this code needs to work:

    assert new Object() instanceof IdentityObject;
    assert new Point(1,2) instanceof ValueObject;

*However*, as Remi was eager to pursue a while ago, in a world in which class modifiers, not superinterfaces, convey the identity/value distinction, we're no longer so closely tied to class declarations, and it becomes easier to make Object a special case.
This code still needs to work:

    assert new Object().hasIdentity();
    assert !new Point().hasIdentity();

But the 'hasIdentity' method can contain arbitrary logic, and doesn't necessarily need to correlate with 'getClass().isIdentityClass()'. So we could have a world in which some objects are instances of a concrete class that is neither an identity class nor a value class, but where those objects are still identity objects.

I don't see a useful way to generalize this to other "both kinds" classes (for example, any class with an instance field must be an identity class or a value class). But since we have to make special allowances for Object one way or another, it does seem plausible that we let 'new Object()' continue to create direct instances of class Object, and then specify the following special rules:

- All concrete, non-value classes are implicitly identity classes *except for Object*
- The 'new' bytecode is allowed to be used with concrete identity classes *and class Object*
- Identity objects include instances of identity classes, arrays, *and instances of Object*; 'hasIdentity' reflects this
- [anything else?]

There's some extra complexity here, but balanced against the cost of making every Java programmer adjust their model of what 'new Object()' means, and corresponding coding style refactorings, it seems like a win.

Thoughts?

From kevinb at google.com Thu Mar 31 23:48:13 2022
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 31 Mar 2022 16:48:13 -0700
Subject: Object as a concrete class
In-Reply-To:
References:
Message-ID:

So either way, I'd still really want for `new Object()` to get a warning, and for Object to eventually become literally abstract (how long from now, I don't as much care).

Why: the fact that Object isn't abstract actually detracts from people's understanding of both abstractness and Object. By rights it should be; abstract means lacking, and it's lacking any distinguishable type or behavior.

Could we? Do that?
If we can, then does it matter as much what we do here? It seems to me that most people will fix the warning like: `new Object() {}`, and good on them, but we could just do something roughly equivalent for them until they do...
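
For comparison, here is a small model of both ideas in current Java (a sketch under stated assumptions, not the proposed JDK API). The static helpers isIdentityClass and hasIdentity are hypothetical stand-ins for the reflective methods named in Dan's message, and the ValueLike marker interface stands in for the 'value' class modifier, which released Java lacks. The model only covers concrete classes; abstract classes and interfaces are deliberately left out. It encodes the special rules for Object and shows that `new Object() {}` creates an instance of an anonymous subclass, which is an ordinary identity class under those rules.

```java
import java.lang.reflect.Modifier;

final class ObjectIdentityModel {
    /** Hypothetical stand-in for the 'value' class modifier. */
    interface ValueLike {}

    /** Pretend value class: treated as having no identity in this model. */
    static final class Point implements ValueLike {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    /** Concrete, non-value classes are identity classes, except for Object. */
    static boolean isIdentityClass(Class<?> c) {
        if (c == Object.class) return false;                 // the special case
        if (c.isInterface() || Modifier.isAbstract(c.getModifiers())) {
            return false;                                    // not modeled here
        }
        return !ValueLike.class.isAssignableFrom(c);
    }

    /** Identity objects: instances of identity classes, arrays, and direct instances of Object. */
    static boolean hasIdentity(Object o) {
        Class<?> c = o.getClass();
        return c == Object.class || c.isArray() || isIdentityClass(c);
    }

    public static void main(String[] args) {
        Object plain = new Object();     // direct instance of Object
        Object anon = new Object() {};   // instance of an anonymous subclass

        System.out.println(hasIdentity(plain));                  // identity object...
        System.out.println(isIdentityClass(plain.getClass()));   // ...of a class that is not an identity class
        System.out.println(anon.getClass() == Object.class);     // the anonymous class is a different class
        System.out.println(isIdentityClass(anon.getClass()));
        System.out.println(hasIdentity(new int[0]));
        System.out.println(hasIdentity(new Point(1, 2)));
    }
}
```

In this model the only place where hasIdentity and isIdentityClass disagree for a non-array object is a direct instance of Object, which is exactly the special case that the `new Object() {}` rewrite avoids.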