From brian.goetz at oracle.com Mon Dec 7 20:55:51 2015
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 7 Dec 2015 15:55:51 -0500
Subject: Welcome to valhalla-spec-experts
Message-ID: <5665F257.2070007@oracle.com>

Welcome to valhalla-spec-experts! This list is an OpenJDK-hosted
precursor to (hopefully) an eventual JCP Expert Group. The initial
membership includes representation from Oracle, IBM, JetBrains, Red Hat,
and Google, as well as individuals from the OpenJDK and Scala
communities.

Project Valhalla echoes the approach of Project Lambda, which started
with a single, simple-sounding idea (add lambda expressions to the Java
language) -- but the consequences of adding this "simple feature" were
anything but simple, sending ripples throughout the JVM, language, and
libraries. In addition to the many technical challenges, the overriding
stewardship challenge was to find the natural scope for these ripples;
damping them too aggressively would result in a key feature looking
"nailed onto the side", but at the same time, we needed to avoid getting
too carried away by the excitement of adding cool new features "because
we can" -- because otherwise the project would have exceeded its
delivery and complexity budgets.

Project Valhalla's core feature is adding user-definable value types to
the JVM type system
(http://cr.openjdk.java.net/~jrose/values/values-0.html). The
consequences of adding this simple-sounding feature, however, are deep
and far-ranging, rippling into the language type system, generics, the
inheritance model, the bytecode set, and the core library APIs.

To illustrate: if we had value types, but couldn't instantiate generic
types with a value (i.e., no ArrayList<int> without appealing to
boxing), this would be pretty useless; if boxing were a good enough
solution here, we wouldn't even have bothered with value types. So
pulling on the "value types" string, we get the need to generify over
both references and values.
And pulling on that string some more, we discover that our existing
Collections library has methods (e.g., Map.get()) which bake in
assumptions that "T is always an Object", leading us to a reexamination
of these core APIs. Which in turn leads us to exploring additional
features for API migration, which becomes more relevant as our core APIs
approach being old enough to drink.

Similarly, as we add value types, there are some core classes (e.g.,
LocalDateTime, Optional) which always wanted to be values, and would
benefit tremendously from being so. It therefore becomes desirable to
migrate these to be value types, in a way that's both binary and source
compatible with existing clients and subclasses.

We've been exploring the bounds of this problem space for more than a
year now, and we think we're approaching a reasonable understanding of
the scope and breadth of the various design tradeoffs. Over the next few
weeks, I'm going to pick some initial issues for discussion and try to
frame them so that their overlap with other aspects is minimized. Stay
tuned!

From brian.goetz at oracle.com Wed Dec 16 17:18:27 2015
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 16 Dec 2015 12:18:27 -0500
Subject: Migrating methods in Collections
Message-ID: <56719CE3.1030600@oracle.com>

I'd like to start with the "What about Collections" discussion. While
this topic is inextricably linked to all the other topics (value types,
generic specialization, nullability, etc), to avoid getting overwhelmed,
let's start by trying to keep our focus on the Collections API and the
process of migrating it into an anyfied world.

So, I'll offer an ahead-of-time warning: there's going to be lots here
that you will want to question, and to many of those questions I'm going
to answer "OK, but let's come back to that later", because I'd like to
stay focused for the time being on the Collection-related issues.
(Let's also try to create separate threads for separate discussion
topics, though I know this is harder to remember to do.)

Most of these are already prototyped in some form in the Valhalla repos
(which at least demonstrates that they are plausible.)

Background Material
-------------------

The last "State of the Specialization"
(http://cr.openjdk.java.net/~briangoetz/valhalla/specialization.html,
Dec 2014) outlines an early approach towards the specialization
challenges and features. My talk at JVMLS 2015
(https://www.youtube.com/watch?v=uNgAFSUXuwc) outlines the progress that
had been made between last December and August. The State of the Values
(http://cr.openjdk.java.net/~jrose/values/values-0.html) outlines the
basic model for value types. If you've not read/viewed these, that's a
good place to start.

Core Goals
----------

The key requirement that drove many of the design choices in Java 5 when
adding generics is *gradual migration compatibility*. This means it must
be possible to evolve a non-generic class to be generic in a manner that
is binary-compatible and source-compatible for both clients and
subtypes. So an existing client or subtype should continue to work
whether it is recompiled or not, and generified or not, and it should be
equally practical to generify subtypes and clients at the same time as a
library class, or later, or never.

We adopt similar goals for "anyfication" -- anyfying a class should be
binary- and source-compatible for clients and subclasses, and it should
be possible to anyfy a generic class without requiring that either its
clients or subclasses be anyfied, and whether or not they are
recompiled.

Basic Model, Ultra-summarized
-----------------------------

Just as the requirement of gradual migration compatibility was a
significant design constraint in Java 5 (and which pushed us strongly
towards erasure), this requirement is going to be a significant
constraint here.
Essentially, this means that parameterizations like ArrayList<String>
need to continue to be erased (otherwise they'd not be binary
compatible), but parameterizations like ArrayList<int> need to be
reified (since we cannot erase int to Object without boxing, and if
boxing were good enough, we wouldn't be bothering at all.) This pushes
us towards an interpretation of anyfied generics that is erased over
reference instantiations and reified over value instantiations.

At the same time, there is code that is valid under the assumption that
T <: Object that would not be valid if T can take on 'int'. These
include domain assumptions (e.g., assignment to null) and identity-based
assumptions (a T can be used as a monitor lock.) Accordingly, we cannot
simply reinterpret existing reference-generic code to be any-generic; we
need some indication from the user that they are opting into the
broadened genericity domain (and therefore willing to accept the
limitations of this broadened domain.) Our working model for this is to
annotate the declaration of a type variable with an "any" modifier
(e.g., "class Foo<any T>"). We sometimes use the abbreviation 'tvar' to
describe a type variable, and 'avar' to describe an any-type variable
(similarly, 'ivar' for an inference type variable.)

Conceptually, the enhanced generics model is simple:

 - Non-anyfied generic code means exactly the same thing in Valhalla as
   it does in Java 5-9;
 - A reference instantiation of an anyfied generic class means exactly
   the same thing in Valhalla as it does under erased generics;
 - A class or method tvar declaration can be annotated with 'any', in
   which case it can range over value types (including primitives) as
   well as reference types.
   There are some restrictions on what you can do to an avar-typed
   variable to reflect the fact that values may not be nullable, have
   no monitor locks, that they don't participate in a subtyping
   relationship with Object (instead, they have a boxing conversion to
   Object), and that a V[] is not a subtype of Object[];
 - There is a new wildcard type, Foo<any>, which is a supertype of both
   value and reference instantiations of Foo.

The enhanced model embraces erasure more explicitly than the current
model.

Migration Challenges
--------------------

Ideally, we could just sprinkle 'any' over our codebase, and be done --
and hopefully for many classes, this is all that will be required. But
for some classes, there may be migration challenges, which come in two
forms:

 - A method signature is fundamentally incompatible with anyfication
   (such as Map.get(), which returns null when the mapping is not
   present, which makes no sense if V=int);
 - The existing implementation appeals to assumptions of object-ness,
   and needs to be adjusted.

For the purposes of this memo, I'm going to restrict myself to the first
category; we'll come back to the second category later.

Here's a more-or-less complete list of methods in Collections that may
need some sort of migration help. A blank cell indicates "same as
above."

Class       Method                      Issues
----------  --------------------------  --------------------------------------
Collection  contains(Object)            Assumes all elements are castable to
                                        Object without boxing.
            remove(Object)
            removeAll(Collection<?>)    Should take Collection<any>.
                                        Collection<?> not a supertype of all
                                        instantiations.
            retainAll(Collection<?>)
            containsAll(Collection<?>)
            toArray()                   Returns Object[], which is not a
                                        supertype of V[] for any value type V.
            toArray(T[])                Not strictly problematic, but relies
                                        on runtime checking that E <: T and on
                                        reflection for implementation.
List        remove(int)                 Not strictly problematic, but if
                                        remove(Object) becomes remove(T), will
                                        be a confusing overload.
            indexOf(Object)             Same as contains(Object). Also, would
                                        be better as a lambda-consuming
                                        method, now that we have lambdas.
            lastIndexOf(Object)
Map         containsKey(Object)         Same as contains(Object)
            containsValue(Object)
            remove(Object)
            put(K,V)                    Returns what was there before, or null
                                        if there was no mapping. Not strictly
                                        problematic, but doesn't project so
                                        well to non-nullable Vs.
            putIfAbsent, replace,
            computeIfAbsent
            get(K)                      Uses null to signal no mapping
            getOrDefault(Object, V)     Same as contains(Object)
Queue       poll(), peek()              Uses null to signal no element
Deque       poll(), peek()
            pollFirst(), pollLast()
            peekFirst(), peekLast()
            removeFirstOccurrence(),    Same as remove(Object)
            removeLastOccurrence()

It has often been suggested that "maybe it is time for Collections 2.0";
some would like to declare this to be the end of the road for existing
collections. However, redoing collections is only part of the battle;
the Collection types are riddled throughout the JDK and other libraries,
so we would need a mechanism for migrating all the methods that
consume/dispense collections, and if we were to build new collections,
saddling them with compatibility with old collections would be a big
stone to hang around their neck. So I think this path is less appealing
than it might first appear. However, the techniques described here allow
us to fix some of the errors of the past.

Partial Methods
---------------

Our approach is guided by the following observation: not all methods
declared in a generic class Foo<T> need be members of all instantiations
of Foo<T>; it is reasonable for some members to be restricted to certain
instantiations. In particular, the instantiation Foo<ref> is an
important distinguished instantiation.

If a method m() in Foo is a member of all instantiations of Foo, we say
m() is a *total* method. Otherwise we call it a *restricted* or
*partial* method.
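
The List.remove(int) row in the issues table, which is the kind of
method the partial-method idea is meant to help with, can already be
observed in today's Java. The following is only a small sketch using
current (non-Valhalla) syntax; the class and helper names
(RemoveOverloadDemo, removeByIndex, removeByElement) are made up for
illustration, and only List.remove(int) and List.remove(Object) are
real API.

```java
import java.util.ArrayList;
import java.util.List;

class RemoveOverloadDemo {

    // Resolves to List.remove(int): removes whatever sits at position 'index'.
    static List<Integer> removeByIndex(List<Integer> source, int index) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(index);
        return copy;
    }

    // Resolves to List.remove(Object): removes the first element equal to 'value'.
    // The explicit boxing is what selects the Object overload.
    static List<Integer> removeByElement(List<Integer> source, int value) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(Integer.valueOf(value));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(10, 20, 30, 1);
        // Same list, same argument value 1, two different results.
        System.out.println(removeByIndex(numbers, 1));
        System.out.println(removeByElement(numbers, 1));
    }
}
```

Today the two overloads are told apart only by whether the argument is
an int or a boxed Integer. If remove(Object) were to become remove(T)
on a list of ints, that distinction would presumably disappear, which
is one reading of the "confusing overload" remark in the table.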
For each method in an existing generic class to be anyfied, it could be
made a total method in the new anyfied class, or it could be restricted
to reference instantiations -- and either approach is fully compatible
with existing clients and subclasses. This provides us a migration path
to "leave behind" certain problematic methods (making them members only
of reference instantiations), and also to add new methods as long as
there is a default implementation at least for reference instantiations.

As a simple illustration of this approach, let's say that we want to
effectively rename the method List.remove(int) to removeAt(int). Clearly
we cannot simply take away remove(int); existing clients would fail to
link. Similarly we cannot simply add a new method removeAt(int) without
a default; then existing subclasses would fail to compile. What we do
here is (using a strawman syntax) add a new total method, with a ref
default, and demote the existing method to a partial method on ref
instantiations:

    interface List<any T> {
        // A new, total method
        void removeAt(int i);

        // demote the old method to ref-only
        // Just a strawman syntax
        <where ref T>
        void remove(int i);

        // give the new method a default in terms of the old, for ref
        <where ref T>
        default void removeAt(int i) { remove(i); }
    }

Now:

 - Clients of List<ref> can invoke either remove(int) or removeAt(int),
   and both will do the same thing; this allows existing clients to (if
   they want) migrate away from the old method completely, since
   removeAt(int) is total, or keep using the old method, at their choice
   (IDEs can also provide migration help);
 - New clients of List<val> can only use removeAt(int) -- they won't
   even *see* remove(int) when they ask their IDE to auto-complete --
   and this is not a compatibility issue as there were no existing
   clients of List<val>;
 - Existing (ref) subclasses of List still see the same set of abstract
   methods, and therefore continue to work exactly as before;
 - When anyfying a subclass of List, now you have to provide a new
   implementation for
   removeAt(int), but this is OK because *you* have decided to anyfy
   your class, and it is reasonable to expect some possible code changes
   in this situation;
 - Further, AbstractList can insulate most subclasses from this change.

This gives us at least two tactics for dealing with problematic methods:

 - Migration: leave the old method in the "ref layer", and create a new
   total method with a ref-default in the "any layer", as with the
   removeAt example above;
 - Abandonment: leave the old method in the ref layer, and do nothing
   else, which may be suitable if an alternate idiom already exists
   (e.g., we could abandon removeAll because now we can express
   removeAll(c) as removeIf(c::contains), without adding any new
   method.)

With the controlled-migration option, the primary impediment is that
many of the good names are already taken.

These two tactics get us much of the way there, but not all the way
there. I'm going to close this note with where these tactics get us, and
then open a separate note for some of the options for the remaining
cases.

The following table shows some possibilities. There's plenty of time to
bikeshed on the names.

Class       Method                      Possible Approaches
----------  --------------------------  --------------------------------------
Collection  contains(Object)            Migrate to containsElement(E)
            remove(Object)              Migrate to removeElement(E)
            removeAll(Collection<?>)    Migrate to removeElements(Collection).
                                        Alternately, abandon in favor of
                                        existing removeIf(Predicate).
            retainAll(Collection<?>)    Same
            containsAll(Collection<?>)  Migrate to
                                        containsElements(Collection).
            toArray()                   Nothing good yet ...
            toArray(T[])                Leave as is, or abandon in favor of
                                        new method toArray(IntFunction), like
                                        Stream has.
List        remove(int)                 Leave as is, or migrate to
                                        removeAt(int).
            indexOf(Object)             Migrate to Optional-bearing
                                        findFirst(Predicate)
            lastIndexOf(Object)         Migrate to findLast(Predicate)
Map         containsKey(Object)         Migrate to hasKey(K)
            containsValue(Object)       Migrate to hasValue(V)
            remove(Object)              Migrate to removeMapping(K)
            put(K,V)                    Leave as is; accept that we cannot
                                        distinguish between "nothing was there
                                        before" and "default value was there
                                        before" (as is true with null today.)
            get(K)                      Migrate to one (or all) of:
                                          Optional<V> map(K)
                                          mapOrElse(K, V)
                                          tryMap(K, Consumer<V>)
            getOrDefault(Object, V)     Migrate to mapOrElse, as above
Queue       poll(), peek()              Migrate to tryPoll(Consumer) *or*
                                        Optional-bearing method
Deque       poll(), peek()
            pollFirst(), pollLast()
            peekFirst(), peekLast()
            removeFirstOccurrence(),    Migrate to predicate-accepting method,
            removeLastOccurrence()      or simply migrate to new name

Notes:

For Collection.contains() and remove(), while it may be tempting to try
and narrow the argument type from Object to T, this is likely
problematic. In the following case, we can't express something that
seems pretty natural:

    Collection<Animal> animals = ...
    Collection<Dog> dogs = ...
    Dog d = ...

    // Can't say these
    for (Animal a : animals)
        if (dogs.contains(a)) ...
    animals.removeIf(a -> dogs.contains(a))

It's actually pretty convenient for contains/remove to accept anything
that might be a member of the collection, but Object is not the top type
we're looking for here (since values require boxing to convert to
Object.) So neither the status quo (accept Object) nor recasting to a
T-accepting method is all that great. (But see next memo for an
alternative.)

It's not clear whether all the other Object-accepting methods need the
same treatment, or if migration to an E-accepting method is acceptable
there.

For removeAll/retainAll, we probably just want to abandon in favor of
the more powerful removeIf added in 8. (Kevin/Louis -- would be good to
get stats on actual usage of removeAll/retainAll.)
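
The two notes above (contains() accepting a supertype argument, and
removeAll/retainAll being expressible through removeIf) can be tried out
in today's Java. This is a minimal sketch under current, non-anyfied
generics; Animal, Dog, ContainsDemo and the helper method names are
hypothetical and exist only for this example, while Collection.contains
and Collection.removeIf are the real Java 8 API.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class ContainsDemo {

    static class Animal {
        final String name;
        Animal(String name) { this.name = name; }
    }

    static class Dog extends Animal {
        Dog(String name) { super(name); }
    }

    // Compiles today only because contains(Object) accepts any reference.
    // If the parameter were narrowed to the element type (Dog), passing an
    // Animal here would be rejected by the compiler.
    static int countKnownDogs(Collection<Animal> animals, Collection<Dog> dogs) {
        int count = 0;
        for (Animal a : animals) {
            if (dogs.contains(a)) {
                count++;
            }
        }
        return count;
    }

    // The "abandonment" idiom: removeAll(c) written as removeIf(c::contains).
    static <E> void removeAllViaRemoveIf(Collection<E> target, Collection<?> toRemove) {
        target.removeIf(toRemove::contains);
    }

    // retainAll(c) written the same way, with the predicate negated.
    static <E> void retainAllViaRemoveIf(Collection<E> target, Collection<?> toKeep) {
        target.removeIf(e -> !toKeep.contains(e));
    }

    public static void main(String[] args) {
        Dog rex = new Dog("Rex");
        Animal tom = new Animal("Tom");
        List<Animal> animals = new ArrayList<>(List.of(rex, tom));
        List<Dog> dogs = List.of(rex);

        System.out.println(countKnownDogs(animals, dogs));
        removeAllViaRemoveIf(animals, dogs);
        System.out.println(animals.size());
    }
}
```

Both helpers still lean on contains(Object), so they show why the idiom
works for reference instantiations today; they do not settle what the
argument type should be once values are allowed.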
For Map.get(), it really sucks that the good name is taken but the existing signature is terminally polluted with nullness (even for non-null-supporting maps.) We can migrate to multiple new methods (most of which have defaults in terms of the others): Optional map(K k) V mapOrElse(K k, V defaultV) boolean tryMap(K k, Consumer) With Optional as a value type, there is essentially no cost (either footprint or invocation overhead) for using Optional over a naked reference. So the first sig is not as problematic as it might look. The second has an obvious default in terms of the first. The Optional version, though, has an issue: if the map can contain null values, there's no way to express that. However, I think its OK if this method throws on null values -- the other three get-like methods (the two here plus legacy get()) can all express null values. (So null-lovers don't get to enjoy the Optional goodness. More motivation for them to give up on their nulls!) From brian.goetz at oracle.com Wed Dec 16 17:21:35 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 16 Dec 2015 12:21:35 -0500 Subject: Migrating methods in Collections Message-ID: <56719D9F.1060406@oracle.com> [ sending again with HTML enabled so the tables don't get mangled ] I'd like to start with the "What about Collections" discussion. While this topic is inextricably linked to all the other topics (value types, generic specialization, nullability, etc), to avoid getting overwhelmed, let's start by trying to keep our focus on the Collections API and the process of migrating it into an anyfied world. So, I'll offer an ahead-of-time warning: there's going to be lots here that you will want to question, and to many of those questions I'm going to answer "OK, but let's come back to that later", because I'd like to stay focused for the time being on the Collection-related issues. 
(Let's also try to create separate threads for separate discussion topics, though I know this is harder to remember to do.) Most of these are already prototyped in some form in the Valhalla repos (which at least demonstrates that they are plausible.) BackgroundMaterial ------------------- The last "State of the Specialization" (http://cr.openjdk.java.net/~briangoetz/valhalla/specialization.html, Dec 2014) outlines an early approach towards the specialization challenges and features. My talk at JVMLS 2015 (https://www.youtube.com/watch?v=uNgAFSUXuwc) outlines the progress that had been made between last December and August. The State of the Values (http://cr.openjdk.java.net/~jrose/values/values-0.html) outlines the basic model for value types. If you've not read/viewed these, that's a good place to start. Core Goals ---------- The key requirement that drove many of the design choices in Java 5 when adding generics is *gradual migration compatibility*. This means it must be possible to evolve a non-generic class to be generic in a manner that is binary-compatible and source-compatible for both clients and subtypes. So an existing client or subtype should continue to work whether it is recompiled or not, and generified or not, and it should be equally practical to generify subtypes and clients at the same time as a library class, or later, or never. We adopt similar goals for "anyfication" -- anyfying a class should be binary- and source-compatible for clients and subclasses, and it should be possible to anyfy a generic class without requiring that either its clients or subclasses be anyfied, and whether or not they are recompiled. Basic Model, Ultra-summarized ----------------------------- Just as the requirement of gradual migration compatibility was a significant design constraint in Java 5 (and which pushed us strongly towards erasure), this requirement is going to be a significant constraint here. 
Essentially, this means that parameterizations like ArrayList need to continue to be erased (otherwise they'd not be binary compatible), but parameterizations like ArrayList need to be reified (since we cannot erase int to Object without boxing, and if boxing were good enough, we wouldn't be bothering at all.) This pushes us towards an interpretation of anyfied generics that is erased over reference instantiations and reified over value instantiations. At the same time, there is code that is valid under the assumption that T <: Object that would not be valid if T can take on 'int'. These include domain assumptions (e.g., assignment to null) and identity-based assumptions (a T can be used as a monitor lock.) Accordingly, we cannot simply reinterpret existing reference-generic code to be any-generic; we need some indication from the user that they are opting into the broadened genericity domain (and therefore willing to accept the limitations of this broadened domain.) Our working model for this is to annotate the declaration of a type variable with an "any" modifier (e.g., "class Foo"). We sometimes use the abbreviation 'tvar' to describe a type variable, and 'avar' to describe an any-type variable (similarly, 'ivar' for an inference type variable.) Conceptually, the enhanced generics model is simple: - Non-anyfied generic code means exactly the same thing in Valhalla as it does in Java 5-9; - A reference instantiation of an anyfied generic class means exactly the same thing in Valhalla as it does under erased generics; - A class or method tvar declaration can be annotated with 'any', in which case it can range over value types (including primitives) as well as reference types. 
There are some restrictions on what you can do to an avar-typed variable to reflect the fact that values may not be nullable, have no monitor locks, that they don't participate in a subtyping relationship with Object (instead, they have a boxing conversion to Object), and that a V[] is not a subtype of Object[]; - There is a new wildcard type, Foo, which is a supertype of both value and reference instantiations of Foo. The enhanced model embraces erasure more explicitly than the current model. Migration Challenges -------------------- Ideally, we could just sprinkle 'any' over our codebase, and be done -- and hopefully for many classes, this is all that will be required. But for some classes, there may be migration challenges, which come in two forms: - A method signature is fundamentally incompatible with anyfication (such as Map.get(), which returns null when the mapping is not present, which makes no sense if V=int); - The existing implementation appeals to assumptions of object-ness, and needs to be adjusted. For the purposes of this memo, I'm going to restrict myself to the first category; we'll come back to the second category later. Here's a more-or-less complete list of methods in Collections that may need some sort of migration help. A blank cell indicates "same as above." *Class** * *Method** * *Issues** * Collection contains(Object) Assumes all Rs are castable to Object without boxing. remove(Object) removeAll(Collection) Should take Collection. Collection not a supertype of all instantiations. retainAll(Collection) containsAll(Collection) toArray() Returns Object[], which is not a supertype of V[] for any value type V. toArray(T[]) Not strictly problematic, but relies on runtime checking that E <: T and on reflection for implementation. List remove(int) Not strictly problematic, but if remove(Object) becomes remove(T), will be a confusing overload. indexOf(Object) Same as contains(Object). 
Also, would be better as a lambda-consuming method, now that we have lambdas. lastIndexOf(Object) Map containsKey(Object) Same as contains(Object) containsValue(Object) remove(Object) put(K,V) Returns what was there before, or null if there was no mapping. Not strictly problematic, but doesn't project so well to non-nullable Vs. putIfAbsent, replace, computeIfAbsent get(K) Uses null to signal no mapping getOrDefault(Object, V) Same as contains(Object) Queue poll(),peek() Uses null to signal no element Deque poll(), peek() pollFirst(), pollLast() peekFirst(), peekLast() removeFirstOccurrence(), removeLastOccurrence() Same as remove(Object) It has often been suggested that "maybe it is time for Collections 2.0"; some would like to declare this to be the end of the road for existing collections. However, redoing collections is only part of the battle; the Collection types are riddled throughout the JDK and other libraries, so we would need a mechanism for migrating all the methods that consume/dispense collections, and if we were to build new collections, saddling them with compatibility with old collections would be a big stone to hang around their neck. So I think this path is less appealing that it might first appear. However, the techniques described here allow us to fix some of the errors of the past. Partial Methods --------------- Our approach is guided by the following observation: not all methods declared in a generic class Foo need be members of all instantiations of Foo; it is reasonable for some members to be restricted to certain instantiations. In particular, the instantiation Foo is an important distinguished instantiation. If a method m() in Foo is a member of all instantiations of Foo, we say m() is a *total* method. Otherwise we call it a *restricted* or *partial* method. 
For each method in an existing generic class to be anyfied, it could be made a total method in the new anyfied class, or it could be restricted to reference instantiations -- and either approach is fully compatible with existing clients and subclasses. This provides us a migration path to "leave behind" certain problematic methods (making them members only of reference instantiations), and also to add new methods as long as there is a default implementation at least for reference instantiations. As a simple illustration of this approach, let's say that we want to effectively rename the method List.remove(int) to removeAt(int). Clearly we cannot simply take away remove(int); existing clients would fail to link. Similarly we cannot simply add a new method removeAt(int) without a default; then existing subclasses would fail to compile. What we do here is (using a strawman syntax) add a new total method, with a ref default, and demote the existing method to a partial method on ref instantiations: interface List { // A new, total method void removeAt(int i); // demote the old method to ref-only // Just a strawman syntax void remove(int i); // give the new method a default in terms of the old, for ref default void removeAt(int i) { remove(i); } } Now: - Clients of List can invoke either remove(int) or removeAt(int), and both will do the same thing; this allows existing clients to (if they want) migrate away from the old method completely, since removeAt(int) is total, or keep using the old method, at their choice (IDEs can also provide migration help); - New clients of List can only use removeAt(int) -- they won't even *see* remove(int) when they ask their IDE to auto-complete -- and this is not a compatibility issue as there were no existing clients of List; - Existing (ref) subclasses of List still see the same set of abstract methods, and therefore continue to work exactly as before; - When anyfying a subclass of List, now you have to provide a new implementation for 
removeAt(int), but this is OK because *you* have decided to anyfy your class, and it is reasonable to expect some possible code changes in this situation; - Further, AbstractList can insulate most subclasses from this change. This gives us at least two tactics for dealing with problematic methods: - Migration: leave the old method in the "ref layer", and create a new total method with a ref-default in the "any layer", as with the removeAt example above; - Abandonment: leave the old method in the ref layer, and do nothing else, which may be suitable if an alternate idiom already exists (e.g., we could abandon removeAll because now we can express removeAll(c) as removeIf(c::contains), without adding any new method.) With the controlled-migration option, the primary impediment is that many of the good names are already taken. These two tactics get us much of the way there, but not all the way there. I'm going to close this note with where these tactics get us, and then open a separate note for some of the options for the remaining cases. The following table shows some possibilities. There's plenty of time to bikeshed on the names. *Class** * *Method** * *Possible Approaches** * Collection contains(Object) Migrate to containsElement(E) remove(Object) Migrate to removeElement(E) removeAll(Collection) Migrate to removeElements(Collection). Alternately, abandon in favor of existing removeIf(Predicate). retainAll(Collection) Same containsAll(Collection) Migrate to containsElements(Collection). toArray() Nothing good yet ... toArray(T[]) Leave as is, or abandon in favor of new method toArray(IntFunction), like Stream has. List remove(int) Leave as is, or migrate to removeAt(int). 
indexOf(Object) Migrate to Optional-bearing findFirst(Predicate) lastIndexOf(Object) Migrate to findLast(Predicate) Map containsKey(Object) Migrate to hasKey(K) containsValue(Object) Migrate to hasValue(V) remove(Object) Migrate to removeMapping(K) put(K,V) Leave as is; accept that we cannot distinguish between "nothing was there before" and "default value was there before" (as is true with null today.) get(K) Migrate to one (or all) of: Optional map(K) mapOrElse(K, V) tryMap(K, Consumer) getOrDefault(Object, V) Migrate to mapOrElse, as above Queue poll(), peek() Migrate to tryPoll(Consumer) *or* optional-bearing method Deque poll(), peek() pollFirst(), pollLast() peekFirst(), peekLast() removeFirstOccurrence(), removeLastOccurrence() Migrate to predicate-accepting method, or simply migrate to new name Notes: For Collection.contains() and remove(), while it may be tempting to try and narrow the argument type from Object to T, this is likely problematic. In the following case, we can't express something that seems pretty natural: Collection animals = ... Collection dogs = ... Dog d = ... // Can't say these for (Animal a : animals) if (dogs.contains(a)) ... animals.removeIf(a -> dogs.contains(a)) Its actually pretty convenient for contains/remove to accept anything that might be a member of the collection, but Object is not the top type we're looking for here (since values require boxing to convert to Object.) So neither the status quo (accept Object) nor recasting to a T-accepting method is all that great. (But see next memo for an alternative.) Its not clear whether all the other Object-accepting methods need the same treatment, or if migration to an E-accepting method is acceptable there. For removeAll/retainAll, we probably just want to abandon in favor of the more powerful removeIf added in 8. (Kevin/Louis -- would be good to get stats on actual usage of removeAll/retainAll.) 
For Map.get(), it really sucks that the good name is taken but the existing signature is terminally polluted with nullness (even for non-null-supporting maps.) We can migrate to multiple new methods (most of which have defaults in terms of the others): Optional map(K k) V mapOrElse(K k, V defaultV) boolean tryMap(K k, Consumer) With Optional as a value type, there is essentially no cost (either footprint or invocation overhead) for using Optional over a naked reference. So the first sig is not as problematic as it might look. The second has an obvious default in terms of the first. The Optional version, though, has an issue: if the map can contain null values, there's no way to express that. However, I think its OK if this method throws on null values -- the other three get-like methods (the two here plus legacy get()) can all express null values. (So null-lovers don't get to enjoy the Optional goodness. More motivation for them to give up on their nulls!) From brian.goetz at oracle.com Wed Dec 16 17:23:43 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 16 Dec 2015 12:23:43 -0500 Subject: Migrating methods in Collections Message-ID: <56719E1F.2050504@oracle.com> [ sending again with HTML enabled so the tables don't get mangled ... and again ] I'd like to start with the "What about Collections" discussion. While this topic is inextricably linked to all the other topics (value types, generic specialization, nullability, etc), to avoid getting overwhelmed, let's start by trying to keep our focus on the Collections API and the process of migrating it into an anyfied world. So, I'll offer an ahead-of-time warning: there's going to be lots here that you will want to question, and to many of those questions I'm going to answer "OK, but let's come back to that later", because I'd like to stay focused for the time being on the Collection-related issues. 
(Let's also try to create separate threads for separate discussion topics, though I know this is harder to remember to do.)

Most of these are already prototyped in some form in the Valhalla repos (which at least demonstrates that they are plausible.)

Background Material
-------------------

The last "State of the Specialization" (http://cr.openjdk.java.net/~briangoetz/valhalla/specialization.html, Dec 2014) outlines an early approach towards the specialization challenges and features. My talk at JVMLS 2015 (https://www.youtube.com/watch?v=uNgAFSUXuwc) outlines the progress that had been made between last December and August. The State of the Values (http://cr.openjdk.java.net/~jrose/values/values-0.html) outlines the basic model for value types. If you've not read/viewed these, that's a good place to start.

Core Goals
----------

The key requirement that drove many of the design choices in Java 5 when adding generics is *gradual migration compatibility*. This means it must be possible to evolve a non-generic class to be generic in a manner that is binary-compatible and source-compatible for both clients and subtypes. So an existing client or subtype should continue to work whether it is recompiled or not, and generified or not, and it should be equally practical to generify subtypes and clients at the same time as a library class, or later, or never.

We adopt similar goals for "anyfication" -- anyfying a class should be binary- and source-compatible for clients and subclasses, and it should be possible to anyfy a generic class without requiring that either its clients or subclasses be anyfied, and whether or not they are recompiled.

Basic Model, Ultra-summarized
-----------------------------

Just as the requirement of gradual migration compatibility was a significant design constraint in Java 5 (and which pushed us strongly towards erasure), this requirement is going to be a significant constraint here.
Essentially, this means that parameterizations like ArrayList<String> need to continue to be erased (otherwise they'd not be binary compatible), but parameterizations like ArrayList<int> need to be reified (since we cannot erase int to Object without boxing, and if boxing were good enough, we wouldn't be bothering at all.) This pushes us towards an interpretation of anyfied generics that is erased over reference instantiations and reified over value instantiations.

At the same time, there is code that is valid under the assumption that T <: Object that would not be valid if T can take on 'int'. These include domain assumptions (e.g., assignment to null) and identity-based assumptions (a T can be used as a monitor lock.) Accordingly, we cannot simply reinterpret existing reference-generic code to be any-generic; we need some indication from the user that they are opting into the broadened genericity domain (and therefore willing to accept the limitations of this broadened domain.)

Our working model for this is to annotate the declaration of a type variable with an "any" modifier (e.g., "class Foo<any T>"). We sometimes use the abbreviation 'tvar' to describe a type variable, and 'avar' to describe an any-type variable (similarly, 'ivar' for an inference type variable.)

Conceptually, the enhanced generics model is simple:

 - Non-anyfied generic code means exactly the same thing in Valhalla as it does in Java 5-9;
 - A reference instantiation of an anyfied generic class means exactly the same thing in Valhalla as it does under erased generics;
 - A class or method tvar declaration can be annotated with 'any', in which case it can range over value types (including primitives) as well as reference types.
   There are some restrictions on what you can do to an avar-typed variable to reflect the fact that values may not be nullable, have no monitor locks, that they don't participate in a subtyping relationship with Object (instead, they have a boxing conversion to Object), and that a V[] is not a subtype of Object[];
 - There is a new wildcard type, Foo, which is a supertype of both value and reference instantiations of Foo.

The enhanced model embraces erasure more explicitly than the current model.

Migration Challenges
--------------------

Ideally, we could just sprinkle 'any' over our codebase, and be done -- and hopefully for many classes, this is all that will be required. But for some classes, there may be migration challenges, which come in two forms:

 - A method signature is fundamentally incompatible with anyfication (such as Map.get(), which returns null when the mapping is not present, which makes no sense if V=int);
 - The existing implementation appeals to assumptions of object-ness, and needs to be adjusted.

For the purposes of this memo, I'm going to restrict myself to the first category; we'll come back to the second category later.

Here's a more-or-less complete list of methods in Collections that may need some sort of migration help. A blank cell indicates "same as above."

Class       Method                    Issues
Collection  contains(Object)          Assumes all Rs are castable to Object without boxing.
            remove(Object)
            removeAll(Collection)     Should take Collection. Collection not a supertype of all instantiations.
            retainAll(Collection)
            containsAll(Collection)
            toArray()                 Returns Object[], which is not a supertype of V[] for any value type V.
            toArray(T[])              Not strictly problematic, but relies on runtime checking that E <: T and on reflection for implementation.
List        remove(int)               Not strictly problematic, but if remove(Object) becomes remove(T), will be a confusing overload.
            indexOf(Object)           Same as contains(Object).
                                      Also, would be better as a lambda-consuming method, now that we have lambdas.
            lastIndexOf(Object)
Map         containsKey(Object)       Same as contains(Object)
            containsValue(Object)
            remove(Object)
            put(K,V)                  Returns what was there before, or null if there was no mapping. Not strictly problematic, but doesn't project so well to non-nullable Vs.
            putIfAbsent, replace,
            computeIfAbsent
            get(K)                    Uses null to signal no mapping
            getOrDefault(Object, V)   Same as contains(Object)
Queue       poll(), peek()            Uses null to signal no element
Deque       poll(), peek()
            pollFirst(), pollLast()
            peekFirst(), peekLast()
            removeFirstOccurrence(),
            removeLastOccurrence()    Same as remove(Object)

It has often been suggested that "maybe it is time for Collections 2.0"; some would like to declare this to be the end of the road for existing collections. However, redoing collections is only part of the battle; the Collection types are riddled throughout the JDK and other libraries, so we would need a mechanism for migrating all the methods that consume/dispense collections, and if we were to build new collections, saddling them with compatibility with old collections would be a big stone to hang around their neck. So I think this path is less appealing than it might first appear. However, the techniques described here allow us to fix some of the errors of the past.

Partial Methods
---------------

Our approach is guided by the following observation: not all methods declared in a generic class Foo need be members of all instantiations of Foo; it is reasonable for some members to be restricted to certain instantiations. In particular, the instantiation Foo is an important distinguished instantiation.

If a method m() in Foo is a member of all instantiations of Foo, we say m() is a *total* method. Otherwise we call it a *restricted* or *partial* method.
For each method in an existing generic class to be anyfied, it could be made a total method in the new anyfied class, or it could be restricted to reference instantiations -- and either approach is fully compatible with existing clients and subclasses. This provides us a migration path to "leave behind" certain problematic methods (making them members only of reference instantiations), and also to add new methods as long as there is a default implementation at least for reference instantiations.

As a simple illustration of this approach, let's say that we want to effectively rename the method List.remove(int) to removeAt(int). Clearly we cannot simply take away remove(int); existing clients would fail to link. Similarly we cannot simply add a new method removeAt(int) without a default; then existing subclasses would fail to compile. What we do here is (using a strawman syntax) add a new total method, with a ref default, and demote the existing method to a partial method on ref instantiations:

    interface List {
        // A new, total method
        void removeAt(int i);

        // demote the old method to ref-only
        // Just a strawman syntax
        void remove(int i);

        // give the new method a default in terms of the old, for ref
        default void removeAt(int i) { remove(i); }
    }

Now:

 - Clients of List can invoke either remove(int) or removeAt(int), and both will do the same thing; this allows existing clients to (if they want) migrate away from the old method completely, since removeAt(int) is total, or keep using the old method, at their choice (IDEs can also provide migration help);
 - New clients of List can only use removeAt(int) -- they won't even *see* remove(int) when they ask their IDE to auto-complete -- and this is not a compatibility issue as there were no existing clients of List;
 - Existing (ref) subclasses of List still see the same set of abstract methods, and therefore continue to work exactly as before;
 - When anyfying a subclass of List, now you have to provide a new implementation for
   removeAt(int), but this is OK because *you* have decided to anyfy your class, and it is reasonable to expect some possible code changes in this situation;
 - Further, AbstractList can insulate most subclasses from this change.

This gives us at least two tactics for dealing with problematic methods:

 - Migration: leave the old method in the "ref layer", and create a new total method with a ref-default in the "any layer", as with the removeAt example above;
 - Abandonment: leave the old method in the ref layer, and do nothing else, which may be suitable if an alternate idiom already exists (e.g., we could abandon removeAll because now we can express removeAll(c) as removeIf(c::contains), without adding any new method.)

With the controlled-migration option, the primary impediment is that many of the good names are already taken.

These two tactics get us much of the way there, but not all the way there. I'm going to close this note with where these tactics get us, and then open a separate note for some of the options for the remaining cases.

The following table shows some possibilities. There's plenty of time to bikeshed on the names.

Class       Method                    Possible Approaches
Collection  contains(Object)          Migrate to containsElement(E)
            remove(Object)            Migrate to removeElement(E)
            removeAll(Collection)     Migrate to removeElements(Collection). Alternately, abandon in favor of existing removeIf(Predicate).
            retainAll(Collection)     Same
            containsAll(Collection)   Migrate to containsElements(Collection).
            toArray()                 Nothing good yet ...
            toArray(T[])              Leave as is, or abandon in favor of new method toArray(IntFunction), like Stream has.
List        remove(int)               Leave as is, or migrate to removeAt(int).
            indexOf(Object)           Migrate to Optional-bearing findFirst(Predicate)
            lastIndexOf(Object)       Migrate to findLast(Predicate)
Map         containsKey(Object)       Migrate to hasKey(K)
            containsValue(Object)     Migrate to hasValue(V)
            remove(Object)            Migrate to removeMapping(K)
            put(K,V)                  Leave as is; accept that we cannot distinguish between "nothing was there before" and "default value was there before" (as is true with null today.)
            get(K)                    Migrate to one (or all) of: Optional map(K), mapOrElse(K, V), tryMap(K, Consumer)
            getOrDefault(Object, V)   Migrate to mapOrElse, as above
Queue       poll(), peek()            Migrate to tryPoll(Consumer) *or* optional-bearing method
Deque       poll(), peek()
            pollFirst(), pollLast()
            peekFirst(), peekLast()
            removeFirstOccurrence(),
            removeLastOccurrence()    Migrate to predicate-accepting method, or simply migrate to new name

Notes:

For Collection.contains() and remove(), while it may be tempting to try and narrow the argument type from Object to T, this is likely problematic. In the following case, we can't express something that seems pretty natural:

    Collection<Animal> animals = ...
    Collection<Dog> dogs = ...
    Dog d = ...

    // Can't say these
    for (Animal a : animals)
        if (dogs.contains(a)) ...
    animals.removeIf(a -> dogs.contains(a))

It's actually pretty convenient for contains/remove to accept anything that might be a member of the collection, but Object is not the top type we're looking for here (since values require boxing to convert to Object.) So neither the status quo (accept Object) nor recasting to a T-accepting method is all that great. (But see next memo for an alternative.) It's not clear whether all the other Object-accepting methods need the same treatment, or if migration to an E-accepting method is acceptable there.

For removeAll/retainAll, we probably just want to abandon in favor of the more powerful removeIf added in 8. (Kevin/Louis -- would be good to get stats on actual usage of removeAll/retainAll.)

For Map.get(), it really sucks that the good name is taken but the existing signature is terminally polluted with nullness (even for non-null-supporting maps.) We can migrate to multiple new methods (most of which have defaults in terms of the others):

    Optional map(K k)
    V mapOrElse(K k, V defaultV)
    boolean tryMap(K k, Consumer)

With Optional as a value type, there is essentially no cost (either footprint or invocation overhead) for using Optional over a naked reference. So the first sig is not as problematic as it might look. The second has an obvious default in terms of the first.

The Optional version, though, has an issue: if the map can contain null values, there's no way to express that. However, I think it's OK if this method throws on null values -- the other three get-like methods (the two here plus legacy get()) can all express null values. (So null-lovers don't get to enjoy the Optional goodness. More motivation for them to give up on their nulls!)

From brian.goetz at oracle.com Wed Dec 16 19:43:04 2015
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 16 Dec 2015 14:43:04 -0500
Subject: Migrating methods in Collections
In-Reply-To: <56719E1F.2050504@oracle.com>
References: <56719E1F.2050504@oracle.com>
Message-ID: <5671BEC8.8010508@oracle.com>

The previous memo outlined a tactic for effectively "migrating" a method in a current generic class to a related but different signature in an any-generic class, while retaining source and binary compatibility with existing clients and subclasses, and outlined the list of possibly problematic methods in the Collections framework.

The effectiveness of this tactic on this set of methods ranges from "slam-dunk" to "seems like it could work" to "doesn't help." It would be nice to have one hammer that pounds down all the nails, but I'm not sure there is one.
This memo outlines a range of complementary tactics that might, in combination with the above approach, enable us to cover the waterfront.

I'll start with the method Collection.contains(Object). It might at first seem that we want the signature here to be contains(E), but this gets in the way of cases like:

    dogs.contains(animal)

or of converting dogs::contains to a Predicate<Animal> (which is what filter/removeIf would want.) Note that this has little to do directly with value types; value types are invariant. But if we want a single contains() method that ranges over any E, it needs to accommodate variance for reference instantiations while not falling back on Object as a top type.

One technique for doing so would be to introduce contravariant inference variables. Then we could write contains as:

    <U super E> boolean contains(U u)

This has three main downsides:

 - Even though this works, it's still not that obvious.
 - It's a *lot* of work in the spec and compiler; it pushes on all the fragile bits.
 - If that weren't enough, it is a theoretical minefield. Papers like http://www.cis.upenn.edu/~bcpierce/papers/variance.pdf show that adding contravariance to certain type systems results in subtyping becoming undecidable.

On the other hand, I'm sure many library writers would jump for joy to have this in the toolbox; the lack of contravariant tvars seems a notable inconsistency in the language. (But let's not kid ourselves about the costs.)

It just so happens that this construct works out; it is binary- and source-compatible to make a non-generic method generic, as long as the erasure of the signature remains the same. So changing contains() or remove() as above would not cause subclasses or clients to fail to either link or recompile (in the subclass recompilation case, it would be reinterpreted as a raw override, which is allowed and compatible.) And we'd end up with a total method that does the right thing both for refs and values.
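
For readers who want to see the variance problem in today's Java, here is a minimal sketch. Animal, Dog, NarrowBag and removeDogs are hypothetical names made up for illustration; NarrowBag stands in for an E-accepting signature and is not a real or proposed Collections type. The sketch only shows that the Object-accepting contains() adapts to a Predicate over a supertype, while an E-accepting method does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;

class ContainsVarianceDemo {
    static class Animal {
        final String name;
        Animal(String name) { this.name = name; }
    }

    static class Dog extends Animal {
        Dog(String name) { super(name); }
    }

    // Hypothetical E-accepting variant, for illustration only.
    interface NarrowBag<E> {
        boolean containsElement(E e);
    }

    // Removes from 'animals' every element that is also in 'dogs'
    // and returns how many elements were removed.
    static int removeDogs(Collection<Animal> animals, Collection<Dog> dogs) {
        int before = animals.size();
        // Compiles today because contains takes Object, so the method
        // reference adapts to a Predicate over the supertype Animal.
        Predicate<Animal> isKnownDog = dogs::contains;
        animals.removeIf(isKnownDog);
        return before - animals.size();
    }

    public static void main(String[] args) {
        Dog rex = new Dog("Rex");
        Animal tom = new Animal("Tom");
        List<Animal> animals = new ArrayList<>(Arrays.asList(rex, tom));
        List<Dog> dogs = new ArrayList<>(Arrays.asList(rex));

        System.out.println(removeDogs(animals, dogs));
        System.out.println(animals.get(0).name);

        // An E-accepting view of the same collection can only be asked about Dogs.
        NarrowBag<Dog> narrow = dogs::contains;
        System.out.println(narrow.containsElement(rex));
        // The next line would not compile, since an Animal is not a Dog:
        // Predicate<Animal> p = narrow::containsElement;
    }
}
```

The commented-out line is the case the memo is worried about: with a contains(E) signature, dogs::contains could no longer be handed to removeIf on a collection of Animal.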

Ignoring the costs and risks, this technique applies to a number of the methods in our rogues' gallery, including toArray(), for which we didn't yet have a solution:

    <U super E> U[] toArray()

This is compatible with existing clients who are expecting an Object[] to come back from toArray() on a collection of reference types, and collapses to V[] for any value type, so Collection<int>.toArray() returns int[]. (We might still want an unchecked warning if the compiler infers U != Object for reference E, but that's a separate and easily handled consideration.)

Let's call this technique "superation" (yes, it's an intentional (disgusting) pun. See http://beta.merriam-webster.com/dictionary/suppurate. And think about that the next time you pass a "Super 8" motel on the highway.)

With this in our toolbox, the strategy matrix becomes:

Class       Method                    Possible Approaches
Collection  contains(Object)          Superate to contains(U)
            remove(Object)
            removeAll(Collection)     Abandon in favor of existing removeIf(Predicate).
            retainAll(Collection)
            containsAll(Collection)   Migrate to containsElements(Collection), or abandon.
            toArray()                 Superate to U[] toArray()
            toArray(T[])              Leave as is, superate, or abandon.
List        remove(int)               Migrate to removeAt(int).
            indexOf(Object)           Migrate to Optional-bearing findFirst(Predicate)
            lastIndexOf(Object)
Map         containsKey(Object)       Superate
            containsValue(Object)     Superate
            remove(Object)            Superate
            put(K,V)                  Leave as is
            get(K)                    Migrate to one (or all) of: Optional map(K), mapOrElse(K, V), tryMap(K, Consumer)
            getOrDefault(Object, V)   Migrate to mapOrElse, or superate
Queue       poll(), peek()            Migrate to tryPoll(Consumer) *or* optional-bearing method
Deque       poll(), peek()
            pollFirst(), pollLast()
            peekFirst(), peekLast()
            removeFirstOccurrence(),
            removeLastOccurrence()    Migrate to predicate-accepting method, or superate

This is a more satisfying matrix; not only does everything have an acceptable strategy, but some have more than one, and the user impact of superating a method is lower (users might just not notice), so the perception is that fewer methods are affected.

Still, super-bounded tvars are a big hammer for such a small foe. Maybe there's an alternate approach that has the effect of superation but doesn't need such a big hammer.

Here's one possibility. We already have a notion of partial methods. We could have a pair of methods

    Object[] toArray()
    E[] toArray()

both of which are reasonable signatures for their restricted domains. Unfortunately, the natural interpretation of this pair of methods is that the first is a member of the reference instantiations of Collection, and the second is a member of the value instantiations, but there is *no* toArray() method that is a member of every instantiation of Collection! This means that code that is generic in any-T would not see a toArray() method at all. That's a problem (though not as enormous as it initially sounds, there are possible mitigating techniques.)

However, it is not unreasonable for the compiler to recognize this situation and deal. Suppose I have some code generic in any-T:

    <any T> void foo(Collection<T> c) {
        T[] arr = c.toArray();
    }

Now, the compiler doesn't know whether c is a collection of refs or values, but it knows it's one or the other (ref T and val T form a partition of any T).
So it could (and in some cases, has to anyway) do type checking by parts -- it can typecheck the above assignment under the assumption of ref T, and do it again under the assumption of val T, and if both succeed (and something else, see below), accept the method invocation as valid. (In this case, the ref-T fork should result in an unchecked warning, meaning that the merged checking also yields an unchecked warning.)

The "something else" part is: when doing overload resolution by parts, both branches must resolve to overloads that are erasure-equivalent to each other. Which is true for toArray() (and for all the cases for which superation would work.)

Now, this is a lot of handwaving, and it doesn't even really describe how we think partial methods should actually work (I'd like to get rid of the where-val-T slices entirely, this is a separate discussion.) But it's a sketch of an option that achieves the positive result of superation without engaging the complexity of superation.

From dl at cs.oswego.edu Fri Dec 18 15:50:07 2015
From: dl at cs.oswego.edu (Doug Lea)
Date: Fri, 18 Dec 2015 10:50:07 -0500
Subject: Migrating methods in Collections
In-Reply-To: <56719E1F.2050504@oracle.com>
References: <56719E1F.2050504@oracle.com>
Message-ID: <56742B2F.3000809@cs.oswego.edu>

Before commenting on details, a few preliminaries.

1. It seems unresolved in the current State of Values doc whether value types will have user-definable equals() methods. I think that this needs to be settled soon:

If value types don't allow overriding equals, and if the implementation is "has same type and bits", then some of the problems you note almost disappear. For example c.contains(x) could be automatically translated into "false" if c is a collection of a different val type than x, or x is a ref type or null. Which also happens to catch all the type-problematic cases.
I'm not sure how a compiler would know to do this though. (The complementary case where c is Collection (non-any), and x is a possibly-boxable value generalizes how Collection currently acts, which doesn't seem to need any change.) 2. It seems irresponsible to spend so much effort on Collections without also somehow addressing 32bit size/index limitations. Yes? 3. Similarly for value-like (aka fluent-immutable, aka persistent) collection methods, possibly in sub or super interfaces, or just extension methods in Collection. As in: Collection adding(T x); Collection removing(T x); (In other words, if collections support values, users will also expect value-like collections/APIs.) Existing collections might just clone-then-mutate, but others (like HAMTs that we don't support in part for lack of API) would do something cheaper. (Default implementations seem possible, but only via messy reflection.) Sorry that (2) and (3) are almost out of scope of this discussion, but merely "almost" -- they seem to interact at least a little. -Doug From brian.goetz at oracle.com Fri Dec 18 16:55:12 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 18 Dec 2015 11:55:12 -0500 Subject: 64 bit collections, and API migration in general (was Re: Migrating methods in Collections) In-Reply-To: <56742B2F.3000809@cs.oswego.edu> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> Message-ID: <56743A70.1000401@oracle.com> > 2. It seems irresponsible to spend so much effort on > Collections without also somehow addressing 32bit size/index > limitations. Yes? I think that's really a separate question. While everything said so far is Collection-specific, it's not really "so much effort on Collections" as much as "so much effort to ensure that legacy libraries can be compatibly anyfied", and that Collections is the poster child for that effort. 
(If we can't migrate Collections, that's evidence that we're still lacking in linguistic tools for supporting the transition to anyfied generics.) So I'll interpret your question as: "These are nice migration tools for migrating erased libraries to anyfied, but there are other migrations we'd like to perform on these aging libraries, please don't forget about them?" The migration in question is whether we can compatibly migrate methods like: get(int index) to get(long index) For API migration, there is a 2x2 compatibility matrix: {source,binary} x {client,subclass}. The hard quadrant of this is almost always "binary compatibility for subclasses"; the others can usually be handled by some combination of bridge methods, defaults, and compiler fu. Essentially, the nasty case comes about when you have some combination of A extends B extends C where some of these have not been recompiled, and someone ends up overriding a bridge instead of the real method, and you can end up invoking the wrong method. I'll provide more details soon, but let's come back to this under the more general topic of signature migration -- which we're going to need in order for Optional and friends to become values anyway. OK? From brian.goetz at oracle.com Fri Dec 18 16:55:18 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 18 Dec 2015 11:55:18 -0500 Subject: Equality (was: Re: Migrating methods in Collections) In-Reply-To: <56742B2F.3000809@cs.oswego.edu> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> Message-ID: <56743A76.6000007@oracle.com> > 1. It seems unresolved in the current State of Values doc > whether value types will have user-definable equals() > methods. I think that this needs to be settled soon: > > If value types don't allow overriding equals, and if the implementation > is "has same type and bits", then some of the problems you note > almost disappear. 
For example c.contains(x) could be automatically > translated into "false" if c is a collection of a different val type > than x, or x is a ref type or null. Which also happens to catch all > the type-problematic cases. I'm not sure how a compiler would know > to do this though. Here's the current thinking on the tools for equality: - The bytecode set will provide sort of 'vcmpeq' instruction, whose behavior is a componentwise recursive comparison (int fields with icmp, value fields with vcmp, etc). - The == operator in the language will correspond to vcmpeq - The default (whether provided by javac or VM) implementation of equals(V) for value types will do an == comparison - Users can override equals(V) The motivation for allowing overriding equals is the same as for objects. Obvious examples include Decimal(1.0) and Decimal(1.00), and Tuple[String,String] that both contain [ foo, bar ] but use different String instances to do so. On the signature of equality, equals() has potentially the same issue as contains(), where you might want to accept a broader set of comparands. Still figuring out the options there. From dl at cs.oswego.edu Fri Dec 18 17:41:24 2015 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 18 Dec 2015 12:41:24 -0500 Subject: Equality In-Reply-To: <56743A76.6000007@oracle.com> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> Message-ID: <56744544.1050603@cs.oswego.edu> On 12/18/2015 11:55 AM, Brian Goetz wrote: > Here's the current thinking on the tools for equality: > > - The bytecode set will provide sort of 'vcmpeq' instruction, whose behavior > is a componentwise recursive comparison (int fields with icmp, value fields with > vcmp, etc). 
> - The == operator in the language will correspond to vcmpeq
> - The default (whether provided by javac or VM) implementation of equals(V)
> for value types will do an == comparison
> - Users can override equals(V)
>
> The motivation for allowing overriding equals is the same as for objects.
> Obvious examples include Decimal(1.0) and Decimal(1.00), and
> Tuple[String,String] that both contain [ foo, bar ] but use different String
> instances to do so.
>
> On the signature of equality, equals() has potentially the same issue as
> contains(), where you might want to accept a broader set of comparands. Still
> figuring out the options there.

Limiting value type V to only override equals(V x) seems to have the same simplifying impact on Collection.contains and others. Yes?

-Doug

From brian.goetz at oracle.com Fri Dec 18 17:48:35 2015
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 18 Dec 2015 12:48:35 -0500
Subject: Equality
In-Reply-To: <56744544.1050603@cs.oswego.edu>
References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu>
Message-ID: <567446F3.7000904@oracle.com>

> Limiting value type V to only override equals(V x) seems to have
> the same simplifying impact on Collection.contains and others. Yes?

In our attempt on Collections, we found that contains(V), while prettier, might not be enough. In particular, it wasn't enough for providing the skeletal implementation of removeAll(Collection) in AbstractCollection, which looks like:

    Iterator it = iterator();
    while (it.hasNext()) {
        if (c.contains(it.next())) {
            it.remove();
            modified = true;
        }
    }

Here, c is a Collection<? extends E>, so its contains method would be contains(capture(? extends E)), and it.next() returns an E, so the compiler doesn't like it.

If I found this idiom in the first few minutes of trying to port collections, I'm guessing it will occur elsewhere too.
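
As a point of comparison, the skeletal loop can be written as a standalone helper in today's Java. This is only a sketch: the class and method names (RemoveAllSketch, removeAllSkeletal, removeAllViaRemoveIf) are hypothetical, and 'self' stands in for 'this'. It type-checks only because contains() currently takes Object, which is exactly the property an E-accepting contains would lose.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

class RemoveAllSketch {
    // Standalone rendering of the skeletal loop; 'self' plays the role of 'this'.
    static <E> boolean removeAllSkeletal(Collection<E> self, Collection<?> c) {
        boolean modified = false;
        Iterator<E> it = self.iterator();
        while (it.hasNext()) {
            // it.next() is an E; c.contains accepts it only because
            // the parameter type of contains is Object.
            if (c.contains(it.next())) {
                it.remove();
                modified = true;
            }
        }
        return modified;
    }

    // The same effect expressed with removeIf, as in removeIf(c::contains).
    static <E> boolean removeAllViaRemoveIf(Collection<E> self, Collection<?> c) {
        return self.removeIf(c::contains);
    }

    public static void main(String[] args) {
        List<Integer> a = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        List<Integer> b = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        Collection<?> toRemove = Arrays.asList(2, 4, "not an Integer");

        System.out.println(removeAllSkeletal(a, toRemove));
        System.out.println(removeAllViaRemoveIf(b, toRemove));
        System.out.println(a.equals(b));
    }
}
```

Both helpers accept a collection of unrelated element type, which is the convenience the thread is trying to keep while dropping Object as the top type.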
So perhaps what this says is we are going to get pushed in the other direction -- that we'll want to superate equals(). -------------- next part -------------- An HTML attachment was scrubbed... URL: From dl at cs.oswego.edu Fri Dec 18 20:21:54 2015 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 18 Dec 2015 15:21:54 -0500 Subject: Equality In-Reply-To: <567446F3.7000904@oracle.com> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> <567446F3.7000904@oracle.com> Message-ID: <56746AE2.9070403@cs.oswego.edu> On 12/18/2015 12:48 PM, Brian Goetz wrote: > So perhaps what this says is we are going to get pushed in the other direction > -- that we'll want to superate equals(). > What if plain-equals and value-equals are independently overridable, with defaults: boolean equals(any x) { return (x instanceof ThisClass) && equalValue(x); } boolean equalValue(ThisClass x) { ... check bit equality ... } (The equalValue method could be named "equals" too, but doing so is too confusing for now.) Where operator== calls equalValue, not plain equals. The effects cascade to many of the Collection methods. -Doug From brian.goetz at oracle.com Fri Dec 18 21:05:49 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 18 Dec 2015 16:05:49 -0500 Subject: Equality In-Reply-To: <56746AE2.9070403@cs.oswego.edu> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu> Message-ID: <5674752D.8050800@oracle.com> Good thought. 
Let me give it a nudge in a direction: we'd like to get away from approaches that amount to partitioning on ref/val, and instead steer towards "here's the universal version, and here's the special-case version for refs that 'overrides' the universal version", where overriding could include things like perturbing the signature (as covariant overrides do.) Recasting in this light: boolean equals(Something o) { ... } // total method boolean equals(Object o) { ... } // ref-override On 12/18/2015 3:21 PM, Doug Lea wrote: > On 12/18/2015 12:48 PM, Brian Goetz wrote: > >> So perhaps what this says is we are going to get pushed in the other >> direction >> -- that we'll want to superate equals(). >> > > What if plain-equals and value-equals are independently overridable, > with defaults: > boolean equals(any x) { return (x instanceof ThisClass) && > equalValue(x); } > boolean equalValue(ThisClass x) { ... check bit equality ... } > > (The equalValue method could be named "equals" too, but doing so > is too confusing for now.) > Where operator== calls equalValue, not plain equals. > > The effects cascade to many of the Collection methods. > > -Doug > > From john.r.rose at oracle.com Sat Dec 19 10:13:07 2015 From: john.r.rose at oracle.com (John Rose) Date: Sat, 19 Dec 2015 02:13:07 -0800 Subject: Equality (was: Re: Migrating methods in Collections) In-Reply-To: <56743A76.6000007@oracle.com> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> Message-ID: <314B3C96-C984-4512-A155-DFF3AB4C5F67@oracle.com> On Dec 18, 2015, at 8:55 AM, Brian Goetz wrote: > >> 1. It seems unresolved in the current State of Values doc >> whether value types will have user-definable equals() >> methods. I think that this needs to be settled soon: >> >> If value types don't allow overriding equals, and if the implementation >> is "has same type and bits", then some of the problems you note >> almost disappear. 
For example c.contains(x) could be automatically >> translated into "false" if c is a collection of a different val type >> than x, or x is a ref type or null. Which also happens to catch all >> the type-problematic cases. I'm not sure how a compiler would know >> to do this though. > > Here's the current thinking on the tools for equality: > > - The bytecode set will provide sort of 'vcmpeq' instruction, whose behavior is a componentwise recursive comparison (int fields with icmp, value fields with vcmp, etc). > - The == operator in the language will correspond to vcmpeq > - The default (whether provided by javac or VM) implementation of equals(V) for value types will do an == comparison > - Users can override equals(V) "Codes like a class" => You can override equals. There's not really an open question here. Forbidding overrides to equals would be crippling. "Works like an int" => operator== might be logically adjusted to the value type semantics. (Reminder "Codes like a class, works like an int" is the slogan which best captures in a few words what we are trying to do with value types.) With that as context, I think we have two plausible, logically consistent options: Option 1. (POR, as Brian points out) operator== is hardwired to bitwise comparison (ignoring padding, never calling equals methods) Option 2. operator== is an alias for equals, and vcmpeq is accessible but available under a different name (isSameAs). The choice here must balance two competing influences. "Codes like a class" means that, internally within the implementation of a value type, uses of operator== must be "dumb" approximations to "true" equality. Indeed, probably most occurrences of operator== on references other than null are of the form "p == q || p != null && p.equals(q)". Bad language choice here, IMO That legacy meaning of operator== pushes us towards Option 1. 
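As an illustration of the legacy idiom quoted above ("p == q || p != null && p.equals(q)"), here is a minimal sketch in today's Java. The class name LegacyEqualsIdiom, the helper legacyEquals, and the Point class are hypothetical names made up for this example; nothing here reflects a proposed Valhalla API. It only shows why, for references, operator== acts as a cheap approximation that is then backed up by equals.

```java
class LegacyEqualsIdiom {
    // Hypothetical reference class with a user-defined equals.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
        }
        @Override public int hashCode() { return 31 * x + y; }
    }

    // The idiom from the text: identity first as a cheap approximation,
    // then fall back to the "true" equality defined by equals.
    static boolean legacyEquals(Object p, Object q) {
        return p == q || p != null && p.equals(q);
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        System.out.println(a == b);                    // false: distinct objects
        System.out.println(legacyEquals(a, b));        // true: same coordinates
        System.out.println(legacyEquals(null, null));  // true: identity check succeeds
        System.out.println(legacyEquals(null, a));     // false: null guard short-circuits
    }
}
```

Under Option 1 this two-step pattern would presumably carry over to value types unchanged, with == meaning a bitwise comparison; under Option 2 the first step alone would already give the full answer.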
"Works like an int" means that, externally when people use a value type, as if it were a primitive, will just say "v == w" and not even dream that "v.equals(w)" is a thing. Exactly zero occurrences of operator== on non-references are backed up by calls to equals, and users will be surprised if a value type give incomplete answers to v==w. This practicality pushes us towards Option 2. But, if you think about it, it also pushes us towards well-controlled behavior for other operators. If I can write "v == w", what should I expect from "v < w" (if they are comparable)? Does this roll us all the way down the slippery slope to operator overloading? It had better not. There are two obvious places we could stop rolling towards (uncontrolled) operator overloading. First, only "overload" operators which are *already* common to both primitives and references. That means == and !=, and nothing else. Second, retroactively add interfaces to Byte, Boolean, Integer, Long, Float, etc., which reify all the relevant operators as named method calls. And then allow value types to overload those named methods, wiring operator uses into those methods (but continuing to hardwire the primitives to the appropriate bytecodes). I think the POR (Option 1) is reasonable, unless/until we discover evidence to the contrary as we work with generics over primitives. Finally, note that operator overloading is not just an academic or esthetic concern, because enhanced generics demand some sort of unified view of types. When we write a generic method over a type parameter , we expect the method to operate correctly over all valid bindings of T. Today, since T ranges only over references, we can assume that code that touches T will do the Option 1 dance of "v == w || v.equals(w)", across all T, even value types. 
Tomorrow, when T ranges over primitives, references, and values, there will be a little more pressure to "rationalize" the behavior of op<=, op+, op*, etc., so that they operates correctly over all valid bindings of T. I say "a little more" but experiment will show whether it is significant. If so, we will want to re-interpret op<=, op+, etc., as interface calls, and write generics using bounds like (getting op< <= == != >= >), (getting op+ etc.), and so on. The conservative thing to do, which might be right in the end, is to require all new code that uses etc. to always use method-call syntax on values of type T, and bring primitives into consistency by retroactively assigning the methods in Comparable, etc. (Perhaps only in generic code?) Later on we can reconsider whether rehabilitating the various infix operators (as sugar for those methods) is worth doing. The thing we must *not* do is get to a place where primitives can *only* be operated on via operators like op< op+, but values and references can *only* be operated on via method invocation. One of the two sides has to change so as to overlap (at least for generic code) with the other. ? John From john.r.rose at oracle.com Sat Dec 19 10:17:12 2015 From: john.r.rose at oracle.com (John Rose) Date: Sat, 19 Dec 2015 02:17:12 -0800 Subject: Equality In-Reply-To: <56744544.1050603@cs.oswego.edu> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> Message-ID: On Dec 18, 2015, at 9:41 AM, Doug Lea
wrote:

>
> Limiting value type V to only override equals(V x) seems to have
> the same simplifying impact on Collection.contains and others. Yes?

Maybe. But, Brian has written about edge cases where the symmetry of Object.equals interacts badly with ad hoc code in collections. Basically, if you write this.equals(x) you might expect to type x:(? extends V) but if you write x.equals(this) you might expect to have to say x:(? super V). After shaking things around, you give up and say x:any, which today is x:Object.

In the new world we can maybe get away with non-variant equals(V x), and allow an escape hatch to the legacy world via boxing and the old equals(Object x).

From john.r.rose at oracle.com Sat Dec 19 10:23:14 2015
From: john.r.rose at oracle.com (John Rose)
Date: Sat, 19 Dec 2015 02:23:14 -0800
Subject: Equality
In-Reply-To: <567446F3.7000904@oracle.com>
References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> <567446F3.7000904@oracle.com>
Message-ID: <8E87393D-17B8-4951-AE12-16A2B0A19BEC@oracle.com>

On Dec 18, 2015, at 9:48 AM, Brian Goetz wrote:

>
> So perhaps what this says is we are going to get pushed in the other direction -- that we'll want to superate equals().

Because equals is symmetric, coders use it in both directions, and the net result is you probably end up seeing both directions at once.

To put it another way, it is removeAll(Collection) because, even if equals has a "direction", you don't know in which order the operands are passed to equals, within an implementation of removeAll.

-- John

From john.r.rose at oracle.com Sat Dec 19 10:26:43 2015
From: john.r.rose at oracle.com (John Rose)
Date: Sat, 19 Dec 2015 02:26:43 -0800
Subject: Equality
In-Reply-To: <56746AE2.9070403@cs.oswego.edu>
References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu>
Message-ID: <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com>

On Dec 18, 2015, at 12:21 PM, Doug Lea
wrote:

>
> On 12/18/2015 12:48 PM, Brian Goetz wrote:
>
>> So perhaps what this says is we are going to get pushed in the other direction
>> -- that we'll want to superate equals().
>>
>
> What if plain-equals and value-equals are independently overridable,
> with defaults:
>   boolean equals(any x) { return (x instanceof ThisClass) && equalValue(x); }
>   boolean equalValue(ThisClass x) { ... check bit equality ... }

(And of course in that case make "equals" be "final" above value types.)

> (The equalValue method could be named "equals" too, but doing so
> is too confusing for now.)
> Where operator== calls equalValue, not plain equals.

That would go with Option 2 in my previous mail. It's pretty and logical, but not clear yet if it is needed.

> The effects cascade to many of the Collection methods.

A big problem with collection methods is the effect of the symmetry of equals in the presence of subtype polymorphism. You can't tell which operand of equals is going to be a subtype of the other.

-- John

From john.r.rose at oracle.com Sat Dec 19 10:29:23 2015
From: john.r.rose at oracle.com (John Rose)
Date: Sat, 19 Dec 2015 02:29:23 -0800
Subject: Equality
In-Reply-To: <5674752D.8050800@oracle.com>
References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu> <5674752D.8050800@oracle.com>
Message-ID:

On Dec 18, 2015, at 1:05 PM, Brian Goetz wrote:

> Recasting in this light:
>
> boolean equals(Something o) { ... } // total method
>
>
> boolean equals(Object o) { ... } // ref-override

I don't get this bit: Wouldn't you need to say in order to put the constraint on the type that "hurts" equals? And in that case, of course, you don't need at all, since you know if ThisType is ref or val.

-- John

From dl at cs.oswego.edu Sat Dec 19 16:02:21 2015
From: dl at cs.oswego.edu (Doug Lea)
Date: Sat, 19 Dec 2015 11:02:21 -0500
Subject: Equality
In-Reply-To: <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com>
References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu> <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com>
Message-ID: <56757F8D.2030304@cs.oswego.edu>

Here's an alternative that evades (or allows to be postponed) John's slippery-slope concerns about overloading "==":

Anyfy equals, and adjust default implementation of boolean T.equals(any x) for type (ref or val) T to the intrinsified, otherwise inexpressible:

  if (bitwiseEqual(x, this)) return true; // pointer equality if ref
  if (!(x instanceof T)) return false;
  if (method T.equals(T) is not defined) return false;
  return equals((T)x); // call specialized override

(Various optimizations may be possible.)

Any class or val type could still override this, and/or introduce its T.equals(T) specialized override. It might be challenging for people to write correct, symmetric equals methods that span refs and vals, but not impossible (which is OK since it should be rare).

I still think that doing something like this removes the need to specially deal with Collection.contains and related methods.
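
For readers who want to trace the four steps of the default equals outlined above, the following is a rough emulation in today's Java, restricted to reference types. It is only a sketch under stated assumptions: == stands in for the intrinsic bitwiseEqual, a Class token stands in for "x instanceof T", and the hypothetical TypedEquals interface stands in for "method T.equals(T) is defined". None of these names come from the proposal.

```java
class HybridEqualsSketch {
    // Hypothetical marker for "T.equals(T) is defined" (the specialized override).
    interface TypedEquals<T> {
        boolean typedEquals(T other);
    }

    // Emulates the proposed default: identity, type test, then the typed override.
    static <T> boolean defaultEquals(T self, Object x, Class<T> type) {
        if (self == x) return true;                        // stands in for bitwiseEqual(x, this)
        if (!type.isInstance(x)) return false;             // x instanceof T
        if (!(self instanceof TypedEquals)) return false;  // T.equals(T) is not defined
        @SuppressWarnings("unchecked")
        TypedEquals<T> typed = (TypedEquals<T>) self;
        return typed.typedEquals(type.cast(x));            // call specialized override
    }

    // Hypothetical type that supplies a typed equality.
    static final class Meters implements TypedEquals<Meters> {
        final long value;
        Meters(long value) { this.value = value; }
        @Override public boolean typedEquals(Meters other) { return value == other.value; }
    }

    // Hypothetical type that defines no typed equality.
    static final class Opaque { }

    public static void main(String[] args) {
        Meters a = new Meters(5), b = new Meters(5);
        System.out.println(defaultEquals(a, b, Meters.class));             // true
        System.out.println(defaultEquals(a, "5", Meters.class));           // false
        Opaque o = new Opaque();
        System.out.println(defaultEquals(o, o, Opaque.class));             // true
        System.out.println(defaultEquals(o, new Opaque(), Opaque.class));  // false
    }
}
```

The emulation cannot show the value-type half of the idea (there is no bitwise comparison of values in current Java), so it should be read as a walk-through of the control flow rather than a statement about how the JVM would implement it.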
-Doug From brian.goetz at oracle.com Sat Dec 19 16:38:37 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Sat, 19 Dec 2015 11:38:37 -0500 Subject: Equality In-Reply-To: <56757F8D.2030304@cs.oswego.edu> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu> <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com> <56757F8D.2030304@cs.oswego.edu> Message-ID: <5675880D.4060503@oracle.com> > Anyfy equals, and adjust default implementation of boolean T.equals(any x) I think we may be talking past each other (while basically saying the same thing from opposite directions.) "any" is not a type; it is a modifier that affects the domain of type variables. So we don't have (and I think we don't want) a meaning for equals(any x). But what we do want, is a way of expressing "I'll take anything which could be on the other side of the == operator with me." For refs, that's Object; for a value type V, that's just V. Where we had gotten with the / superation idiom (not suggesting either of these syntaxes is great) is being able to express: - If T is value, then T, else the erasure of T (usually object) ** I'll write this as Sup for short. The convenient thing about Sup is that it conveniently collapses to Object in the places where we want Object, so we could define contains/remove as contains(Sup) and contains will always bottom out at equals(), so equals() similarly needs to be equals(Sup) If this is a valid approach (and I think its the best one we've got so far), then we're looking for how to spell Sup (in all of: type system, language syntax, bytecode descriptors.) > I still think that doing something like this removes the need > to specially deal with Collection.contains and related methods. I don't see it yet; those signatures are still currently contains(Object), which isn't appropriate for value types. 
So we have to do *something*. ** There's a lot of sloppiness in the ref/val distinction, which is going to need to be cleaned up. Sometimes when we say ref/val, we mean "erased/reified". Sometimes we mean "polymorphic/monomorphic". Sometimes we mean "nullable/non-nullable." From dl at cs.oswego.edu Sat Dec 19 18:45:41 2015 From: dl at cs.oswego.edu (Doug Lea) Date: Sat, 19 Dec 2015 13:45:41 -0500 Subject: Equality In-Reply-To: <5675880D.4060503@oracle.com> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu> <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com> <56757F8D.2030304@cs.oswego.edu> <5675880D.4060503@oracle.com> Message-ID: <5675A5D5.6060208@cs.oswego.edu> On 12/19/2015 11:38 AM, Brian Goetz wrote: >> Anyfy equals, and adjust default implementation of boolean T.equals(any x) > "any" is not a type; it is a modifier that affects the domain of type > variables. OK. I should have first asked whether there are plans to allow forms like: boolean equals(T x) including as sugar for some superation-like construction. It would be nice to further abbreviate as "(any x)" but probably syntactically impossible. But still convenient in pseudocode discussion. > So we don't have (and I think we don't want) a meaning for > equals(any x). The equals method is special because it is the only defined Object method that can interact with Values world, and vice versa. So making it parametric across them seems necessary, even if it requires some special JVM magic. >> I still think that doing something like this removes the need >> to specially deal with Collection.contains and related methods. > > I don't see it yet; those signatures are still currently contains(Object), which > isn't appropriate for value types. So we have to do *something*. Right. 
I agree that the signature must compatibly change, but not necessarily that anything else does.

-Doug

From brian.goetz at oracle.com Sat Dec 19 22:20:48 2015
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 19 Dec 2015 17:20:48 -0500
Subject: Equality
In-Reply-To: <5675A5D5.6060208@cs.oswego.edu>
References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu> <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com> <56757F8D.2030304@cs.oswego.edu> <5675880D.4060503@oracle.com> <5675A5D5.6060208@cs.oswego.edu>
Message-ID: <5675D840.7090109@oracle.com>

> OK. I should have first asked whether there are plans to allow forms
> like:
> boolean equals(T x)
> including as sugar for some superation-like construction.

Yes. That's totally valid (any is a modifier for a type variable declaration, whether introduced at the class or method level.)

It's worth noting that there's some degree of extra dispatch cost for generic instance methods (a separate topic, but something to keep in mind) which is a slight negative for going generic on such a critical method as equals(). But yes, that's valid.

> Right. I agree that the signature must compatibly change,
> but not necessarily that anything else does.

OK, we're on the same page. I think we're coming at the "superate" concept from different directions, but both roads lead there.

From ron at paralleluniverse.co Sun Dec 20 17:40:40 2015
From: ron at paralleluniverse.co (Ron Pressler)
Date: Sun, 20 Dec 2015 19:40:40 +0200
Subject: Migrating methods in Collections
Message-ID:

(Sorry if this message doesn't appear in the same thread; I didn't get the older messages on the list).

I'd like to suggest a slightly different approach to the containers migration issue (I'm not discussing equality). The partial-method idea seems a potential source of confusion to me.
Unlike techniques for manual specialization (e.g. bitfields for ArrayList), here we're talking added complexity which directly affects any interaction with a generic class -- not just its implementation. It is unencapsulated complexity, so I think it deserves careful consideration.

I have a couple of ideas, each of which can be used in isolation or in combination with the other, and which may (or may not) be simpler:

1. We can simply not specialize the signatures of public collection methods (say, if [T] is the boxed-type of T, the signature of Map.get(Object) will be [V] get(Object)). The JVM's ability to avoid boxing might be good enough for this to yield the performance we want. New methods can, of course, be added. This approach can be taken in addition to or instead of superation.

2. If methods are to be removed (as in made partial), instead of magically disappearing them at the call site based on usage, perhaps we should consider hiding them by source-code version (not from the class file, of course, only hiding them in javac)? This is an explicit decision to break source compatibility, but it has two mitigating factors: 1/ javac conveniently has a source level (which, I hear, will also result in hiding new methods starting with Java 9) and 2/ Java already breaks source compatibility from time to time. I had quite a few classes that didn't compile under 8 because 8 changed the name resolution rules wrt static imports (or, more precisely, made them conform to the JLS, whereas they hadn't in prior versions). It took me some time to figure out what was wrong, but hidden methods would be able to give much better error messages.

Also, the superation idea seems very interesting, but I don't understand how it would work for contains/remove(Object), as contains needs to be able to accept both super- *and* subtypes of T (as in, animals.contains(dog)). I believe its type --
like that of equals() -- should be contains(T x)

Ron

From dl at cs.oswego.edu Sun Dec 20 22:05:32 2015
From: dl at cs.oswego.edu (Doug Lea)
Date: Sun, 20 Dec 2015 17:05:32 -0500
Subject: Migrating methods in Collections
In-Reply-To:
References:
Message-ID: <5677262C.1010005@cs.oswego.edu>

On 12/20/2015 12:40 PM, Ron Pressler wrote:

> 1. We can simply not specialize the signatures of public collection methods
> (say, if [T] is the boxed-type of T, the signature of Map.get(Object) will
> be [V] get(Object)).

One would think that the boxing of T would be an implementation of Optional, which would be incompatibly different as a signature. Although I'm not exactly sure how this would work given the compromises in defining Optional that were necessary to get it into jdk8.

-Doug

From brian.goetz at oracle.com Mon Dec 21 16:20:35 2015
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 21 Dec 2015 11:20:35 -0500
Subject: Migrating methods in Collections
In-Reply-To:
References:
Message-ID: <567826D3.7050504@oracle.com>

Totally fair to ask "have we missed a simpler choice?" But sadly I think we haven't (though I don't fully understand your second idea, I'm interested to hear more.)

> here we're talking added complexity which directly
> affects any interaction with a generic class -- not just its
> implementation.

I think this is either not true, or overstated (depending on what you mean by "any interaction".) The only new thing that this adds (and it's not really that new) is the fact that the set of members of a parameterized type depends on the parameterization. For example, List might have members

  removeAt(int) // new total method
  remove(int) // legacy method

but List would only have

  removeAt(int) // new total method

But, this sort of dependency isn't even new; from the client perspective, the signature of add in List is add(String), whereas its signature in List is add(Number).
What's new is that some methods won't appear in some parameterizations. I don't think this rises to the level of "added complexity" (arguably it is even "reduced complexity"?); the user will see this as a context-dependent set of methods when they hit ctrl-space in their IDE. Just as the type signatures are already specialized to the type of the receiver in this context, methods that are not applicable will now be filtered. (Note also that when we migrate a reference-specific method to a new method, the new method is not value-specific, its total, so it can be positioned as "the old method has been deprecated in favor of this new, more flexible method.") > 1. We can simply not specialize the signatures of public collection > methods (say, if [T] is the boxed-type of T, the signature of > Map.get(Object) will be [V] get(Object)). The JVM?s ability to > avoid boxing might be good enough for this to yield the performance we > want. New methods can, of course, be added. This approach can be taken > in addition to or instead of superation. Yes, this was something we considered early on. There are several issues: - Is our box elision going to be good enough? - Nullity - Transparency Elision. Deciding to not specialize the signatures means that we're relying on box elision in the VM being good enough so that boxes are elided "almost all the time." Sadly, I do not think this is going to be the case. There are certainly reasons why box elision could be better with value-boxes (we can be more hostile to their identity, and therefore more freely elide them) than the existing wrappers, but if I have a deep chain of calls that are passing a boxed value through a library (common), there is a real risk of fall-off-the-cliff behavior when we hit our various inlining limits, and can't see that both ends of the chain prefer the unboxed variant. 
Further, the most important box types -- Integer and friends -- are already deeply polluted with identity (want to bet that no program ever locks on one?) So I think this one goes in the "boy, it would be nice" column, but I don't think its something we can bet the farm on. Nullity. Even if elision were perfect, Map.get is still fundamentally unrescuable, because it uses null as a return to signal non-presence. (Forcing all values to be nullable is a non-starter.) This means that we may never be able to elide the boxing in Map.get(), which would cripple map performance -- non-starter. So some sort of migration strategy is needed for Map.get() anyway -- and in fact, the "peeling" technique was invented in the context of "what about Map.get", and the rest was mostly an exploration of whether we needed any additional hammers beyond that. Transparency. Even if the above two were not issues, I think having box types (or worse, Object) show up in signatures when the user is expecting something involving T is a visible wart that the users will notice. (Users would reasonably expect a List to have methods that truck in int, not Integer, and not Object.) For these reasons, I think *some* intrusion into the API is unavoidable. The work that's gone into this draft is aimed at trying to balance compatibility with the current API (in both letter and spirit) with minimizing the warts perceived by future clients of the anyfied APIs. (Future *implementors* will experience warts, such as having to implement both flavors of remove. However, these are migration-specific warts; as new libraries are written that don't have be migrated from ref-generics, these won't even show up.) > 2. If methods are to be removed (as in made partial), instead of > magically disappearing them at the call site based on usage, perhaps we > should consider hiding them by source-code version (not from the class > file, of course, only hiding them in javac)? 
This is an explicit > decision to break source compatibility, but it has two mitigating > factors: 1/ javac conveniently has a source level (which, I hear, will > also result in hiding new methods starting with Java 9) and 2/ Java > already breaks source compatibility from time to time. I had quite a few > classes that didn?t compile under 8 because 8 changed the name > resolution rules wrt static imports (or, more precisely, made them > conform to the JLS, whereas they hadn't in prior versions). It took me > some time to figure out what was wrong, but hidden methods would be able > to give much better error messages. I'm not sure I'm following what problem you're trying to solve here? (This sounds a little like the tricks we did with default methods when compiling with the jdk8 compiler in -source 7 mode, where we didn't consider default methods to be members of the class for some purposes when viewed from 7 code?) Can you elaborate? > Also, the superation idea seems very interesting, but I don?t understand > how it would work for contains/remove(Object), as contains needs to be > able to accept both super- /and/ subtypes of T (as in, > animals.contains(dog)). Yeah, this is what I meant by "Even though this works, its still not that obvious." If you have animals.contains(dog) where boolean contains(U) then inference concludes U=Animal, so everything is fine. (The constraints: U :> E, E=Animal, Dog <: U). But as I said, its not obvious. (Dan likens it to F-bounds; for most people, the best they can do is learn "this is the idiom", rather than truly understand it.) Hence, this is a downside of this approach -- that even smart people will look at it and scratch their heads. > I believe its type ? like that of equals() -- > should be contains(T x) Maybe! But I think there's also a bit of Stockholm Syndrome in that thinking, that derives from a pre-generics notion of the world. 
In a generic world, you can use the type system to exclude the "obviously stupid" candidates, such as those that are known not to be either a subtype or a supertype of the type in question. Secondarily, there's a contingent reason why I'm nervous about such a fundamental method like Object.equals() being defined as a generic method -- when you follow the details of how any-generic methods are implemented, the invocation cost is unavoidably higher. For new code, this is probably acceptable, but for the cornerstone of the castle, it doesn't seem to be. The technique hinted at in the end of my mail is an attempt to get the benefits of superation while not having to reach for either the big contravariance hammer or the generic method hammer. The result would a single, non-generic method whose signature collapses to equals(Object) for reference types and equals(V) for value types. (All the animals.contains(dog) examples only show up when there's variance, and value types are monomorphic, so they don't have to deal with superclasses or subclasses showing up.) If we can make this work, this seems preferable to any of the options explored previously. From brian.goetz at oracle.com Mon Dec 21 16:24:33 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 21 Dec 2015 11:24:33 -0500 Subject: Migrating methods in Collections In-Reply-To: <5677262C.1010005@cs.oswego.edu> References: <5677262C.1010005@cs.oswego.edu> Message-ID: <567827C1.9010900@oracle.com> > One would think that the boxing of T would be an implementation of > Optional, which would be incompatibly different as a signature. Right, that's the thinking towards "migrate map()V to one or more other method." The existing map() is irretrievably null-polluted; write some new value-friendly methods. One of the forms uses Optional; this assumes we can migrate Optional to be a value type in Valhalla (requiring additional migration tools, along the lines alluded to when you brought up collection index sizes.) 
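
To make the null-pollution point concrete, here is a small sketch in current Java of an Optional-returning lookup next to the existing Map.get. The helper name find is hypothetical and is not one of the migration candidates discussed in this thread; the sketch also assumes the map stores no null values, since Optional cannot hold null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class OptionalLookupSketch {
    // Hypothetical helper, not a JDK method: reports absence via Optional instead of null.
    // Assumes the map stores no null values (Optional cannot hold null).
    static <K, V> Optional<V> find(Map<K, V> map, K key) {
        return Optional.ofNullable(map.get(key));
    }

    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("ann", 30);
        System.out.println(ages.get("bob"));                // null: absence signalled by null
        System.out.println(find(ages, "ann").isPresent());  // true
        System.out.println(find(ages, "bob").isPresent());  // false
        System.out.println(find(ages, "bob").orElse(-1));   // -1
    }
}
```

With today's reference-typed Optional and boxed Integer this still allocates; the hope expressed in the thread is that a value-typed Optional would make this shape of API cheap.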
> Although I'm not exactly sure how this would work given the compromises > defining Optional necessary to get it into jdk8. Right. As a reference type, Optional is a box, and so while more expressive than Integer, is no lighter. As a value type, different story. From dl at cs.oswego.edu Mon Dec 21 16:50:56 2015 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 21 Dec 2015 11:50:56 -0500 Subject: Equality In-Reply-To: <5675D840.7090109@oracle.com> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu> <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com> <56757F8D.2030304@cs.oswego.edu> <5675880D.4060503@oracle.com> <5675A5D5.6060208@cs.oswego.edu> <5675D840.7090109@oracle.com> Message-ID: <56782DF0.20800@cs.oswego.edu> On 12/19/2015 05:20 PM, Brian Goetz wrote: >> Right. I agree that the signature must compatibly change, >> but not necessarily that anything else does. > > OK, we're on the same page. Where I think this page says that interfaces with existing methods accepting Object args can in principle be anyfied without strictly requiring (but usually strongly encouraging) implementation class rework. There is a way to enable/translate analogs of Object methods, in particular equals(). We wouldn't normally recommend blanket anyfication of interfaces, but Collections is the main one that everyone hopes will be somehow doable. The full story on this has a few more quirks though. It gets uncomfortable to cope with synchronized(obj), Object.wait, and Object.notify: Semantically, synchronized(val) and notify would be no-ops, and wait would block forever. Which would be OK, because no sensible general-purpose implementation of say, collection.contains would use any of these. And, as John almost noted, compareTo/Comparable needs treatment similar to my hybrid version of equals. 
-Doug From brian.goetz at oracle.com Mon Dec 21 17:13:36 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 21 Dec 2015 12:13:36 -0500 Subject: Equality In-Reply-To: <56782DF0.20800@cs.oswego.edu> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu> <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com> <56757F8D.2030304@cs.oswego.edu> <5675880D.4060503@oracle.com> <5675A5D5.6060208@cs.oswego.edu> <5675D840.7090109@oracle.com> <56782DF0.20800@cs.oswego.edu> Message-ID: <56783340.2010309@oracle.com> > Where I think this page says that interfaces with existing methods > accepting Object args can in principle be anyfied without strictly > requiring (but usually strongly encouraging) implementation class rework. I want to draw a strict distinction between "migrating APIs" and "migrating implementations." The messages so far have been deliberately restricted to migrating APIs, the linguistic tools available for doing so, and their impact on the Collection APIs. > There is a way to enable/translate analogs of Object methods, > in particular equals(). I was hoping to treat the Object methods in a separate discussion. It does connect to this discussion in that contains() will inevitably appeal to equals(), but it also pulls in a zillion other things (are values objects? can value implement interfaces? is there a base type for values? what is the relationship between the base type for values and for objects? is there a top type? etc.) I think the upshot here is that the problem of "what is the signature of equals" is essentially the same as for contains/remove (contains inevitably calls equals), so we should be mindful that whatever discussions we have for collections are also going to impact equals, and ideally the same hammer pounds down both nails. 
> We wouldn't normally recommend blanket anyfication of interfaces, > but Collections is the main one that everyone hopes will be somehow > doable. Collections is important both because it's fundamental, and because it's the canary -- if we can't anyfy Collections, there's a good chance that our tools are still insufficient for migrating other real-world libraries. > The full story on this has a few more quirks though. This is what I was getting at above, with "let's treat the implementation part of the problem separately." There is a pile of idioms that show up in this kind of code whose semantics get fuzzy when a type variable straddles references and values -- comparison to null, comparison to other objects (particularly 'this', which shows up in AbstractList.toString), synchronization, assignment to null, instanceof/cast, array creation. My initial porting exercise of Collections leads me to conclude that the tools needed for migrating the APIs and the tools needed for migrating the code are mostly decoupled. Since the API changes are more visible, I thought it sensible to start there. > It gets uncomfortable to cope with synchronized(obj), > Object.wait, and Object.notify: Semantically, synchronized(val) > and notify would be no-ops, and wait would block forever. This is one approach (the "permanently locked" object approach that John described in an earlier Value Objects proposal), but there are others. Let's come back to this. From ron at paralleluniverse.co Mon Dec 21 17:18:18 2015 From: ron at paralleluniverse.co (Ron Pressler) Date: Mon, 21 Dec 2015 09:18:18 -0800 (PST) Subject: Migrating methods in Collections In-Reply-To: <567826D3.7050504@oracle.com> References: <567826D3.7050504@oracle.com> Message-ID: <6C85E814-EE54-4DAF-92F8-650D01E21D18@paralleluniverse.co> > On Dec 21, 2015, at 6:20 PM, Brian Goetz wrote: > > >> >> >> 2.
If methods are to be removed (as in made partial), instead of >> magically disappearing them at the call site based on usage, perhaps we >> should consider hiding them by source-code version (not from the class >> file, of course, only hiding them in javac)? This is an explicit >> decision to break source compatibility, but it has two mitigating >> factors: 1/ javac conveniently has a source level (which, I hear, will >> also result in hiding new methods starting with Java 9) and 2/ Java >> already breaks source compatibility from time to time. I had quite a few >> classes that didn't compile under 8 because 8 changed the name >> resolution rules wrt static imports (or, more precisely, made them >> conform to the JLS, whereas they hadn't in prior versions). It took me >> some time to figure out what was wrong, but hidden methods would be able >> to give much better error messages. >> > > > > > I'm not sure I'm following what problem you're trying to solve here? > (This sounds a little like the tricks we did with default methods when > compiling with the jdk8 compiler in -source 7 mode, where we didn't > consider default methods to be members of the class for some purposes > when viewed from 7 code?) Can you elaborate? > Sure. Instead of demoting, say, remove(int) to a partial method, simply hide it from all source level 10 code, which will only be able to access removeAt, even on a List (the method will still be in the class, of course). Cons: breaks source compatibility (but not binary compatibility) in a more major way than ever before, but Java has mechanisms to deal with that (source level), and automatic migration tools should be easy. Pros: less strange than partial methods; simpler to implement; a more general (albeit crude) migration mechanism, or, rather binary-compatible source-deprecation mechanism. Now, it is a dramatic break, but Valhalla is quite dramatic anyway.
Partial methods are a migration measure (we wouldn't have needed them had the APIs been designed with values in mind, right?) but they'll stay a part of the language forever, and they don't have the general usefulness of default methods (unless there are non-migration reasons to make use of partial methods that make sense in Java). > Yeah, this is what I meant by "Even though this works, it's still not > that obvious." > > > inference concludes U=Animal, so everything is fine. > Well, it's obvious now :) From brian.goetz at oracle.com Mon Dec 21 18:03:42 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 21 Dec 2015 13:03:42 -0500 Subject: Migrating methods in Collections In-Reply-To: <6C85E814-EE54-4DAF-92F8-650D01E21D18@paralleluniverse.co> References: <567826D3.7050504@oracle.com> <6C85E814-EE54-4DAF-92F8-650D01E21D18@paralleluniverse.co> Message-ID: <56783EFE.2070004@oracle.com> > Instead of demoting, say, remove(int) to a partial method, simply > hide it from all source level 10 code, which will only be able to access > removeAt, even on a List (the method will still be in the class, > of course). Cons: breaks source compatibility (but not binary > compatibility) in a more major way than ever before, but Java has > mechanisms to deal with that (source level), and automatic migration > tools should be easy. Pros: less strange than partial methods; simpler > to implement; a more general (albeit crude) migration mechanism, or, > rather binary-compatible source-deprecation mechanism. I think people would be pretty ticked off if Map.get() just went away. I think the response would be: "Those idiots decided to change their libraries for their own reasons, I have no intention of ever specializing my Map, and yet I have to change my code anyway."
Secondarily, while we might plan to do this to Collections in version N, other generic libraries (including other JDK libraries) might wait until N+3 to anyfy their own. Realistically this means we'd be forced to expose whatever versioning mechanism we use here for general use -- which seems at least as potentially confusing (and open to misuse) as partial methods. While a method-grained versioning mechanism seems like it might solve a lot of problems (for example, we wouldn't have needed to do default methods), so far, we've not seen any satisfactory theory that we'd want to consider building on -- there have been many attempts in the academic literature but I think method versioning in object-oriented systems is still an unsolved problem. So I'm wary this could degenerate into something far worse than partial methods -- a bad versioning system. Separately, I think the distaste for partial methods may also be a little bit of an allergic reaction to the deliberately-bad syntax we're using. I'll share a caricature of a past interaction on this topic (with someone on this list, actually) that illustrates the power of implicit syntactic biases: Him: This where-clause thing is totally confusing and will be completely foreign to Java developers! Augh! Me: What if I wrote it like this instead: boolean remove(Collection this, int index) Is that less confusing? (oh, and BTW this builds on the *existing Java 8 syntax* that is already there for explicit receiver parameters, which we added so they can be annotated.) Him: That's so much better! Then it's clear that the restriction is just part of the method signature. And if there is more than one partial method called foo(), it's clear from this that they are distinct overloads.
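For readers who have not met the Java 8 explicit receiver parameter mentioned above, here is a minimal sketch of what is legal today. The names Counter, next, and @ReadOnly are invented for this illustration. In Java 8 the receiver type must be the enclosing class itself; narrowing it to a particular specialization is the proposed extension under discussion and is not valid Java.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical type annotation, invented for this sketch.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE_USE)
@interface ReadOnly {}

class Counter {
    private final int count = 41;

    // Explicit receiver parameter: 'this' is spelled out in the signature
    // only so that its type can carry an annotation. It is not a formal
    // parameter, so callers still pass a single argument.
    int next(@ReadOnly Counter this, int step) {
        return this.count + step;
    }
}

class ReceiverDemo {
    public static void main(String[] args) throws Exception {
        Counter c = new Counter();
        System.out.println(c.next(1)); // prints 42

        // Reflection sees one parameter; the receiver is not counted.
        Method m = Counter.class.getDeclaredMethod("next", int.class);
        System.out.println(m.getParameterCount()); // prints 1
    }
}
```

The point of the sketch is only that the receiver already has a syntactic slot in a method signature, which is the slot the alternative partial-method spelling reuses.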
Now, I don't want to devolve into premature syntax bikeshedding, but my point is: I don't think it is the concept that is fundamentally confusing; it's just that we will (in addition to convincing ourselves that the model is sound, which is the task currently in front of us) then additionally have to fit it into a syntactic expression that makes sense to Java users. (Coming up with a good syntactic form is also hard, so I want to first ensure we have a sound theoretical model before taking on unnecessary additional work.) > Now, it is a dramatic break, but Valhalla is quite dramatic anyway. > Partial methods are a migration measure (we wouldn't have needed them > had the APIs been designed with values in mind, right?) but they'll stay > a part of the language forever, and they don't have the general > usefulness of default methods (unless there are non-migration reasons to > make use of partial methods that make sense in Java). Mostly, but not entirely. Partial methods also allow you to do this: interface List { default long sum() { ... } } which is not strictly related to migration. (Personally, I don't love this as a feature, because it's weaker than it first appears (think: "The Expression Problem"), and when you try to shore up these weaknesses with a more powerful slicing mechanism like it starts to get more complex -- but this form of partial method is also part of the current best approach we've got for being able to replace IntStream with Stream, which is easier in some ways, and harder in others, than Collections.) However (picking up the above syntactic form), would you find these signatures terribly confusing? interface Stream { ...
int sum(Stream this); long sum(Stream this); double sum(Stream this); } From ron at paralleluniverse.co Mon Dec 21 20:50:21 2015 From: ron at paralleluniverse.co (Ron Pressler) Date: Mon, 21 Dec 2015 12:50:21 -0800 (PST) Subject: Migrating methods in Collections In-Reply-To: <56783EFE.2070004@oracle.com> References: <567826D3.7050504@oracle.com> <6C85E814-EE54-4DAF-92F8-650D01E21D18@paralleluniverse.co> <56783EFE.2070004@oracle.com> Message-ID: <1164377D-E173-44EB-97D7-60FF1FE1EA77@paralleluniverse.co> > On Dec 21, 2015, at 8:03 PM, Brian Goetz wrote: > > I think people would be pretty ticked off if Map.get() just went away. > I think the response would be: "Those idiots decided to change their > libraries for their own reasons, I have no intention of ever > specializing my Map, and yet I have to change my code anyway." > But those same people would also consume an API which returns a List and then find those same methods gone anyway. My suggestion would probably require a tool (a javac plugin?) to migrate sources, though. Go has "go fix", but Java has had such a tool for a long time (I think James Gosling worked on it). I know I'm making my suggestion sound even scarier, but I think it beats adding a type system trick for the purpose of source compatibility (more on that later). > Secondarily, while we might plan to do this to Collections in version N, > other generic libraries (including other JDK libraries) might wait until > N+3 to anyfy their own. Realistically this means we'd be forced to > expose whatever versioning mechanism we use here for general use -- > which seems at least as potentially confusing (and open to misuse) as > partial methods. > I don't know if it would be as confusing (I don't think it would), but it may possibly also be used to solve the 64-bit index problem. > While a method-grained versioning mechanism seems like > it might solve a lot of problems (for example, we wouldn't have needed
> to do default methods), so far, we've not seen any satisfactory theory > that we'd want to consider building on -- there have been many attempts > in the academic literature but I think method versioning in object > oriented systems is still an unsolved problem. So I'm wary this could > degenerate into something far worse than partial methods -- a bad > versioning system. > Default methods were also necessary for binary compatibility. I'm talking of something much simpler (like @AvailableUpTo(9)). > Now, I don't want to devolve into premature syntax bikeshedding, but my > point is: I don't think the it is the concept that is fundamentally > confusing, its just that we will (in addition to convincing ourselves > that the model is sound, which is the task currently in front of us) > then additionally have to fit it into a syntactic expression that makes > sense to Java users. (Coming up with a good syntactic form is also > hard, so I want to first ensure we have a sound theoretical model before > taking on unnecessary additional work.) > Maybe. But partial methods are a clever deprecation mechanism that's built into the type system. Not that I categorically oppose type-system cleverness (superation seems great), but source compatibility -- which Java doesn't always preserve and has a good mechanism to manage anyway -- doesn't seem like a good enough reason. >> > > > > Partial methods also allow you to do this: > > > interface List { > default long sum() { ... } > } > > > which is not strictly related to migration. (Personally, I don't love > this as a feature, because it's weaker than it first appears (think: > "The Expression Problem"), and when you try to shore up these weaknesses > with a more powerful slicing mechanism like it > starts to get more complex > Exactly. > -- but this form of partial method is also > part of the current best approach we've got for being able to replace
> IntStream with Stream, which is easier in some ways, and harder in > others, than Collections.) > Is the goal to somehow make IntStream into Stream or to deprecate IntStream? If the latter, I also see no reason why sum (but not other sensible operations) must be part of Stream. In any event, a more general solution would be extension methods (I am not proposing we add those). > > > However (picking up the above syntactic form), would you find these > signatures terribly confusing? > > > interface Stream { > ... > int sum(Stream this); > long sum(Stream this); > double sum(Stream this); > } > What about other numeric types? Maybe BigInteger sum (Stream this) too? And what if users were able to add their own numeric value types? It's a weird way to add what are essentially extension methods, and on the wrong side of the expression problem as you noted. If, OTOH, we'd have a "numeric" interface on value types and integers (as I think John alluded to), that might make things better. Also, it's not so much a question of confusion as of "does it fit with the feel of Java?" Ron From john.r.rose at oracle.com Mon Dec 21 22:33:57 2015 From: john.r.rose at oracle.com (John Rose) Date: Mon, 21 Dec 2015 14:33:57 -0800 Subject: Equality In-Reply-To: <56782DF0.20800@cs.oswego.edu> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu> <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu> <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com> <56757F8D.2030304@cs.oswego.edu> <5675880D.4060503@oracle.com> <5675A5D5.6060208@cs.oswego.edu> <5675D840.7090109@oracle.com> <56782DF0.20800@cs.oswego.edu> Message-ID: <5FCA6B8E-57DC-4F7A-8361-FF1D0A50F4FD@oracle.com> On Dec 21, 2015, at 8:50 AM, Doug Lea
wrote: > > On 12/19/2015 05:20 PM, Brian Goetz wrote: > >>> Right. I agree that the signature must compatibly change, >>> but not necessarily that anything else does. >> >> OK, we're on the same page. > > Where I think this page says that interfaces with existing methods > accepting Object args can in principle be anyfied without strictly > requiring (but usually strongly encouraging) implementation class rework. > There is a way to enable/translate analogs of Object methods, > in particular equals(). I think we can do this. There seem to be more than enough tools at our disposal: Ability to define static intrinsics as needed, freedom to factor tricky code into default methods on interfaces or shared supers (Object, ValueType), reflection (as a last resort), right to appeal to lifted value semantics on auto-boxes, close correspondence between generic any-fied code and specialized code. (It would be inefficient at this point to work out all the implementation details over email, as Brian notes. That work is best done by prototyping. So I'm holding back for now.) > We wouldn't normally recommend blanket anyfication of interfaces, > but Collections is the main one that everyone hopes will be somehow > doable. Ability to retrofit is a big goal for us, of course. > The full story on this has a few more quirks though. > It gets uncomfortable to cope with synchronized(obj), > Object.wait, and Object.notify: Semantically, synchronized(val) > and notify would be no-ops, and wait would block forever. > Which would be OK, because no sensible general-purpose implementation > of say, collection.contains would use any of these. Yes. Those semantics are appropriate for objects which are immutable, for which the "write lock" will never become available. > And, as John almost noted, compareTo/Comparable needs > treatment similar to my hybrid version of equals. Two events like that certainly call for generalization. 
Algebraists will be eager to suggest other any-fied relations, so we want to support open-ended extension mechanisms. This is one reason value types are envisioned to interoperate with interfaces. On Dec 21, 2015, at 9:13 AM, Brian Goetz wrote: > > This is what I was getting at above, with "let's treat the implementation part of the problem separately." There are a pile of idioms that show up in this kind of code whose semantics gets fuzzy when a type variable straddles references and values -- comparison to null, comparison to other objects (particularly 'this', which shows up in AbstractList.toString), synchronization, assignment to null, instanceof/cast, array creation.) Add reflection to that list. Also auto-boxing (which happens when you cross over to Object). As I said above, I'm very optimistic that we can shape the details of these things so that they are quite useful. For me the most important guiding principle is lifting all value semantics to value boxes, while allowing those boxes to be "heisenboxes" (value-based, aggressively identity-agnostic). Those rules usually assign workable semantics for reference-like operations on values ("as if boxed"). The actual physical cost of boxing can be waved away, either by saying "the JIT can optimize it" or (more aggressively) by lowering the semantics into the value bytecodes. In both cases, the source code looks the same, as if there is autoboxing happening wherever needed, but the user doesn't need to care where. > My initial porting exercise of Collections leads me to conclude that the tools needed for migrating the APIs and the tools needed for migrating the code are mostly decoupled. Since the API changes are more visible, I thought it sensible to start there. > > > It gets uncomfortable to cope with synchronized(obj), > > Object.wait, and Object.notify: Semantically, synchronized(val) > > and notify would be no-ops, and wait would block forever. 
> > This is one approach (the "permanently locked" object approach that John described in an earlier Value Objects proposal), but there are others. Let's come back to this. (The perma-lock semantics works nicely for frozen arrays. You want to be able to lock a frozen array very quickly in order to read it safely, if you are processing a mix of frozen and under-lock-mutable arrays. But a fail-fast semantics is friendlier for other use cases, where we want to deprecate collections that stupidly lock on their elements. This deserves another thread. My important point right now is optimism: We seem to have more than enough tactics to create a decent design.) -- John From john.r.rose at oracle.com Mon Dec 21 23:16:08 2015 From: john.r.rose at oracle.com (John Rose) Date: Mon, 21 Dec 2015 15:16:08 -0800 Subject: 64 bit collections, and API migration in general (was Re: Migrating methods in Collections) In-Reply-To: <56743A70.1000401@oracle.com> References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu> <56743A70.1000401@oracle.com> Message-ID: <9F488F38-7803-48D6-B663-2D18BDFDB42E@oracle.com> On Dec 18, 2015, at 8:55 AM, Brian Goetz wrote: > >> 2. It seems irresponsible to spend so much effort on >> Collections without also somehow addressing 32bit size/index >> limitations. Yes? Yes it would be, and we are aware of this question, and expecting to address it later in the light of prior design choices. > I think that's really a separate question. While everything said so far is Collection-specific, it's not really "so much effort on Collections" as much as "so much effort to ensure that legacy libraries can be compatibly anyfied", and that Collections is the poster child for that effort. (If we can't migrate Collections, that's evidence that we're still lacking in linguistic tools for supporting the transition to anyfied generics.)
> > So I'll interpret your question as: "These are nice migration tools for migrating erased libraries to anyfied, but there are other migrations we'd like to perform on these aging libraries, please don't forget about them?" > > The migration in question is whether we can compatibly migrate methods like: > > get(int index) > to > get(long index) "Two instances" is always a suspected hiding place for "more than one instance". For example, an APL-like matrix can be viewed as a linear (ravel-able) collection indexed by int-pairs, which are neither ints nor longs. I am assuming (until proven wrong) that the move from int to long should be part of a larger move from int to Index, where Index is an any-fied generic parameter. The practical effect of this concern, at the present moment, is that we should not only look at APIs which mention "Object" as suspects for any-fication, but also APIs which use "int" as an index. An "int" index is a candidate for replacement with some type "Index". It is *not* (IMO) likely to be a good candidate for ad hoc introduction of sibling "long" overloadings. -- John From brian.goetz at oracle.com Mon Dec 21 23:55:06 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 21 Dec 2015 18:55:06 -0500 Subject: Migrating methods in Collections In-Reply-To: <1164377D-E173-44EB-97D7-60FF1FE1EA77@paralleluniverse.co> References: <567826D3.7050504@oracle.com> <6C85E814-EE54-4DAF-92F8-650D01E21D18@paralleluniverse.co> <56783EFE.2070004@oracle.com> <1164377D-E173-44EB-97D7-60FF1FE1EA77@paralleluniverse.co> Message-ID: >> On Dec 21, 2015, at 8:03 PM, Brian Goetz wrote: >> I think people would be pretty ticked off if Map.get() just went away. >> I think the response would be: "Those idiots decided to change their >> libraries for their own reasons, I have no intention of ever >> specializing my Map, and yet I have to change my code anyway."
> > But those same people would also consume an API which returns a List and then find those same methods gone anyway. There's a big difference, though. The difference is, there is no code today that uses List. So no existing code will break, because anyfying a class or a client is an explicit operation (on the class side, you're explicitly making a tvar "any"; on the client side, you're using a List which by definition didn't work before.) Breaking existing code is far worse than "If you want to upgrade your List to be List, you'll have to adjust a few other things too." As a general rule, the pain of migrating should go to those who want to migrate, and should not fall on those who don't want to migrate. So existing Map code that uses reference types and wants to keep using reference types should be able to completely ignore the changes to the API. > Although my suggestion would probably require a tool (a javac plugin?) to migrate sources, though. Go has "go fix", but Java has had such a tool for a long time (I think James Gosling worked on it). I know I'm making my suggestion sound even scarier, but I think it beats adding a type system trick for the purpose of source compatibility (more on that later). I have hopes that the IDEs will provide some sort of "migrate to new collections" transform. The goal there is to reduce the pain of "I wanted to migrate to use List", but even if this were a one-button thing, I am not sure I'd want to impose it on people who have existing codebases that they have no plans of upgrading to values. Another mitigating factor is that the new methods are total. That means you can migrate your code to be "any-collections-ready" without actually using any of the anyfied classes, and without changing the semantics of anything. Which gives us a path to eventually deprecating the old methods -- though realistically it would probably be a VERY long time before we removed them.
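As background on why remove(int) keeps coming up as the candidate for a total replacement such as removeAt: today's List already has a by-index overload and a by-element overload under the same name, and on a List of boxed Integers the two are told apart only by the static argument type. Presumably, once the element type itself can be int, that distinction is no longer available. A minimal sketch in current Java follows; the static helper removeAt is a hypothetical stand-in for the proposed method and is not part of java.util.List.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveOverloads {
    // Hypothetical stand-in for the proposed total method removeAt(int);
    // it simply delegates to the existing by-index overload.
    static <T> T removeAt(List<T> list, int index) {
        return list.remove(index);
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30));

        // An int argument selects remove(int index): the element at index 1 goes.
        list.remove(1);
        System.out.println(list); // prints [10, 30]

        // An Integer argument selects remove(Object o): the element equal to 10 goes.
        list.remove(Integer.valueOf(10));
        System.out.println(list); // prints [30]

        // The by-index intent stated in the name, with no overload to resolve.
        removeAt(list, 0);
        System.out.println(list); // prints []
    }
}
```

Code written against an unambiguous name like this keeps its meaning whether or not the list is later anyfied, which is the sense in which such code is "any-collections-ready".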
> Is the goal to somehow make IntStream into Stream or to deprecate IntStream? If the latter, I also see no reason why sum (but not other sensible operations) must be part of Stream. In any event, a more general solution would be extension methods (I am not proposing we add those). Something slightly more ambitious. I'd like to deprecate {Int,Long,Double}Stream, but allow Stream to respond to all methods currently supported by IntStream. This provides a path to getting rid of the manual specializations (probably faster than the legacy collection methods) because Stream would be just as good as the old IntStream. From brian.goetz at oracle.com Tue Dec 22 00:02:14 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 21 Dec 2015 19:02:14 -0500 Subject: Migrating methods in Collections In-Reply-To: <1164377D-E173-44EB-97D7-60FF1FE1EA77@paralleluniverse.co> References: <567826D3.7050504@oracle.com> <6C85E814-EE54-4DAF-92F8-650D01E21D18@paralleluniverse.co> <56783EFE.2070004@oracle.com> <1164377D-E173-44EB-97D7-60FF1FE1EA77@paralleluniverse.co> Message-ID: <1F3C7D5B-19A6-46D3-AF8E-FE166EA94A15@oracle.com> > What about other numeric types? Maybe > > BigInteger sum (Stream this) > > too? And what if users would be able to add their own numeric value types? It's a weird way to add what are essentially extension methods, and on the wrong side of the expression problem as you noted. If, OTOH, we'd have a "numeric" interface on value types and integers (as I think John alluded to), that might make things better. BTW, this is exactly what I meant by "trying to make the feature more powerful by adding new axes of slicing." It's possible, but it has tradeoffs -- most specifically, it pushes the obligation to do "most specific" testing to runtime, especially when multiple type variables are involved.
It's not out of the question, but I'd like to start with the default position that a receiver selector type for a partial method should be a reifiable runtime type (e.g., Foo, Foo, but not Foo) -- this is a stable position that supports the must-have use cases and also has a clear and simple runtime implementation. From ron at paralleluniverse.co Tue Dec 22 09:36:24 2015 From: ron at paralleluniverse.co (Ron Pressler) Date: Tue, 22 Dec 2015 01:36:24 -0800 (PST) Subject: Migrating methods in Collections In-Reply-To: References: <567826D3.7050504@oracle.com> <6C85E814-EE54-4DAF-92F8-650D01E21D18@paralleluniverse.co> <56783EFE.2070004@oracle.com> <1164377D-E173-44EB-97D7-60FF1FE1EA77@paralleluniverse.co> Message-ID: <460CA192-08FB-4084-A555-7F005025F13E@paralleluniverse.co> > > On Dec 22, 2015, at 1:55 AM, Brian Goetz wrote: > > > As a general rule, the pain of migrating should go to those who want to migrate, and should not fall on those who don't want to migrate. So existing Map code that uses reference types and wants to keep using reference types, should be able to completely ignore the changes to the API. > > > Another mitigating factor is that the new methods are total. That means, you can migrate your code to be "any-collections-ready" without actually using any of the anyfied classes, and without changing the semantics of anything. Which gives us a path to eventually deprecating the old methods -- though realistically it would probably be a VERY long time before we removed them. > > > I agree with everything, but it all comes down to the following question: does backwards _source_ compatibility alone, something that Java has good coping mechanisms for (javac source level, IDE/tool support) justify the addition of a feature that is not trivial and not very general (certainly not as general as extension methods)?
We're talking about receiver type-matching that is finer-grained than a class, something that feels foreign (and "un-simple") in OOP. > Something slightly more ambitious. I'd like to deprecate {Int,Long,Double}Stream, but allow Stream to respond to all methods currently supported by IntStream. This provides a path to getting rid of the manual specializations (probably faster than the legacy collection methods) because Stream would be just as good as the old IntStream. > > > But couldn't it be just as good a replacement even if some of the methods were plain static methods, something Java developers are quite familiar with? It will require code migration either way. Yes, it won't have the same fluent-API, but neither will other methods that people will come up with. Is hand-specializing the _public interface_ (I have no qualms with hand-specializing hidden implementation) a necessary enough feature to justify non-class-based receiver-type-matching? It feels like a new and unfamiliar form of ad-hoc almost-but-not-quite extension methods (sadly, actual extension methods won't solve the problem). If anything, backwards source compatibility is a stronger argument, as it is very important (though, IMO, not important enough to justify this). Default methods had both the urgency -- binary compatibility -- and the generality. It seems to me that partial methods have neither. I'm not saying they're not a cool feature or that they don't solve the problem, but they don't feel very blue-collar. Anyway, I've said my piece on this matter :)
From ron at paralleluniverse.co Tue Dec 22 11:58:34 2015 From: ron at paralleluniverse.co (Ron Pressler) Date: Tue, 22 Dec 2015 03:58:34 -0800 (PST) Subject: Migrating methods in Collections In-Reply-To: <460CA192-08FB-4084-A555-7F005025F13E@paralleluniverse.co> References: <567826D3.7050504@oracle.com> <6C85E814-EE54-4DAF-92F8-650D01E21D18@paralleluniverse.co> <56783EFE.2070004@oracle.com> <1164377D-E173-44EB-97D7-60FF1FE1EA77@paralleluniverse.co> <460CA192-08FB-4084-A555-7F005025F13E@paralleluniverse.co> Message-ID: P.S. 1. While language-level extension methods (no JVM involvement) won't solve the deprecated-method problem, they do solve the Stream.sum problem, and quite elegantly. That is an orthogonal feature which could be added in a later version, and in a compatible way (static methods could be turned into extension methods) with a simple new static-import statement. Instead of one not-too-general and a little strange solution, we can have two general and completely orthogonal solutions, which are also probably simpler to implement, and, IMO easier to understand. Yes, there's a big price to pay, but it is stacked against a big chunk of complexity which may be avoided. 2. The method-hiding solution that I suggest might also serve as a nice part of the solution to the 64-bit (or John's generalized) indexing problem. The int-indexed methods will simply be hidden from Java 10 code. Ron > On Dec 22, 2015, at 11:36 AM, Ron Pressler wrote: > > >> >> On Dec 22, 2015, at 1:55 AM, Brian Goetz wrote: >> >> >> As a general rule, the pain of migrating should go to those who want to migrate, and should not fall on those who don't want to migrate. So existing Map code that uses reference types and wants to keep using reference types, should be able to completely ignore the changes to the API. >> >> > >> Another mitigating factor is that the new methods are total. That means, you can migrate your code to be "any-collections-ready"
without actually using any of the anyfied classes, and without changing the semantics of anything. Which gives us a path to eventually deprecating the old methods -- though realistically it would probably be a VERY long time before we removed them. >> >> >> > > I agree with everything, but it all comes down to the following question: does backwards _source_ compatibility alone, something that Java has good coping mechanisms for (javac source level, IDE/tool support) justify the addition of a feature that is not trivial and not very general (certainly not as general as extension methods)? We're talking about receiver type-matching that is finer-grained than a class, something that feels foreign (and "un-simple") in OOP. > > > >> Something slightly more ambitious. I'd like to deprecate {Int,Long,Double}Stream, but allow Stream to respond to all methods currently supported by IntStream. This provides a path to getting rid of the manual specializations (probably faster than the legacy collection methods) because Stream would be just as good as the old IntStream. >> >> >> > > But couldn't it be just as good a replacement even if some of the methods were plain static methods, something Java developers are quite familiar with? It will require code migration either way. Yes, it won't have the same fluent-API, but neither will other methods that people will come up with. Is hand-specializing the _public interface_ (I have no qualms with hand-specializing hidden implementation) a necessary enough feature to justify non-class-based receiver-type-matching? It feels like a new and unfamiliar form of ad-hoc almost-but-not-quite extension methods (sadly, actual extension methods won't solve the problem). If anything, backwards source compatibility is a stronger argument, as it is very important (though, IMO, not important enough to justify this). > > > Default methods had both the urgency -- binary compatibility -- and the generality.
> It seems to me that partial methods have neither. I'm not saying
> they're not a cool feature or that they don't solve the problem, but
> they don't feel very blue-collar.
>
> Anyway, I've said my piece on this matter :)

From dl at cs.oswego.edu Tue Dec 22 14:01:06 2015
From: dl at cs.oswego.edu (Doug Lea)
Date: Tue, 22 Dec 2015 09:01:06 -0500
Subject: Equality
In-Reply-To: <5FCA6B8E-57DC-4F7A-8361-FF1D0A50F4FD@oracle.com>
References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu>
 <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu>
 <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu>
 <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com>
 <56757F8D.2030304@cs.oswego.edu> <5675880D.4060503@oracle.com>
 <5675A5D5.6060208@cs.oswego.edu> <5675D840.7090109@oracle.com>
 <56782DF0.20800@cs.oswego.edu>
 <5FCA6B8E-57DC-4F7A-8361-FF1D0A50F4FD@oracle.com>
Message-ID: <567957A2.4090900@cs.oswego.edu>

On 12/21/2015 05:33 PM, John Rose wrote:
> Two events like that certainly call for generalization.
> Algebraists will be eager to suggest other any-fied relations,
> so we want to support open-ended extension mechanisms.
> This is one reason value types are envisioned to interoperate
> with interfaces.
>

I'm not sure about generalization. There are only the beginnings of
academic work on static analysis and validation of functional
properties (like those below, pasted from something else I had around).
In the meantime, probably the best we could do is add annotations (like
@Symmetric) that would have to be trusted in order to be effective. Or,
nearer term, focus only on equals and compareTo.

for function f, predicate p, and valid arguments a, b, c:

  Idempotent:    f(a) == f(a)
  Deterministic: if (a == b) then f(a) == f(b)
  Injective:     if (a != b) then f(a) != f(b)
  Commutative:   f(a, b) == f(b, a)
  Associative:   f(f(a, b), c) == f(a, f(b, c))
  Monotonic:     if (a <= b) then f(a) <= f(b)
  Reflexive:     p(a, a)
  Irreflexive:   !p(a, a)
  Symmetric:     if (a == b) then p(a, b) == p(b, a)
  Antisymmetric: if (a != b) then p(a, b) != p(b, a)
  Transitive:    if (p(a, b) and p(b, c)) then p(a, c)

-Doug

From dl at cs.oswego.edu Tue Dec 22 14:18:41 2015
From: dl at cs.oswego.edu (Doug Lea)
Date: Tue, 22 Dec 2015 09:18:41 -0500
Subject: Equality
In-Reply-To: <5FCA6B8E-57DC-4F7A-8361-FF1D0A50F4FD@oracle.com>
References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu>
 <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu>
 <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu>
 <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com>
 <56757F8D.2030304@cs.oswego.edu> <5675880D.4060503@oracle.com>
 <5675A5D5.6060208@cs.oswego.edu> <5675D840.7090109@oracle.com>
 <56782DF0.20800@cs.oswego.edu>
 <5FCA6B8E-57DC-4F7A-8361-FF1D0A50F4FD@oracle.com>
Message-ID: <56795BC1.8000307@cs.oswego.edu>

I should have noted...

On 12/21/2015 05:33 PM, John Rose wrote:
> On Dec 21, 2015, at 8:50 AM, Doug Lea wrote:
>> And, as John almost noted, compareTo/Comparable needs
>> treatment similar to my hybrid version of equals.
>

Well, "needs" is too strong. It would be disappointing if values could
not be Comparable, but all java.util APIs related to sorted orders
allow a separate Comparator, so they would still be usable.

-Doug

From brian.goetz at oracle.com Tue Dec 22 18:30:56 2015
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 22 Dec 2015 13:30:56 -0500
Subject: Migrating methods in Collections
In-Reply-To:
References: <567826D3.7050504@oracle.com>
 <6C85E814-EE54-4DAF-92F8-650D01E21D18@paralleluniverse.co>
 <56783EFE.2070004@oracle.com>
 <1164377D-E173-44EB-97D7-60FF1FE1EA77@paralleluniverse.co>
 <460CA192-08FB-4084-A555-7F005025F13E@paralleluniverse.co>
Message-ID: <567996E0.5090607@oracle.com>

> 1. While language-level extension methods (no JVM involvement) won't
> solve the deprecated-method problem, they do solve the Stream.sum
> problem, and quite elegantly.

Sort of -- but only if you buy the partiality. In order for this to
work, you have to (as .NET does) inject methods into *types*, not into
*classes*. This is just a partial method in disguise! In order for
this to work, you have to be comfortable with the idea that the set of
methods on a given receiver is dependent on the parameterization of the
receiver. And you clearly are, since you don't think extension methods
are too complex.

FTR, we considered and rejected use-site extension methods in 8, for a
philosophical reason that is still equally valid here: API developers
should own their API. What we rejected is the use-site aspect of it;
the part we actually liked (but didn't have enough motivation to
embrace in 8) was the partiality.
Just as default methods support the after-the-fact aspect of API
extension that use-site extension would (without the transparency
risks), partial methods (including partial defaults) support the
specialized-receiver aspect of extension methods that we like but don't
yet have.

From ron at paralleluniverse.co Tue Dec 22 19:09:18 2015
From: ron at paralleluniverse.co (Ron Pressler)
Date: Tue, 22 Dec 2015 11:09:18 -0800 (PST)
Subject: Migrating methods in Collections
In-Reply-To: <567996E0.5090607@oracle.com>
References: <567826D3.7050504@oracle.com>
 <6C85E814-EE54-4DAF-92F8-650D01E21D18@paralleluniverse.co>
 <56783EFE.2070004@oracle.com>
 <1164377D-E173-44EB-97D7-60FF1FE1EA77@paralleluniverse.co>
 <460CA192-08FB-4084-A555-7F005025F13E@paralleluniverse.co>
 <567996E0.5090607@oracle.com>
Message-ID:

> FTR, we considered and rejected use-site extension methods in 8, for a
> philosophical reason that is still equally valid here: an API developers
> should own their API. What we rejected is the use-site aspect of it;
> the part we actually liked (but didn't have enough motivation to embrace
> in 8) was the partiality. Just as default methods support the
> after-the-fact aspect of API extension that use-site extension would
> (without the transparency risks), partial methods (including partial
> defaults) support the specialized-receiver aspect of extension methods
> that we like but don't yet have.
>

Well, I'm not suggesting we add extension methods unless fluent APIs
are that important, and even then we don't really need extension
methods if we (rightly) want people to own their API, just a threading
operator, a la Clojure:

    stream => sum()    (compiled to sum(stream))

BTW, for sum specifically to be fluent it requires neither approach. I
think that

    stream.reduce(sum())

is just as good as

    stream.sum()

In fact, I prefer it, because it feels very general. But anyway, if
partiality is something we want for its own sake, then partial methods
FTW! :)

From brian.goetz at oracle.com Tue Dec 22 20:01:20 2015
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 22 Dec 2015 15:01:20 -0500
Subject: Migrating methods in Collections
In-Reply-To: <5671BEC8.8010508@oracle.com>
References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com>
Message-ID: <5679AC10.50109@oracle.com>

Returning to the short-term goal of migrating the Collections API...

I realize that a certain degree of "tell me the story again for X?" is
needed to meaningfully respond, but I'd like to bring the focus back to
the core APIs (and secondarily the impact on anyfying generic APIs in
general.)

I think Map.get offers us an existence proof that *some* degree of API
evolution is needed here. So I don't think there's a credible "do
nothing" approach. Still, minimizing the impact seems a reasonable
goal.

We have a set of credible tools for evolving the APIs -- migrating
ref-specific methods to more suitable total alternatives (remove ->
removeAt), abandoning some methods to "reference purgatory" where there
is already a suitable total alternative (e.g., abandoning removeAll in
favor of the existing removeIf added in 8), and some form of
"superation". Plenty of additional work is needed to find a way to
express superation in a way that's not painfully ugly, but we can
address that after the concept has proven itself.

So, I propose we return to the topic of Collection APIs. In addition
to making collections safe for values, we also have some limited
opportunity to fix errors in the API, within reason, if there are any
that rise to the appropriate threshold. (More intrusive migration --
such as busting the 32-bit index limitations -- is likely to be
possible with additional linguistic tools, but I'd like to save that
for another round.)

Further, let's focus (for now) on the perspective of clients, not
implementors.
Under any of the proposed changes:

 - existing binary clients will continue to work unchanged
 - existing sources can be recompiled and will work without change
 - if a client is anyfied (either by anyfying a generic method, or by
   naming a value instantiation of a generic class), then it's possible
   some method calls will no longer exist. However, since the client
   is changing the source, this is not a compatibility issue, it is
   only a migration burden (and one that can be mitigated to some
   degree by tooling.)

On 12/16/2015 2:43 PM, Brian Goetz wrote:
> The previous memo outlined a tactic for effectively "migrating" a
> method in a current generic class to a related but different signature
> in an any-generic class, while retaining source and binary
> compatibility with existing clients and subclasses, and outlined the
> list of possibly problematic methods in the Collections framework.
>
> The effectiveness of this tactic on this set of methods ranges from
> "slam-dunk" to "seems like it could work" to "doesn't help." It would
> be nice to have one hammer that pounds down all the nails, but I'm not
> sure there is one. This memo outlines a range of complementary
> tactics that might, in combination with the above approach, enable us
> to cover the waterfront.
>
> I'll start with the method Collection.contains(Object). It might at
> first seem that we want the signature here to be contains(E), but this
> gets in the way of cases like:
>
>     dogs.contains(animal)
>
> or of converting dogs::contains to a Predicate (which
> is what filter/removeIf would want.)
>
> Note that this has little to do directly with value types; value types
> are invariant. But if we want a single contains() method that ranges
> over any E, it needs to accommodate variance for reference
> instantiations while not falling back on Object as a top type.
>
> One technique for doing so would be to introduce contravariant
> inference variables.
> Then we could write contains as:
>
>     <U super E> boolean contains(U u)
>
> This has three main downsides:
>  - Even though this works, it's still not that obvious.
>  - It's a *lot* of work in the spec and compiler; it pushes on all the
>    fragile bits.
>  - If that weren't enough, it is a theoretical minefield. Papers like
>    http://www.cis.upenn.edu/~bcpierce/papers/variance.pdf show that
>    adding contravariance to certain type systems results in subtyping
>    becoming undecidable.
>
> On the other hand, I'm sure many library writers would jump for joy to
> have this in the toolbox; the lack of contravariant tvars seems a
> notable inconsistency in the language. (But let's not kid ourselves
> about the costs.)
>
> It just so happens that this construct works out; it is binary- and
> source-compatible to make a non-generic method generic, as long as
> the erasure of the signature remains the same. So changing contains()
> or remove() as above would not cause subclasses or clients to fail to
> either link or recompile (in the subclass recompilation case, it would
> be reinterpreted as a raw override, which is allowed and compatible.)
> And we'd end up with a total method that does the right thing both for
> refs and values.
>
> Ignoring the costs and risks, this technique applies to a number of
> the methods in our rogue's gallery, including toArray(), for which we
> didn't yet have a solution:
>
>     <U super E> U[] toArray()
>
> This is compatible with existing clients who are expecting an Object[]
> to come back from toArray() on a collection of reference types, and
> collapses to V[] for any value type, so Collection<int>.toArray()
> returns int[]. (We might still want an unchecked warning if the
> compiler infers U != Object for reference E, but that's a separate and
> easily handled consideration.)
>
> Let's call this technique "superation" (yes, it's an intentional
> (disgusting) pun. See
> http://beta.merriam-webster.com/dictionary/suppurate.
> And think about
> that the next time you pass a "Super 8" motel on the highway.) With
> this in our toolbox, the strategy matrix becomes:
>
> Class       Method                      Possible Approaches
> ----------  --------------------------  ------------------------------------
> Collection  contains(Object)            Superate to contains(U)
>             remove(Object)
>             removeAll(Collection)       Abandon in favor of existing
>                                         removeIf(Predicate).
>             retainAll(Collection)
>             containsAll(Collection)     Migrate to
>                                         containsElements(Collection),
>                                         or abandon.
>             toArray()                   Superate to U[] toArray()
>             toArray(T[])                Leave as is, superate, or abandon.
> List        remove(int)                 Migrate to removeAt(int).
>             indexOf(Object)             Migrate to Optional-bearing
>                                         findFirst(Predicate)
>             lastIndexOf(Object)
> Map         containsKey(Object)         Superate
>             containsValue(Object)       Superate
>             remove(Object)              Superate
>             put(K,V)                    Leave as is
>             get(K)                      Migrate to one (or all) of:
>                                           Optional map(K)
>                                           mapOrElse(K, V)
>                                           tryMap(K, Consumer)
>             getOrDefault(Object, V)     Migrate to mapOrElse, or superate
> Queue       poll(), peek()              Migrate to tryPoll(Consumer) *or*
>                                         optional-bearing method
> Deque       poll(), peek()
>             pollFirst(), pollLast()
>             peekFirst(), peekLast()
>             removeFirstOccurrence(),    Migrate to predicate-accepting
>             removeLastOccurrence()      method, or superate
>
> This is a more satisfying matrix; not only does everything have an
> acceptable strategy, but some have more than one, and the user impact
> of superating a method is lower (users might just not notice), so the
> perception is that fewer methods are affected. Still, super-bounded
> tvars are a big hammer for such a small foe. Maybe there's an
> alternate approach that has the effect of superation but doesn't need
> such a big hammer.
>
> Here's one possibility. We already have a notion of partial methods.
> We could have a pair of methods
>
>     Object[] toArray()
>
>     E[] toArray()
>
> both of which are reasonable signatures for their restricted domains.
> Unfortunately, the natural interpretation of this pair of methods is
> that the first is a member of the reference instantiations of
> Collection, and the second is a member of the value instantiations,
> but there is *no* toArray() method that is a member of the any-T
> Collection! This means that code that is generic in any-T would not
> see a toArray() method at all. That's a problem (though not as
> enormous as it initially sounds, there are possible mitigating
> techniques.)
>
> However, it is not unreasonable for the compiler to recognize this
> situation and deal. Suppose I have some code generic in any-T:
>
>     <any T> void foo(Collection<T> c) {
>         T[] arr = c.toArray();
>     }
>
> Now, the compiler doesn't know whether c is a collection of refs or
> values, but it knows it's one or the other (ref T and val T form a
> partition of any T). So it could (and in some cases, has to anyway)
> do type checking by parts -- it can typecheck the above assignment
> under the assumption of ref T, and do it again under the assumption of
> val T, and if both succeed (and something else, see below), accept the
> method invocation as valid. (In this case, the ref-T fork should
> result in an unchecked warning, meaning that the merged checking also
> yields an unchecked warning.)
>
> The "something else" part is: when doing overload resolution by parts,
> both branches must resolve to overloads that are erasure-equivalent to
> each other. Which is true for toArray() (and for all the cases for
> which superation would work.)
>
> Now, this is a lot of handwaving, and it doesn't even really describe
> how we think partial methods should actually work (I'd like to get rid
> of the where-val-T slices entirely, this is a separate discussion.)
> But it's a sketch of an option that achieves the positive result of
> superation without engaging the complexity of superation.

From dl at cs.oswego.edu Wed Dec 23 14:10:45 2015
From: dl at cs.oswego.edu (Doug Lea)
Date: Wed, 23 Dec 2015 09:10:45 -0500
Subject: Migrating methods in Collections
In-Reply-To: <5679AC10.50109@oracle.com>
References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com>
 <5679AC10.50109@oracle.com>
Message-ID: <567AAB65.6070508@cs.oswego.edu>

On 12/22/2015 03:01 PM, Brian Goetz wrote:
> Returning back to the short-term goal of migrating the Collections API...

If methods accepting Object arguments can be anyfied, this removes
some methods from your problem table: Collection.{contains(Object),
remove(Object)}, List.{indexOf(Object), lastIndexOf(Object)} and
Map.{containsKey(Object), containsValue(Object), remove(Object)}.

I realize that there are still a bunch of unresolved issues
in pulling this off. But ignoring them for now...

One natural follow-on question is that if we can anyfy contains, why
can't we do so for containsAll(Collection)? And similarly for
removeAll(Collection), retainAll(Collection). In other words,
is this or some variant allowed?

    boolean containsAll(Collection)

If so, the main remaining questions surround optionality of results,
which I'll address separately. But there are still others:
List.remove(int index) and Collection.toArray().

Doing nothing about List.remove(index) seems to be a legal option. No
existing code will encounter an ambiguity that is not already present
due to autoboxing (for List<Integer>). New code using or implementing
List<int> will need some way to disambiguate. But I think that some
syntax will be needed to allow anyway. It might be nice to introduce
method removeAt to reduce the need to use this syntax, but it doesn't
seem necessary?

About the two Collection toArray() methods:

The no-arg version must return Object[]. I don't see how anyfying (in
any way) can guarantee compatible results.
The <T> T[] toArray(T[] array) version has worse problems: most current
implementations use reflection if the argument array is not big enough
(because there is no syntax for "new T[n]"). I don't see offhand how
to compatibly mangle reflective code. Plus, the spec explicitly says
that if the array is too large, a null is appended to elements. Null
is of course not a legal value for non-ref types.

I don't see a good alternative to leaving both forms of toArray as-is,
and to box results -- requiring that even custom non-ref
implementations do so. But this suggests that we should find some
other way (possibly in a utility class) to create a val-type array of
elements in a val-type collection.

-Doug

From brian.goetz at oracle.com Wed Dec 23 15:52:25 2015
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 23 Dec 2015 10:52:25 -0500
Subject: Migrating methods in Collections
In-Reply-To: <567AAB65.6070508@cs.oswego.edu>
References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com>
 <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu>
Message-ID: <567AC339.3000805@oracle.com>

Some good thoughts, and some wishful thinking. tl;dr Summary:

 - I think it's a stretch to say that equals() and contains() can be
   anyfied while still accepting Object. I think there are linguistic
   solutions so that existing Object-accepting code can continue to run
   unchanged for reference instantiations, and that the signatures can
   be generally rescued, but I think that's something different than
   "methods accepting Objects can be anyfied."

 - toArray() is indeed a problem. I believe that the same tools for
   rescuing equals() can also probably be applied towards toArray().

> If methods accepting Object arguments can be anyfied, this removes
> some methods from your problem table: Collection.{contains(Object),
> remove(Object)}, List.{indexOf(Object), lastIndexOf(Object)} and
> Map.{containsKey(Object), containsValue(Object), remove(Object)}.
>
> I realize that there are still a bunch of unresolved issues
> in pulling this off. But ignoring them for now...

I agree the same solution should work for all these methods. But I
don't think we'll get to the point where the signature of equals() or
contains() simply accepts Object. Several major concerns:

 - Boxing. If these methods accept Object, there is going to be some
   degree of boxing that we can't eliminate. Whether this is "some" or
   "a lot", I can't imagine getting it down to the point where we're
   comfortable.

 - Intrusion. Do we really want to ask authors to deal with Object in
   Complex.equals()? I would think these methods would want to start
   with a V and go from there, not have to reason about "if it's
   anything other than a boxed V, forget about it, otherwise cast and
   unbox." This is not logic we want the user to have to write for
   each of these methods.

What we want, I think, is for the signature of those methods to be:

 - x(Object) // for reference instantiations
 - x(T)      // for value instantiations

That Object is the erasure of T is a powerful connection we can hang
our hat on here. I think there are at least three linguistic
approaches to rescuing these methods:

 - contravariant type args (<U super T>)
 - some sort of peeling that treats x(T) and x(Object) as separate
   methods, but usually defaults/bridges one of them, so you just have
   to implement the appropriate one
 - some way of expressing a signature that means "T when a value, or
   Object when a reference"

All of these have cons, but we've got a long enough list to suggest
that there is *a* solution here, and maybe there's a better one if we
pull on that string some more.

So let's assume there's *some* way to write equals/contains/etc so the
right things happen. Your list above stands, except that there's still
some degree of migration.

> One natural follow-on question is that if we can anyfy contains, why
> can't we do so for containsAll(Collection)? And similarly for
> removeAll(Collection), retainAll(Collection).
> In other words,
> is this or some variant allowed?
>     boolean containsAll(Collection)

Good thought! Gavin and I bashed our heads against this one for a
while about a year ago.

First, note that we only have three such methods:
remove/retain/containsAll. And we can "retire" two of them as being
inferior to removeIf. Which means there's just one method here to
rescue.

If we have <U super T> vars, I think we can do the same trick. But the
other tricks don't work as well, because of a (sensible but
frustrating) limitation of old generics interop -- if you have a method
with generics:

    <T> void foo(T t)
    <T> void moo(Foo<T> f)

you can do a "raw override"

    void foo(Object t) // acceptable raw override
    void moo(Foo f)    // acceptable raw override

and that's fine, but you can't do the same with a wildcard:

    void moo(Foo<?> f) // not OK

So this wouldn't be source-compatible for existing subclasses of
Collection.

However, it's possible that the third variant in our candidate list
above -- which amounts to some way of writing the dependent type "if T
is erased, then the bound of T, otherwise T" -- might be able to get us
here. Or not. If this is the worst of our problems, we have already
won.

> If so, the main remaining questions surround optionality of results,
> that I'll answer separately.

Right, there's a real space of API design here.

> Doing nothing about List.remove(index) seems to be legal option.

Yes, that's a legal option (just as today, you can overload foo(T) and
foo(String)). Not sure if it *should* be a legal option (at the very
least, the compiler should warn you of this, as it should also probably
with overloads that fail to follow a meet rule.)

> No
> existing code will encounter an ambiguity that is not already present
> due to autoboxing (for List). New code using or implementing
> List will need some way to disambiguate. But I think that some
> syntax will be needed to allow anyway. It might be nice introduce
> method removeAt to reduce need to use this syntax, but doesn't seem
> necessary?

Can you expand on what you might want for disambiguation here?

> About the two Collection toArray() methods:
>
> The no-arg version must return Object[]. I don't see how anyfying (in
> any way) can guarantee compatible results. The T[] toArray(T[]
> array) version has worse problems: most current implementations use
> reflection if the argument array is not big enough (because there is
> no syntax for "new T[n]"). I don't see offhand how to compatibly
> mangle reflective code. Plus, the spec explicitly says that if the
> array is too large, a null is appended to elements. Null is of course
> not a legal value for non-ref types.

I think "null" can be compatibly replaced with "the default value for
the type", which is the same as "null" for all existing code. So
that's not a blocker.

Reflection is harder, but it's quite possible that this will come out
in the "specialization wash". If we can have an anyfied version of
Arrays.copyOf -- which seems doable -- then I think that problem goes
away too.

That said, maybe the second version of toArray() should be abandoned in
the ref layer for compatibility only, and we should add the new total
method

    T[] toArray(IntFunction<T[]> generator)

as we did with streams. (I think we should introduce this method
regardless, actually, for all the reasons that came up when we were
discussing it for streams. This is not a method we could have
(credibly) had in 1.2, but with lambdas in the language, it's kind of a
no-brainer.)

> I don't see a good alternative to leaving both forms of toArray as-is,
> and to box results -- requiring that even custom non-ref
> implementations do so. But this suggests that we should find some
> other way (possibly in a utility class) to create a val-type array of
> elements in a val-type collection.

Speaking only about *signatures* now, I think the same techniques that
allow us to rescue contains(Object) may do the same for toArray().

 - <U super T> U[] toArray() could work;
 - peeling into separate Object[] toArray() for ref / T[] toArray() for
   val could work;
 - expressing the dependent type (T.erased ? T.bound : T) would also
   work.

From dl at cs.oswego.edu Wed Dec 23 16:13:26 2015
From: dl at cs.oswego.edu (Doug Lea)
Date: Wed, 23 Dec 2015 11:13:26 -0500
Subject: Migrating methods in Collections
In-Reply-To: <567AC339.3000805@oracle.com>
References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com>
 <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu>
 <567AC339.3000805@oracle.com>
Message-ID: <567AC826.9060206@cs.oswego.edu>

Isolating one small issue for now:

On 12/23/2015 10:52 AM, Brian Goetz wrote:
>> Doing nothing about List.remove(index) seems to be legal option.
>
> Yes, that's a legal option (just as today, you can overload foo(T) and
> foo(String)). Not sure if it *should* be a legal option (at the very
> least, the compiler should warn you of this, as it should also
> probably with overloads that fail to follow a meet rule.)
>
>> No
>> existing code will encounter an ambiguity that is not already present
>> due to autoboxing (for List). New code using or implementing
>> List will need some way to disambiguate. But I think that some
>> syntax will be needed to allow anyway. It might be nice introduce
>> method removeAt to reduce need to use this syntax, but doesn't seem
>> necessary?
>
> Can you expand on what you might want for disambiguation here?
>

Not sure; possibly nothing, considering that users already (since jdk5)
live with this compiling without warning:

    import java.util.*;
    public class RemoveInteger {
        public static void main(String[] args) {
            List<Integer> c = new ArrayList<Integer>();
            c.add(1);
            c.remove(1); // resolves to remove(int index), not remove(Object)
            System.out.println(c);
        }
    }

Running ...

    Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
        at java.util.ArrayList.rangeCheck(ArrayList.java:659)
        at java.util.ArrayList.remove(ArrayList.java:495)
        at RemoveInteger.main(RemoveInteger.java:8)

From peter.levart at gmail.com Wed Dec 23 16:30:55 2015
From: peter.levart at gmail.com (Peter Levart)
Date: Wed, 23 Dec 2015 17:30:55 +0100
Subject: Equality
In-Reply-To: <5675880D.4060503@oracle.com>
References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu>
 <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu>
 <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu>
 <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com>
 <56757F8D.2030304@cs.oswego.edu> <5675880D.4060503@oracle.com>
Message-ID: <567ACC3F.1030303@gmail.com>

Hi,

Sometimes I miss a feature in generic type declarations where one could
refer to a type variable of a super type without repeating it in the
declaration of the generic type. For example, currently one has to
write:

    interface Snapshooter<E, C extends Collection<E>> {
        C snapshot();
        boolean addElement(E e);
    }

    Snapshooter<String, List<String>> strings = ...;
    List<String> snapshot1 = strings.snapshot();
    strings.addElement("abc");
    List<String> snapshot2 = strings.snapshot();

What if E could be implicitly declared like:

    interface Snapshooter<C extends Collection<E>> {
        C snapshot();
        boolean addElement(E e);
    }

    Snapshooter<List<String>> strings = ...;

Which would be similar to declaring:

    interface Snapshooter<C extends Collection<?>> { ... }

...with the added benefit that one could refer to E from Collection<E>.

I don't know if this could be soundly incorporated into the language
type system, but if it could be, and if interfaces could also be
implemented by value types, then...

On 12/19/2015 05:38 PM, Brian Goetz wrote:
>> Anyfy equals, and adjust default implementation of boolean
>> T.equals(any x)
>
> I think we may be talking past each other (while basically saying the
> same thing from opposite directions.)
>
> "any" is not a type; it is a modifier that affects the domain of type
> variables.
> So we don't have (and I think we don't want) a meaning for
> equals(any x). But what we do want, is a way of expressing "I'll take
> anything which could be on the other side of the == operator with me."
> For refs, that's Object; for a value type V, that's just V.
>
> Where we had gotten with the <U super T> / superation idiom (not
> suggesting either of these syntaxes is great) is being able to express:
>
>  - If T is value, then T, else the erasure of T (usually Object) **
>
> I'll write this as Sup<T> for short. The convenient thing about
> Sup<T> is that it collapses to Object in the places where
> we want Object, so we could define contains/remove as
>
>     contains(Sup<T>)
>
> and contains will always bottom out at equals(), so equals() similarly
> needs to be
>
>     equals(Sup<T>)
>
> If this is a valid approach (and I think it's the best one we've got so
> far), then we're looking for how to spell Sup<T> (in all of: type
> system, language syntax, bytecode descriptors.)

....there could be a special interface:

    public interface Any> {
        boolean equals(T x);
    }

...implemented by Object:

    public class Object implements Any {...}

...and implicitly implemented by every value type:

    public value ValueType [ implements Any ] {...}

Sup from Collection could then be expressed as:

    public interface Collection> extends Iterable {
        boolean contains(T x);
        ...

Regards, Peter

>
>> I still think that doing something like this removes the need
>> to specially deal with Collection.contains and related methods.
>
> I don't see it yet; those signatures are still currently
> contains(Object), which isn't appropriate for value types. So we have
> to do *something*.
>
>
> ** There's a lot of sloppiness in the ref/val distinction, which is
> going to need to be cleaned up. Sometimes when we say ref/val, we
> mean "erased/reified". Sometimes we mean "polymorphic/monomorphic".
> Sometimes we mean "nullable/non-nullable."
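
[Editorial illustration, not part of the original thread.] The point that contains(Object) is what lets reference collections stay variant can be checked against today's Java. The sketch below is only an illustration under assumptions: the class names ContainsVariance, Animal, and Dog are hypothetical, chosen to mirror the dogs.contains(animal) case from the quoted memo. It shows that a List<Dog> can be asked about an Animal and that dogs::contains converts to a Predicate<Animal>, both of which would stop compiling if the signature were simply contains(E).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class ContainsVariance {
    // Hypothetical element types, named after the memo's dogs/animal example.
    static class Animal {}
    static class Dog extends Animal {}

    // Returns { dogs.contains(stray), dogs.contains(rex), predicate.test(rex) }.
    static boolean[] demo() {
        Dog rex = new Dog();
        Animal stray = new Animal();

        List<Dog> dogs = new ArrayList<>();
        dogs.add(rex);

        // Compiles only because the parameter type is Object, not E (= Dog).
        boolean strayFound = dogs.contains(stray);
        boolean rexFound = dogs.contains(rex);

        // The method reference also relies on contains(Object):
        // an Animal argument is acceptable where Object is expected.
        Predicate<Animal> isKnownDog = dogs::contains;
        boolean rexViaPredicate = isKnownDog.test(rex);

        return new boolean[] { strayFound, rexFound, rexViaPredicate };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

Any replacement signature (superation, Sup<T>, or otherwise) has to keep both of these call shapes working for reference instantiations.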

From brian.goetz at oracle.com Wed Dec 23 17:26:05 2015
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 23 Dec 2015 12:26:05 -0500
Subject: Equality
In-Reply-To: <567ACC3F.1030303@gmail.com>
References: <56719E1F.2050504@oracle.com> <56742B2F.3000809@cs.oswego.edu>
 <56743A76.6000007@oracle.com> <56744544.1050603@cs.oswego.edu>
 <567446F3.7000904@oracle.com> <56746AE2.9070403@cs.oswego.edu>
 <9FA0A474-0CE4-4F13-BE46-5C4515D2CD94@oracle.com>
 <56757F8D.2030304@cs.oswego.edu> <5675880D.4060503@oracle.com>
 <567ACC3F.1030303@gmail.com>
Message-ID: <567AD92D.9070302@oracle.com>

> ....there could be a special interface:
>
> public interface Any> {
>     boolean equals(T x);
> }

This is exactly the current thinking -- though we don't call it Any, we
call it Objectible.

> ...implemented by Object:
>
> public class Object implements Any {...}

We were thinking that this would be Object implements *raw* Objectible,
but yes.

> ...and implicitly implemented by every value type:

Yes.

> Sup from Collection could then be expressed as:
>
> public interface Collection> extends
> Iterable {
>     boolean contains(T x);

Clever. This is yet another way to express the dependent type "ref T ?
Object : T" alluded to in the last mail. Ultimately, I do think that
the API problem (and several others) boils down to finding a non-scary
way to refer to this type (and anything with the word "dependent" in it
fails the scary test.)
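
[Editorial illustration, not part of the original thread.] To make the "equals(T x)" idea concrete, here is a rough approximation in today's Java. It is a sketch under assumptions, not the Any/Objectible design itself: the interface TypedEquals, its method sameAs, and the class Complex are hypothetical names invented for this example. It contrasts the typed comparison an author would like to write with the "check, cast, then compare" boilerplate that equals(Object) requires.

```java
import java.util.Objects;

public class TypedEqualitySketch {
    /** Hypothetical stand-in for the "Any"/"Objectible" idea; not a JDK type. */
    interface TypedEquals<T> {
        boolean sameAs(T other);
    }

    /** A value-like class; the name echoes the Complex.equals() remark in the thread. */
    static final class Complex implements TypedEquals<Complex> {
        final double re;
        final double im;

        Complex(double re, double im) {
            this.re = re;
            this.im = im;
        }

        // The comparison the author actually wants to write: Complex against Complex.
        @Override
        public boolean sameAs(Complex other) {
            return Double.compare(re, other.re) == 0
                && Double.compare(im, other.im) == 0;
        }

        // What equals(Object) forces today: reject anything that is not a
        // Complex, otherwise cast and delegate to the typed comparison.
        @Override
        public boolean equals(Object o) {
            return o instanceof Complex && sameAs((Complex) o);
        }

        @Override
        public int hashCode() {
            return Objects.hash(re, im);
        }
    }

    public static void main(String[] args) {
        Complex a = new Complex(1.0, 2.0);
        Complex b = new Complex(1.0, 2.0);
        System.out.println(a.sameAs(b));
        System.out.println(a.equals("1+2i"));
    }
}
```

In the thread's terms, sameAs plays the role of the value-instantiation signature x(T), and equals(Object) is the reference-instantiation signature x(Object); the open question is how to spell one method that is both.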
From brian.goetz at oracle.com Wed Dec 23 18:52:46 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 23 Dec 2015 13:52:46 -0500 Subject: Migrating methods in Collections In-Reply-To: <567AC339.3000805@oracle.com> References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com> <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu> <567AC339.3000805@oracle.com> Message-ID: <567AED7E.5080903@oracle.com> > What we want, I think, is for the signature of those methods to be: > - x(Object) // for reference instantiations > - x(T) // for value instantiations > > That Object is the erasure of T is a powerful connection we can hang > our hat on here. I think there are at least three linguistic > approaches to rescuing these methods: > > - contravariant type args () > - some sort of peeling that treats x(T) and x(Object) as separate > methods, but usually defaults/bridges one of them, so you just have to > implement the appropriate one > - some way of expressing a signature that means "T when a value, or > Object when a reference" > > All of these have cons, but we've got a long enough list to suggest > that there is *a* solution here, and maybe there's a better one if we > pull on that string some more. > Let me summarize the options here in slightly more detail. Super-ivars. If we could express contravariant method tvars, then we could write: interface Objectible { boolean equals(U u); } where class Object implements Objectible { boolean equals(U u); } Now, there is a common equals() method which has the signature equals(Object) for reference types and equals(V) for value types. The same applies to Collection.contains and friends. Cons: - contravariance is poison for type systems - performance impact of specialized generic methods on critical methods Collapsing method pairs.
Here, Objectible has two methods boolean equals(Object o) boolean equals(V other) We can give the former a default for value types (an unboxing bridge) and for reference types, the two collapse to the same signature, and we tell the compiler/VM that this is a feature and that these are actually the same method. This likely will want some way of declaring that these methods are related and intended to be collapsed. For methods like contains(), we have to do the same -- declare a collapsible method pair (with an appropriate ref-default bridge). Users may have to implement both in some cases. Partial refinement. Here, we have a total method boolean contains(V v) and we "refine" it for erased instantiations: boolean contains(Object o) We teach the compiler that it is allowable (perhaps with some additional declaration) to widen the signature of a total method to its erasure in erased instantiations. Honesty. Here, we find a non-scary way of denoting the dependent type (erased T ? erasure(T) : T), which I believe is *actually* the type we want (and most of the other approaches are an attempt to avoid naming that-which-must-not-be-named). The complexity of this comes from the fact that ref instantiations are variant (and we are using Object as a stand-in for , both for historical compatibility purposes and because type inference was so weak in Java 5) and value instantiations are invariant. Something has to bridge the gap -- either providing the right variance tools (super-ivars), pretending to split the methods (collapsing pairs/partial refinement), or recognizing a weird-but-honest "maybe erased, maybe not" type. As a strawman purely for expository purposes, suppose we call this type T.eq (to evoke "the type of things that T could be compared to for equality" -- finding a non-sucky syntax is obviously needed). Then we have: boolean equals(T.eq other) boolean contains(T.eq other) T.eq[] toArray() etc.
This is the cleanest solution from a type system perspective: it captures the actual type we want, and it works well for toArray as well as the others (and when we get to the discussion about "how do I code the equals() method", it will show up again, and when we get to the part about representing a generic class in bytecode, it will show up yet again.) Peter's suggestion is another path for getting to this approach. It is safe to assign a T to a T.eq; it is an unchecked conversion to go from T.eq to T. We've scribbled with ways to denote this type: T.something, ~T, |T|, Something, Something[T], ? something T, etc. None seem that great (though the first form seems the most promising, if an appropriately evocative "something" could be identified.) But if we can see our way clear to admitting that this is actually the type we're looking for, many other problems (including several we've not discussed yet) simply fold away.

*Approach:* Super-ivars
*Pros:* Straightforward semantics. Will be viewed as plugging a hole in existing language.
*Cons:* Much bigger hammer than needed. Significant potential complexity (including potential undecidability.) Potential performance challenges.

*Approach:* Method pairs
*Pros:* Reasonable partial-method modeling of the problem.
*Cons:* Users may have to implement both methods. Likely will need a linguistic mechanism to indicate that two methods form a pair.

*Approach:* Partial refinement
*Pros:* Reasonable partial-method modeling of the problem.
*Cons:* Users may have to implement both methods. Likely will need a linguistic mechanism to indicate that two methods form a pair. Complicates semantics of partial methods.

*Approach:* Naming it
*Pros:* Honest modeling of the problem. There is only one method, so users only have to implement one method, and clients only have to reason about one method. No need to express relationships between two methods. The need for something like this also shows up in method implementations.
*Cons:* It's new and weird, and will take users some time to get used to (but possibly mitigated if we had a better name.)

Of these, I think the method-pairs approach dominates the partial refinement approach, and I'd like to avoid the super-ivars approach (because it trades one problem for several others.) I think the "honest" approach has the best overall characteristics (by a wide margin) -- *if* we can identify a syntactic form that doesn't make developers run away screaming, which is a big if.
From brian.goetz at oracle.com Wed Dec 23 21:47:22 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 23 Dec 2015 16:47:22 -0500 Subject: toArray (was: Migrating methods in Collections) In-Reply-To: <567AED7E.5080903@oracle.com> References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com> <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu> <567AC339.3000805@oracle.com> <567AED7E.5080903@oracle.com> Message-ID: <567B166A.5030408@oracle.com> Doug pointed out something to me today that hadn't occurred to me -- that callers of Collection.toArray() might actually *want* an Object[], not an int[]. Does anyone have opinions on usages that would inform this one way or the other? Now that we have streams, we can easily get both:

Collection c = ...
c.stream().toArray() // int[]
c.stream().boxed().toArray() // Object[]
c.stream().boxed().toArray(Integer[]::new) // Integer[]

So it would even be possible (though a little weird) to completely punt on migrating toArray() and tell users to go through streams if they want an array.
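The three array flavours can be tried out in Java 8 as it stands, with an IntStream standing in for the stream of a collection of int (which does not exist yet). This is only a sketch; ToArrayDemo and its method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical demo class; IntStream stands in for the stream of an int-valued collection.
final class ToArrayDemo {
    // Primitive array, no boxing involved.
    static int[] primitive() {
        return IntStream.of(1, 2, 3).toArray();
    }

    // Boxed elements in an Object[].
    static Object[] boxedObjects() {
        return IntStream.of(1, 2, 3).boxed().toArray();
    }

    // Boxed elements in an array whose type is chosen by the generator.
    static Integer[] boxedIntegers() {
        return IntStream.of(1, 2, 3).boxed().toArray(Integer[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(primitive()));
        System.out.println(boxedObjects().getClass().getSimpleName());
        System.out.println(boxedIntegers().getClass().getSimpleName());
    }
}
```

The caller picks the representation at the call site, which is why punting on toArray() is at least plausible.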
From dl at cs.oswego.edu Thu Dec 24 13:24:44 2015 From: dl at cs.oswego.edu (Doug Lea) Date: Thu, 24 Dec 2015 08:24:44 -0500 Subject: Migrating methods in Collections In-Reply-To: <567AED7E.5080903@oracle.com> References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com> <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu> <567AC339.3000805@oracle.com> <567AED7E.5080903@oracle.com> Message-ID: <567BF21C.7020204@cs.oswego.edu> You might look at discussions so far as conceptually adding a super-interface to Collection, but jamming all the methods into Collection itself. It might be easier to pretend otherwise for a while. If we had some really good names for these, we might even prefer this way. Here's a sketch of Collection (Not yet for List and Map) interface AnyCollection { boolean isEmpty(); boolean contains(E e); // not Object boolean add(E e); boolean remove(E e); // not Object boolean removeIf(Predicate filter); void clear(); Iterator iterator(); Stream stream(); Stream parallelStream(); boolean addAll(AnyCollection c); boolean containsAll(AnyCollection c); // not Collection boolean removeAll(AnyCollection c); // not Collection boolean retainAll(AnyCollection c); // not Collection // to address other issues: long elementCount(); // not size() AnyCollection adding(E e); AnyCollection removing(E e); } interface Collection extends AnyCollection { int size(); boolean contains(Object o); // contravariant arg boolean remove(Object o); // contravariant arg Object[] toArray(); T[] toArray(T[] a); boolean equals(Object o); // declare for sake of Collection spec int hashCode(); // declare for sake of Collection spec boolean containsAll(Collection c); boolean removeAll(Collection c); boolean retainAll(Collection c); } From brian.goetz at oracle.com Thu Dec 24 16:01:22 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 24 Dec 2015 11:01:22 -0500 Subject: Migrating methods in Collections In-Reply-To: <567BF21C.7020204@cs.oswego.edu> References:
<56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com> <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu> <567AC339.3000805@oracle.com> <567AED7E.5080903@oracle.com> <567BF21C.7020204@cs.oswego.edu> Message-ID: <567C16D2.2040400@oracle.com> Cowardly of you to not put some form of toArray() in AnyCollection :) I still think the logical signatures here would be: E[] toArray() toArray(IntFunction generator) These conflict with their counterparts in Collection, but in exactly the same way as contains/remove do. Separately, one might wonder (and you should wonder) "What does ? extends/super T mean when T is an avar?" The answer is: another dependent type, namely: ? extends T = if (ref T) then ? extends T else T This may look circular, but is not; we already have a definition of "? extends T" for a reference tvar. So when Foo is instantiated with a value type, then the variance on T folds away. The point is: get used to this reasoning of "If T is a ref, then this, otherwise that." It's everywhere. (We can choose to be honest about it, or we can try and paper over it.)
From dl at cs.oswego.edu Thu Dec 24 16:17:11 2015 From: dl at cs.oswego.edu (Doug Lea) Date: Thu, 24 Dec 2015 11:17:11 -0500 Subject: Migrating methods in Collections In-Reply-To: <567C16D2.2040400@oracle.com> References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com> <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu> <567AC339.3000805@oracle.com> <567AED7E.5080903@oracle.com> <567BF21C.7020204@cs.oswego.edu> <567C16D2.2040400@oracle.com> Message-ID: <567C1A87.3010502@cs.oswego.edu> On 12/24/2015 11:01 AM, Brian Goetz wrote: > Cowardly of you to not put some form of toArray() in AnyCollection :) > My intent was to keep at AnyX level only those things that would never get you into null, boxing, or erasure trouble. (Which led to some jdk8 deja vu of jdk5 deja vu!)
I don't know of best replacement for toArray or even whether it should be mandated. But I tried pass two also with List and Map sketches. Not meant as a proposal, just as a way to help focus on current and upcoming issues. Including for example Pair vs Map.Entry. interface AnyCollection { boolean isEmpty(); boolean contains(E e); // not Object boolean add(E e); boolean remove(E e); // not Object boolean removeIf(Predicate filter); void clear(); Spliterator spliterator(); // iterator() omitted Stream stream(); Stream parallelStream(); boolean addAll(AnyCollection c); boolean containsAll(AnyCollection c); // not Collection boolean removeAll(AnyCollection c); // not Collection boolean retainAll(AnyCollection c); // not Collection // to address other issues: long elementCount(); // not size() AnyCollection adding(E e); AnyCollection removing(E e); } interface Collection extends AnyCollection { int size(); boolean contains(Object o); // contravariant arg boolean remove(Object o); // contravariant arg Iterator iterator(); Object[] toArray(); T[] toArray(T[] a); boolean equals(Object o); // declare for sake of Collection spec int hashCode(); // declare for sake of Collection spec boolean containsAll(AnyCollection c); boolean removeAll(AnyCollection c); boolean retainAll(AnyCollection c); } interface AnyList extends AnyCollection { void replaceAll(UnaryOperator operator); void sort(Comparator c); E at(long index); E setAt(long index, E element); void addAt(long index, E element); boolean addAllAt(long index, AnyCollection c); E removeAt(long index); long findFirst(E e); long findLast(E e); // subList? 
} interface List extends AnyList { E get(int index); E set(int index, E element); void add(int index, E element); boolean addAll(int index, AnyCollection c); E remove(int index); int indexOf(Object o); int lastIndexOf(Object o); ListIterator listIterator(); ListIterator listIterator(long index); List subList(int fromIndex, int toIndex); } interface AnyMap { // too bad not: extends AnyCollection> boolean isEmpty(); long mappingCount(); // as above; not size() boolean containsKey(K key); boolean containsValue(V value); Optional at(K key); V at(K key, V defaultValue); boolean putAt(K key, V value); V installAt(K key, V value); // putIfAbsent, but return current val boolean replaceAt(K key, V value); boolean replaceAt(K key, V oldValue, V newValue); boolean removeAt(K key); boolean removeAt(K key, V value); void clear(); Spliterator keySpliterator(); Stream keyStream(); Spliterator valueSpliterator(); Stream valueStream(); Spliterator> spliterator(); // Pair? Stream> stream(); void putAll(AnyMap m); void forEach(BiConsumer action); void replaceAll(BiFunction fun); // Might need renaming because of null return as sentinel V computeIfAbsent(K key, Function fun); V computeIfPresent(K key, BiFunction fun); V compute(K key, BiFunction fun); V merge(K key, V value, BiFunction fun); } interface Map extends AnyMap { int size(); boolean containsKey(Object key); boolean containsValue(Object value); V get(Object key); V getOrDefault(Object key, V defaultValue); V put(K key, V value); V putIfAbsent(K key, V value); V remove(Object key); boolean remove(Object key, Object value); V replace(K key, V value); Set keySet(); Collection values(); Set> entrySet(); boolean equals(Object o); int hashCode(); } From brian.goetz at oracle.com Thu Dec 24 17:43:03 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 24 Dec 2015 12:43:03 -0500 Subject: Migrating methods in Collections In-Reply-To: <567C1A87.3010502@cs.oswego.edu> References: <56719E1F.2050504@oracle.com> 
<5671BEC8.8010508@oracle.com> <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu> <567AC339.3000805@oracle.com> <567AED7E.5080903@oracle.com> <567BF21C.7020204@cs.oswego.edu> <567C16D2.2040400@oracle.com> <567C1A87.3010502@cs.oswego.edu> Message-ID: <567C2EA7.1020901@oracle.com> Since we're redesigning... > boolean removeAll(AnyCollection c); // not > Collection > boolean retainAll(AnyCollection c); // not > Collection Arguably better handled by remove(Predicate) > long findFirst(E e); > long findLast(E e); Arguably better as findXxx(Predicate) From brian.goetz at oracle.com Thu Dec 24 18:14:22 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 24 Dec 2015 13:14:22 -0500 Subject: Migrating methods in Collections In-Reply-To: <567C1A87.3010502@cs.oswego.edu> References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com> <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu> <567AC339.3000805@oracle.com> <567AED7E.5080903@oracle.com> <567BF21C.7020204@cs.oswego.edu> <567C16D2.2040400@oracle.com> <567C1A87.3010502@cs.oswego.edu> Message-ID: <567C35FE.7000908@oracle.com> Teasing apart the differences: Essential differences: - Arguments to contains/remove (Object vs E). This comes down to variance; for value-instantiated generics, V is the only logical choice, whereas for reference instantiated generics (erased or not!) we have to contend with ? extends/super T as well. And because we don't have inference variables, sometimes we use Object instead of . - Type arguments to xxxAll (Collection vs Collection.) Same basic problem, but the treatment is different, because of irregularities in generics. - Interaction with arrays. Fundamental mismatch; (reference) arrays are covariant, generics are invariant. Made more fuzzy by the lack of inference variables. That value arrays are also invariant seems an opportunity to make things fit better. - Methods that use null to signal absence (Map.get(), Queue.poll()). 
Slightly related: Map.put() returns what was previously there, but for value maps, there's no obvious sentinel for "not there" other than the default value, which means that the return value is mostly useless. Opportunistic differences: - widening index size (bleeds into size()) - Possibly replace some methods with more flexible lambda-powered counterparts (removeAll, indexOf, toArray(generator)) Implementation possibilities: - The T.eq type (today I am calling it T.rasure, tomorrow I'll call it something else) cleanly addresses the contains/remove and toArray issues, as well as Object.equals. - The xxxAll problems are messy, because there is very little room for varying a generic type in a signature when overriding. However, I could imagine some wiggle room here for interacting with the T.rasure types. Alternately we migrate to new names. - The null-signaling methods can be migrated to new total methods. - Migrating value-consuming methods to Predicate-consuming methods is a pure API evolution choice; we can keep the spirit of the old sigs, or upgrade them as we see fit. - Widening index sizes will require some as-yet-undisclosed firepower, but it is essentially the same trick that would allow us to migrate reference-Optional to value-Optional.
From dl at cs.oswego.edu Thu Dec 24 18:36:15 2015 From: dl at cs.oswego.edu (Doug Lea) Date: Thu, 24 Dec 2015 13:36:15 -0500 Subject: Migrating methods in Collections In-Reply-To: <567C16D2.2040400@oracle.com> References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com> <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu> <567AC339.3000805@oracle.com> <567AED7E.5080903@oracle.com> <567BF21C.7020204@cs.oswego.edu> <567C16D2.2040400@oracle.com> Message-ID: <567C3B1F.2070307@cs.oswego.edu> On 12/24/2015 11:01 AM, Brian Goetz wrote: > I still think the logical signatures here would be: > E[] toArray() > toArray(IntFunction generator) My thought was: Let them eat streams! Which already supports: Object[] toArray(); A[] toArray(IntFunction generator); On the other hand, IntStream adds: int[] toArray(); And similarly for LongStream, DoubleStream, but nothing that accommodates arbitrary Value-Stream, so the issue might need to be addressed somewhere somehow. > >> boolean removeAll(AnyCollection c); // not >> Collection >> boolean retainAll(AnyCollection c); // not >> Collection > > Arguably better handled by remove(Predicate) Sure.
I agree there's no good rationale for these in Collection super-interface. (On the off chance that we actually split them...) > >> long findFirst(E e); >> long findLast(E e); > > Arguably better as findXxx(Predicate) I agree about adding predicate versions. But people will complain about requiring a lambda in plain version that otherwise ought to be simple enough to compile into optimal machine instruction loop for ArrayLists From dl at cs.oswego.edu Thu Dec 24 19:32:11 2015 From: dl at cs.oswego.edu (Doug Lea) Date: Thu, 24 Dec 2015 14:32:11 -0500 Subject: Migrating methods in Collections In-Reply-To: <567C35FE.7000908@oracle.com> References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com> <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu> <567AC339.3000805@oracle.com> <567AED7E.5080903@oracle.com> <567BF21C.7020204@cs.oswego.edu> <567C16D2.2040400@oracle.com> <567C1A87.3010502@cs.oswego.edu> <567C35FE.7000908@oracle.com> Message-ID: <567C483B.20702@cs.oswego.edu> On 12/24/2015 01:14 PM, Brian Goetz wrote: > Teasing apart the differences: ... > - Methods that use null to signal absence (Map.get(), Queue.poll()). First, notice the convention of using "At" in value-friendly method names. Not always pretty, but easier to remember. (And for the moment, just a challenge for anyone to come up with something better.) For Map, this convention helps with the issue that we'd prefer to use Optional vs boxing but must allow boxing for Map.get(): interface AnyMap { // ... Optional at(K key); V at(K key, V defaultValue); } interface Map extends AnyMap { // ... 
V get(Object key); V getOrDefault(Object key, V defaultValue); } For Queue, there's no good reason for having an Optional-returning method in addition to added default-returning version, so: interface AnyQueue extends AnyCollection { E remove(); E element(); E poll(E defaultValue); E peek(E defaultValue); } interface Queue extends AnyQueue { E poll(); E peek(); } Deque is messier but similar: interface AnyDeque extends AnyQueue { void addFirst(E e); void addLast(E e); boolean offerFirst(E e); boolean offerLast(E e); E removeFirst(); E removeLast(); E getFirst(); E getLast(); void push(E e); E pop(); E pollFirst(E defaultValue); E pollLast(E defaultValue); E peekFirst(E defaultValue); E peekLast(E defaultValue); boolean removeFirstOccurrence(E e); boolean removeLastOccurrence(E e); } interface Deque extends AnyDeque, Queue { E pollFirst(); E pollLast(); E peekFirst(); E peekLast(); boolean removeFirstOccurrence(Object o); boolean removeLastOccurrence(Object o); Iterator descendingIterator(); } From brian.goetz at oracle.com Thu Dec 24 19:34:45 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 24 Dec 2015 14:34:45 -0500 Subject: Migrating methods in Collections In-Reply-To: <567C3B1F.2070307@cs.oswego.edu> References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com> <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu> <567AC339.3000805@oracle.com> <567AED7E.5080903@oracle.com> <567BF21C.7020204@cs.oswego.edu> <567C16D2.2040400@oracle.com> <567C3B1F.2070307@cs.oswego.edu> Message-ID: <567C48D5.2000302@oracle.com> >> >>> long findFirst(E e); >>> long findLast(E e); >> >> Arguably better as findXxx(Predicate) > > I agree about adding predicate versions. 
> But people will complain about requiring a lambda in plain version > that otherwise ought to be simple enough to compile into > optimal machine instruction loop for ArrayLists > The usual (bad) argument against overloading findXxx(E) and findXxx(Predicate) is that null is accepted by both signatures. But I have no problem with both, and both have reasonable defaults so can be freely added. From brian.goetz at oracle.com Thu Dec 24 19:39:40 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 24 Dec 2015 14:39:40 -0500 Subject: Migrating methods in Collections In-Reply-To: <567C483B.20702@cs.oswego.edu> References: <56719E1F.2050504@oracle.com> <5671BEC8.8010508@oracle.com> <5679AC10.50109@oracle.com> <567AAB65.6070508@cs.oswego.edu> <567AC339.3000805@oracle.com> <567AED7E.5080903@oracle.com> <567BF21C.7020204@cs.oswego.edu> <567C16D2.2040400@oracle.com> <567C1A87.3010502@cs.oswego.edu> <567C35FE.7000908@oracle.com> <567C483B.20702@cs.oswego.edu> Message-ID: <567C49FC.4040004@oracle.com> > interface AnyMap { // ... > Optional at(K key); > V at(K key, V defaultValue); > } An alternative to at(key, defaultValue) is tryAt(K, Consumer) -- though the name tryAt is awful. (I was considering map() and tryMap() as the new names for get().) Just to put it on the record, Optional-bearing at() has a problem when the map can have null values -- but given that there are alternatives (default version, consumer version, legacy get()) in the cases where this might happen, I don't see this as a showstopper. > For Queue, there's no good reason for having an Optional-returning > method in addition to added default-returning version, so: Why not? Optional offers a lot of flexibility, since you have many options for what to do with the possibly-nonexistent value (orElse, orElseThrow, map, flatMap, ifPresent.) Seems a natural building block, and a sharper tool than poll(defaultValue). 
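The null-value caveat for an Optional-bearing at() can be reproduced with the existing Map API. The following is a sketch only: AtDemo and its two static at helpers are hypothetical stand-ins for the sketched AnyMap methods, written against java.util.Map as it exists today.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helpers approximating the sketched at(K) and at(K, V) on top of java.util.Map.
final class AtDemo {
    // Optional-bearing lookup: a missing key and a key mapped to null both yield empty.
    static <K, V> Optional<V> at(Map<K, V> map, K key) {
        return Optional.ofNullable(map.get(key));
    }

    // Default-bearing lookup: distinguishes "absent" from "present but null".
    static <K, V> V at(Map<K, V> map, K key, V defaultValue) {
        return map.containsKey(key) ? map.get(key) : defaultValue;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("one", 1);
        m.put("none", null); // HashMap permits null values

        System.out.println(at(m, "one"));         // Optional[1]
        System.out.println(at(m, "none"));        // Optional.empty
        System.out.println(at(m, "missing"));     // Optional.empty
        System.out.println(at(m, "missing", -1)); // -1
        System.out.println(at(m, "none", -1));    // null
    }
}
```

The Optional form cannot tell the "none" and "missing" cases apart, while the default-bearing form can, which matches the point that the alternatives cover the null-valued case.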
From brian.goetz at oracle.com Fri Dec 25 00:13:56 2015 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 24 Dec 2015 19:13:56 -0500 Subject: Type-dependent operations Message-ID: <567C8A44.5080203@oracle.com> The previous discussion topic (which is still not remotely finished, you're not off the hook yet!) centered on migrating generic APIs. A related topic is that of migrating the /implementations/ of these APIs. This exploration has been informed by the prototype of Collections and Streams. In the lucky case, no changes are needed to the bodies of methods after anyfying the API -- just adding "any" is all you need. However, not all cases are so lucky. Here's a list of places where simply recompiling existing ref-generic as any-generic could run into problems. *Nullity. *References are nullable, values are not. (We'll have a separate discussion for "nullable values", which may be desirable for migration compatibility.) *Variance. *References are polymorphic; values are not. *Identity. *References have identity; values do not. For example, this means that values cannot be used as the lock object for a synchronized block. *Object methods. *Value types will almost certainly have some form of equals, hashCode, toString, and getClass; they will almost certainly not have wait, notify, or notifyAll methods, and probably not clone either. *Relationship with Object. *All reference types are subtypes of Object; value types are not. Similarly, arrays of reference types are subtypes of Object[]; arrays of value types are not. **Array creation. **Currently, the idiom for creating a T[] array is to create an Object[] array, and statically cast it to T[]. This won't work with value arrays. * Instanceof, casting, and type literals. *Instanceof doesn't permit a parameterized type on the RHS. However, this is not specific enough for specialized types; the runtime class of List is different from List. 
Casting does permit a parameterized type, but currently this is interpreted as a static cast. Similarly, type literals as currently formulated are not specific enough either. *Wildcards. *The existing wildcard Foo means (and must continue to mean) Foo. When dealing with quantities that might either be a reference or a value, such as an expression of type 'T' where T is an avar, the compiler must be conservative and only allow operations that can be proven safe for either references or values. So it would have to reject, for example, assigning a null to a T, since we don't know that null is a member of the domain for all T. Obviously, we are not going to add value types and any-tvars to the language, and then not adjust the other places where type variables meet other language features. So clearly some of these issues will be addressed by extending the semantics of existing language features. But, we don't necessarily *have* to change anything in order for things to "work". A method in an any-generic class could be "peeled" into a ref version and a value version: class Foo { public void moo(T t) { // existing method body } public void moo(T t) { // alternate method body, that steers clear of restrictions } } But, asking users to write their code twice would be rude, so we want to keep this sort of peeling to a bare minimum, and preserve it as an "escape hatch". If we're to minimize peeling, it stands to reason that either new linguistic forms need to be added, or existing forms be stretched to accommodate the broadened domain of genericity. Let's take these one at a time. *Nullity. *We can further break this down into assignment to null and comparison to null. Assignment to null is not going to fly. Our current prototype supports the expression T.default, which evaluates to the default value for whatever type T describes. For reference instantiations, this is null; for primitives, this is zero/false. Assignment to null can be replaced with assignment to T.default.
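T.default is prototype-only syntax, but the values it is described as producing can be observed in released Java through reflection. The sketch below is an approximation, not the prototype's mechanism: DefaultValueDemo and defaultOf are hypothetical names, and the trick relies on the fact that a freshly allocated array is filled with its component type's default value.

```java
import java.lang.reflect.Array;

// Hypothetical approximation of T.default using a Class token instead of a type variable.
final class DefaultValueDemo {
    // A new one-element array holds the default value of its component type:
    // null for references, zero/false for primitives (returned boxed by Array.get).
    static Object defaultOf(Class<?> type) {
        return Array.get(Array.newInstance(type, 1), 0);
    }

    public static void main(String[] args) {
        System.out.println(defaultOf(String.class));  // null
        System.out.println(defaultOf(int.class));     // 0
        System.out.println(defaultOf(boolean.class)); // false
    }
}
```

Unlike the proposed T.default, this needs a runtime Class object and boxes primitive results, so it only illustrates which value is meant.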
For comparison with null, there are some options. In the prototype, we currently have a peeled generic method Any.isNull(), which returns false for value invocations. However, even swapping out ==null for Any.isNull() is somewhat intrusive; we can define ==null such that it constant folds to false for value instantiations. Then existing source code is unchanged (and there's no runtime overhead for the null check in value instantiations, since it's been folded away.) When we look more closely at the possibility of nullable value types, this will have to be refined.

*Variance.* We already fold "? extends T" and "? super T" to T when T is known to be a value type. (More specifically, we treat wildcards bounded by avars as a dependent type: if erased T then ? extends T else T.)

*Identity.* There are a few cases here -- synchronization, reference comparison to an Object, System.identityHashCode. I think it makes sense to reject synchronization, and instead ask for more explicit lock-selection logic (perhaps appealing to a peeled Object lockFor(T) method). For reference comparison to Object, we can treat this as we propose above for comparison to null -- constant fold to false. For System.identityHashCode, we can peel this into something that uses ordinary hashCode for values.

*Object methods.* We've already discussed the Objectible interface, which would define the core methods equals, hashCode, toString, and getClass. Other Object methods (wait, notify, notifyAll, and probably clone) would not be available on any-T-valued expressions. (Arguably clone() could be the identity function on values, but this may not be worth it -- cloning is pretty broken.)

*Relationship with Object.* Assignment to Object could be accepted as a possibly-autoboxed operation, but I'm not sure this is a great idea -- it might be better to have an explicit toObject() method (maybe even on Objectible).
Assignment of T[] to Object[] needs to be rejected, but in most cases Object[] can be replaced with T.erasure[] (just like replacing null with T.default.)

*Array creation.* The current prototype supports the expression form new T[n], which downgrades to Object[] when T is a reference type (and issues an unchecked warning.) Alternately, we could provide a reflective method T[] newArray(int), also with an unchecked warning. We can make the unchecked warning go away if the new expression were new T.erasure[n] (or the library version returned T.erasure).

*Instanceof, casting, and type literals.* It is straightforward enough to extend instanceof to support "instanceof Foo<int>", and similarly for cast and type literals. We can do the same for the wildcard Foo<any>. Supporting "instanceof Foo<T>" is trickier because T might be erased, and so it might not give you the answer you expect.

Currently in a generic class Foo<T> you can ask if something is instanceof raw Foo, or of wildcard Foo<?>. The equivalent question with any-generics is more complicated; you want to express "If I am erased and the other is erased Foo, OR I am not erased and the other is the same instantiation of Foo as me." (This collapses to "do they have the same runtime class", but that's not really what we want to encourage people to write.)

Simply extending instanceof to support Foo<T> (even with an unchecked warning) seems insufficient here, because in the erased case, it will say yes when all it can tell is "they're both erased Foo", and it seems like it promises more than it delivers. (And, it should be possible to write a sensible equals() method without unchecked warnings.)

But all is not lost! Our friendly dependent type T.erasure saves us here too:

    if (other instanceof Foo<T.erasure>)

(This is a slight stretching of the syntax, since we're not really asking if the other is an instance of Foo<Object> in the erased case, but only slightly.)
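As an aside on the two array-creation options mentioned above, both already have rough analogues in today's Java, and both carry an unchecked warning for the reason given. The sketch below is illustrative only: ArrayCreation, newErasedArray, and newArray are invented names, and a Class token stands in for the type information that new T[n] would get from specialization.

```java
import java.lang.reflect.Array;

// Hypothetical illustration of the two array-creation styles in current Java.
final class ArrayCreation {
    private ArrayCreation() {}

    // Analogue of "new T[n]" downgrading to Object[] for reference T.
    // The cast is unchecked: the array really is an Object[].
    @SuppressWarnings("unchecked")
    static <T> T[] newErasedArray(int n) {
        return (T[]) new Object[n];
    }

    // Analogue of a reflective "T[] newArray(int)": the runtime array class
    // is exact, but the cast is still unchecked as far as javac can tell.
    // Only meaningful for reference component types.
    @SuppressWarnings("unchecked")
    static <T> T[] newArray(Class<T> componentType, int n) {
        return (T[]) Array.newInstance(componentType, n);
    }

    public static void main(String[] args) {
        // Held as Object[], so the erased array's true class stays hidden.
        Object[] erased = ArrayCreation.<String>newErasedArray(2);
        System.out.println(erased.getClass() == Object[].class); // prints "true"

        String[] exact = newArray(String.class, 2);
        System.out.println(exact.getClass() == String[].class);  // prints "true"
    }
}
```

Assigning the result of newErasedArray to a String[] variable would fail with a ClassCastException at the call site, which is the kind of surprise the unchecked warning is there to flag, and which typing the expression as T.erasure[] would make explicit instead.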
The same T.erasure form works for casting; I am not yet sure it makes sense to do the same for type literals.

*Wildcards.* Code that makes use of Foo<?> will likely want to migrate to using Foo<any> instead.

Looking at how many times T.erasure plays into the answer, you can see why I was arguing for it in the context of the API migration -- because with any of the other API approaches, we would still have the same set of problems / unchecked warnings when we get to the method body.

Take the equals() method. We would like to be able to write an equals() method once, generically for all instantiations, with no peeling and no unchecked warnings. The T.erasure approach gets us there. If we have a class Box today:

    class Box<T> {
        T t;

        public boolean equals(Object o) {
            if (o instanceof Box) {
                Box other = (Box) o;
                if (t == null)
                    return other.t == null;
                Object otherT = other.t;
                return t.equals(otherT);
            }
            else
                return false;
        }
    }

Erasure is exposed to the programmer in the instanceof test, the cast, and the Object-typed otherT; the programmer would like to ask if the other object is a Box<T>, cast it to a Box<T>, and extract its state as a T, but can't do so safely, so we settle for answering a looser question.

Here's the same class, anyfied. The class declaration, the instanceof test, the cast, and the type of otherT are what change from the above.

    class Box<any T> {
        T t;

        public boolean equals(Object o) {
            if (o instanceof Box<T.erasure>) {
                Box<T.erasure> other = (Box<T.erasure>) o;
                if (t == null)
                    return other.t == null;
                T.erasure otherT = other.t;
                return t.equals(otherT); // This is .equals(T.erasure) too
            }
            else
                return false;
        }
    }

My claim here is: not only is this safe (no unchecked warnings, no heap pollution), and not only is it more generic because the domain of genericity is broadened, but it is also /less polluted by erasure/ (despite the word "erasure" appearing prominently.)
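For contrast, the "looser question" that the erased version answers today can be seen by running a plain-Java variant of the first Box. This is only a sketch in current Java (the any-generic version cannot be compiled outside the prototype); the class is renamed ErasedBox, and a constructor, hashCode, and main are added purely so that it runs. It uses ErasedBox<?> rather than the raw type to avoid a rawtypes warning, which does not change what is checked at runtime.

```java
// Hypothetical runnable variant of the pre-any Box, in current Java.
final class ErasedBox<T> {
    final T t;

    ErasedBox(T t) {
        this.t = t;
    }

    @Override
    public boolean equals(Object o) {
        // All we can ask is "is it some ErasedBox?", not "is it an ErasedBox<T>?".
        if (o instanceof ErasedBox) {
            ErasedBox<?> other = (ErasedBox<?>) o;
            if (t == null)
                return other.t == null;
            // The other box's state can only be extracted as Object.
            Object otherT = other.t;
            return t.equals(otherT);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return (t == null) ? 0 : t.hashCode();
    }

    public static void main(String[] args) {
        ErasedBox<String> a = new ErasedBox<>("x");
        ErasedBox<Object> b = new ErasedBox<>("x");
        ErasedBox<Integer> c = new ErasedBox<>(1);
        // Different static instantiations, same erased class: compares equal.
        System.out.println(a.equals(b)); // prints "true"
        // Still the same erased class; only the contents tell them apart.
        System.out.println(a.equals(c)); // prints "false"
    }
}
```

Nothing in the instanceof test or the cast distinguishes the three boxes; the result depends entirely on the equals() of the contents.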
By using the T.erasure type, we're able to explicitly say "use the sharpest type you can, modulo erasure" in the instanceof, cast, variable extraction, and equals contexts, and the limitations of our approximations are explicit -- and we get more type checking than we would by manually erasing things to "Object". We're working within the type system, rather than outside it.

Overall, with the language features adjusted as described (loosely) herein, we can migrate existing generic code to any-generic in a fairly localized and mechanized manner, with only a few idioms (e.g., locking) requiring any sort of peeling on the part of the user. The (incomplete) prototype of Collections in the Valhalla repo seems consistent with this theory.

Oh, and there's one more elephant in this room: serialization. Lots of work will be needed for serialization, which uses Object everywhere ... but that's another day.