From kevinb at google.com  Mon May  2 23:34:42 2016
From: kevinb at google.com (Kevin Bourrillion)
Date: Mon, 2 May 2016 16:34:42 -0700
Subject: Value types questions & comments
In-Reply-To: <2127B453-99E7-4879-9097-F46BE46DB0C3@oracle.com>
References: <2127B453-99E7-4879-9097-F46BE46DB0C3@oracle.com>
Message-ID:

Still thinking

On Mon, Apr 11, 2016 at 4:13 PM, Brian Goetz wrote:

> For example, I see no reason why `int` can't implement Comparable or
> Serializable ...
>
> We might even take this further -- by actually describing `int` with a
> source file (public native class int implements Comparable { ... }) which
> might try and smooth out some of the differences, but I wouldn't hold out a
> lot of hope for this being super successful.
>

I now think that neither of these bits will accomplish much. Implemented types only mean anything to the *boxed* form, but int doesn't even use "that" kind of boxing; it must continue to be boxed to java.lang.Integer as always, nor is there a way to make Integer "extend Box", which would have bought us a few things, because it has to extend Number (which tragically is not an interface).

So, I think you're looking at a taxonomy something like this:

Types
+-- Reference types
+-- Value types
    +-- Primitives / "builtin value types"
    +-- "Custom value types"

... and basically we need to diligently use the term "custom value types" instead of "value types" whenever talking about information that doesn't apply to primitives.

But, I think that this will also not work out well, because it belies the many commonalities between kinds 1 and 3 that primitives don't share (they have a source file, they get loaded and initialized, they have methods....).

I'm back once again to the idea that there are just plain *three different kinds of types*, where value types are largely but not entirely a hybrid of the other two. Now, sure, the more *areas* (such as specialization) in which we can make primitives and value types work identically the better.
-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From john.r.rose at oracle.com  Mon May 16 18:16:26 2016
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 16 May 2016 11:16:26 -0700
Subject: Value types questions & comments
In-Reply-To:
References: <2127B453-99E7-4879-9097-F46BE46DB0C3@oracle.com>
Message-ID: <9A1704E9-8FA9-43D2-9238-A3DAACC40641@oracle.com>

On May 2, 2016, at 4:34 PM, Kevin Bourrillion wrote:
>
> On Mon, Apr 11, 2016 at 4:13 PM, Brian Goetz wrote:
>
> > For example, I see no reason why `int` can't implement Comparable or Serializable ...
> >
> > We might even take this further -- by actually describing `int` with a source file (public native class int implements Comparable { ... }) which might try and smooth out some of the differences, but I wouldn't hold out a lot of hope for this being super successful.
>
> I now think that neither of these bits will accomplish much. Implemented types only mean anything to the *boxed* form,

Matters look better to me on these points than you think.

An interface is a promise about a set of methods in the concrete class that implements that interface. This is all about the concrete class, and has nothing to do with any future use of invokeinterface. This part of interfaces makes them useful even for ret-conning "int" (if we choose to endow "int" with methods, for better regularity).

To take it from the other end, suppose interfaces were only about invokeinterface behavior. Even then, behavioral parity between values and their boxes would require *some* sort of access to interface methods on unboxed values. (Maybe it's *implemented* by boxing under the hood -- but maybe not.)
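
As a rough illustration of the parity question in current Java (this is not the proposed design, and the class and method names below are hypothetical), a bounded generic method can use the Comparable contract only on the boxed form, while the unboxed form needs a hand-written twin built on the static `Integer.compare`:

```java
import java.util.Arrays;
import java.util.List;

public class ComparableParity {
    // Bounds-polymorphic algorithm: relies only on the Comparable promise.
    // With today's generics, T must be a reference type, so an int
    // argument reaches this method only after boxing to Integer.
    public static <T extends Comparable<T>> T maxBoxed(List<T> items) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("empty list");
        }
        T best = items.get(0);
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    // Hand-specialized twin for unboxed ints. It has the same behavior
    // but shares no code with maxBoxed, because int implements no interface.
    public static int maxInts(int[] items) {
        if (items.length == 0) {
            throw new IllegalArgumentException("empty array");
        }
        int best = items[0];
        for (int item : items) {
            if (Integer.compare(item, best) > 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(maxBoxed(Arrays.asList(3, 7, 5))); // boxed path
        System.out.println(maxInts(new int[] {3, 7, 5}));     // primitive path
    }
}
```

Both calls print 7. The duplication between the two methods is the kind of gap that giving primitives interfaces to implement would be meant to close.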
Finally, since interfaces can carry behavior in the form of default methods (as well as contracts with other types), we can use interface subtyping on values to express polymorphic algorithms that operate (using bounds-polymorphism) on those values. Depending on how we formulate things, such algorithms need not commit the JVM or JLS to perform the algorithms on boxed values.

I think the above would accomplish quite a lot; don't you?

> but int doesn't even use "that" kind of boxing; it must continue to be boxed to java.lang.Integer as always, and nor is there a way to make Integer "extend Box", which would have bought us a few things, because it has to extend Number (which tragically is not an interface).

Parity of Integer with new boxes is a thorny problem. But I don't think we are out of options yet. We might choose to make Integer boxes look more like new boxes, or vice versa. We might choose to distinguish between user-created Integer instances and system-created ones (reserving better rules for the latter). And there are several other things we might try.

> So, I think you're looking at a taxonomy something like this:
>
> Types
> +-- Reference types
> +-- Value types
>     +-- Primitives / "builtin value types"
>     +-- "Custom value types"
>
> ... and basically we need to diligently use the term "custom value types" instead of "value types" whenever talking about information that doesn't apply to primitives.

It's true that there will be small observable differences between primitives ("legacy value types"!) and "custom value types". So there will be such a taxonomy, but it will be relevant only to the extent that the "small" differences will cost users some of their attention. We hope it will be negligible for most purposes.

> But, I think that this will also not work out well, because it belies the many commonalities between kinds 1 and 3 that primitives don't share (they have a source file, they get loaded and initialized, they have methods....).
For some users, it won't work out well. I can say from experience that every change ruins the language for someone (at least until they learn to live with it). Our task is to make it work, for most users, better than the previous version of the language. I'm hopeful we can do this.

> I'm back once again to the idea that there are just plain three different kinds of types, where value types are largely but not entirely a hybrid of the other two. Now, sure, the more areas (such as specialization) in which we can make primitives and value types work identically the better.

The taxonomy which is most interesting to me is the disjoint union between "val" and "ref", which may be called "any". That is, some things are vals, other things are refs, nothing is both a val and a ref, unless it is an "any" which accepts both. The difference between classic type parameters and new ones is that one is limited to refs, while the other accepts both vals and refs.

Under this fundamental distinction, we will sometimes observe an additional distinction between legacy value types and custom value types. How often that happens depends on how well we do the job at hand.

So, whenever we notice a potential difference between legacy value types and custom value types, we should (a) describe it carefully, and (b) consider how to mitigate it in the user experience.

E.g., custom types have methods and other named API points. We can mitigate the difference with "int" by imputing API points to primitives. I think the right way to do this is to give primitives some interfaces to implement. This will allow generic algorithms to operate on both legacy and custom value types, an obvious win for numerical or other algebraic algorithms.

And so on...

-- John

Independently of all of the above, and responding to a previous point you made, we are moving parametric polymorphism from the source type system to the runtime type system, by reifying all val bindings of type variables at runtime.
This seems necessary in order to extend genericity uniformly to vals, starting with primitives. That introduces a different set of cross-cutting complexities, because now there are runtime subtypes which are not subclasses, just like in the static types of the language. (See JLS 4.10.2, generic type subtyping as affected by type variable containment, 4.5.1.) This is being surfaced reflectively as species, a distinct and necessary refinement of class. Given that a similar kind of subtype relation was already in the language, we can expect that users will learn what to do when they see one at runtime.

From brian.goetz at oracle.com  Mon May 16 18:20:38 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 16 May 2016 14:20:38 -0400
Subject: Value equality (was: Value types questions and comments)
In-Reply-To:
References: <2127B453-99E7-4879-9097-F46BE46DB0C3@oracle.com>

Message-ID:

Kevin says:

> However, it would be very sad if it does not solve a second problem at
> the same time, because it can. Even Java developers who are content
> with the performance foibles of their existing "value-based classes"
> are constantly irritated by the burden and bug-prone nature of
> writing/maintaining such classes.
>

We have two decisions in front of us regarding equality on value types:

1. What does `==` on values do? Choices include a bitwise comparison, a componentwise comparison (identical to bitwise unless there are float members), or invoking the `equals()` method.

2. What is the default for the `equals()` method? The obvious choice is a componentwise comparison, but Kevin has (strongly) suggested we consider a deep comparison, calling .equals() on any reference members. (Note that the two choices collapse to the same thing when there are no reference members.)

For values whose members are value-ish objects themselves (e.g., String, Optional, LocalDateTime), the deep-equals is obviously appealing -- a tuple of (int, date) makes total sense to compare using .equals() on the date. On the other hand, for values whose members are references to mutable things (like lists), it's not as obvious what the right default is.

So, Kevin -- please make your case! Please share your experience with tools like AutoValue, and if you can, data on how frequently (and why) equals() is underridden on auto values in the Google codebase?

From john.r.rose at oracle.com  Mon May 16 19:41:12 2016
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 16 May 2016 12:41:12 -0700
Subject: Value types questions & comments
In-Reply-To:
References: <2127B453-99E7-4879-9097-F46BE46DB0C3@oracle.com>

Message-ID:

On Apr 12, 2016, at 1:51 PM, Brian Goetz wrote:
>
>> Using a value type for something that isn't a value raises alarm bells for me. At the minimum I would expect this user to have to implement eq/hc by hand, because the default behavior users want 99% of the time is (deep) content-based equality.
>
> This may be the reality-distortion field speaking, but in my view a reference *is* a kind of value -- albeit a very special kind. They're immutable, like other values.

To be more precise: Values have no mutable subfields. References and primitives have no mutable subfields. (References, when not null, point to various objects, some of which themselves have mutable subfields.) In this way, they are all similar.

Also, *some* values or references (but not primitives, and not some references such as String) *may* refer to objects which have mutable subfields or some other kind of mutable state.

Mainly because of this property, values, primitives, and references are at least partially referentially transparent under copy operations. When you copy a value (of any sort, ref or val), you capture all of its substructure, including any possible future value of its substructure.

But the copy is not infinitely deep. (That would be an additional commitment.) The copy only copies the immediate subfields of the value: the whole ref, both halves of the long, all immediate fields of a custom value type.

So far a reference is indistinguishable from a one-field value wrapping a reference. The next question is whether a value-wrapped reference should be restricted in its mutability (or in its type, etc.). I think we have to give a firm *no* here; there's too much to lose from making restrictions that are not natural to the computational machinery of the JVM.

For example, if (following some well-intentioned commentators) we require that the transitive closure of a value be fully immutable ("like an int, right?") we give up the ability to use values as cursors and tuples.
Q: "Why can't I return two values from my method without boxing to the heap?"
A: "Well, one of your values was really a reference to mutable state, don't you see."

That's not a conversation I am willing to have. Some languages (even on the JVM) might not allow partial immutability, but Java must, IMO.

> Almost all their state is encapsulated (they can be compared by identity, that's it). They can only be constructed by privileged factories (we call these constructors.) But, ultimately, they behave like values -- they are passed by value, they have no identity of their own.

In any case, values with immutable immediate components lead us to questions about their equivalence relations (equality). There is more than one natural-looking equality predicate that can be assigned to a tuple (and therefore to any value). This means any choice of default has a ready-made rebuttal.

1. Component-wise boxed normal. Equivalent to boxing the tuple into a List