Value types questions & comments
John Rose
john.r.rose at oracle.com
Mon May 16 18:16:26 UTC 2016
On May 2, 2016, at 4:34 PM, Kevin Bourrillion <kevinb at google.com> wrote:
> …
> On Mon, Apr 11, 2016 at 4:13 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>
> For example, I see no reason why `int` can’t implement Comparable or Serializable ...
>
> We might even take this further — by actually describing `int` with a source file (public native class int implements Comparable { … }) which might try and smooth out some of the differences, but I wouldn’t hold out a lot of hope for this being super successful.
>
> I now think that neither of these bits will accomplish much. Implemented types only mean anything to the *boxed* form,
Matters look better to me on these points than you think.
An interface is a promise about a set of methods in the concrete class that implements that interface. This is all about the concrete class, and has nothing to do with any future use of invokeinterface. This part of interfaces makes them useful even for ret-conning "int" (if we choose to endow "int" with methods, for better regularity).
To take it from the other end, suppose interfaces were only about invokeinterface behavior. Even then, behavioral parity between values and their boxes would require *some* sort of access to interface methods on unboxed values. (Maybe it's *implemented* by boxing under the hood—but maybe not.)
Finally, since interfaces can carry behavior in the form of default methods (as well as contracts with other types), we can use interface subtyping on values to express polymorphic algorithms that operate (using bounds-polymorphism) on those values. Depending on how we formulate things, such algorithms need not commit the JVM or JLS to perform the algorithms on boxed values.
I think the above would accomplish quite a lot; don't you?
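As a rough illustration of the default-method and bounds-polymorphism point, here is a sketch in today's Java. The names `Scalable`, `Meters`, and `quadrupled` are hypothetical, and `Meters` is an ordinary final class standing in for a value type, since no value-type syntax is assumed here. The algorithm is written against the bound `T`, not against the interface type itself, so nothing in its signature requires a boxed receiver.

```java
public class BoundsSketch {
    // Hypothetical interface: a contract (scale) plus behavior carried by a default method.
    interface Scalable<T extends Scalable<T>> {
        T scale(int factor);

        // Behavior supplied by the interface, inherited by every implementor.
        default T doubled() {
            return scale(2);
        }
    }

    // Stand-in for a custom value type: final, immutable, no identity-dependent behavior.
    static final class Meters implements Scalable<Meters> {
        final int amount;

        Meters(int amount) {
            this.amount = amount;
        }

        @Override
        public Meters scale(int factor) {
            return new Meters(amount * factor);
        }
    }

    // Bounds-polymorphic algorithm: T is the concrete type, Scalable is only its bound.
    static <T extends Scalable<T>> T quadrupled(T x) {
        return x.doubled().doubled();
    }

    public static void main(String[] args) {
        System.out.println(quadrupled(new Meters(3)).amount); // prints 12
    }
}
```

Whether a JVM would run such a method on unboxed values is exactly the open formulation question; the sketch only shows that the source-level contract does not force the issue.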
> but int doesn't even use "that" kind of boxing; it must continue to be boxed to java.lang.Integer as always, and nor is there a way to make Integer <handwave> "extend Box<int>", which would have bought us a few things, because it has to extend Number (which tragically is not an interface).
Parity of Integer with new boxes is a thorny problem. But I don't think we are out of options yet. We might choose to make Integer boxes look more like new boxes, or vice versa. We might choose to distinguish between user-created Integer instances and system-created ones (reserving better rules for the latter). And there are several other things we might try.
> So, I think you're looking at a taxonomy something like this:
>
> Types
> +-- Reference types
> +-- Value types
>     +-- Primitives / "builtin value types"
>     +-- "Custom value types"
>
> ... and basically we need to diligently use the term "custom value types" instead of "value types" whenever talking about information that doesn't apply to primitives.
It's true that there will be small observable differences between primitives ("legacy value types"!) and "custom value types". So there will be such a taxonomy, but it will be relevant only to the extent that the "small" differences will cost users some of their attention. We hope it will be negligible for most purposes.
> But, I think that this will also not work out well, because it belies the many commonalities between kinds 1 and 3 that primitives don't share (they have a source file, they get loaded and initialized, they have methods....).
For some users, it won't work out well. I can say from experience that every change ruins the language for someone (at least until they learn to live with it). Our task is to make it work, for most users, better than the previous version of the language. I'm hopeful we can do this.
> I'm back once again to the idea that there are just plain three different kinds of types, where value types are largely but not entirely a hybrid of the other two. Now, sure, the more areas (such as specialization) in which we can make primitives and value types work identically the better.
The taxonomy which is most interesting to me is the disjoint union between "val" and "ref", which may be called "any". That is, some things are vals, other things are refs, nothing is both a val and a ref, unless it is an "any" which accepts both. The difference between classic type parameters <T> and new ones <any T> is that one is limited to refs, while the other accepts both vals and refs.
Under this fundamental distinction, we will sometimes observe an additional distinction between legacy value types and custom value types. How often that happens depends on how well we do the job at hand.
So, whenever we notice a potential difference between legacy value types and custom value types, we should (a) describe it carefully, and (b) consider how to mitigate it in the user experience. E.g., custom types have methods and other named API points. We can mitigate the difference with "int" by imputing API points to primitives. I think the right way to do this is to give primitives some interfaces to implement. This will allow generic algorithms to operate on both legacy and custom value types, an obvious win for numerical or other algebraic algorithms. And so on…
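To make the "one generic algorithm over both kinds" idea concrete, here is a sketch in today's Java. Everything named here is an assumption for illustration: `Additive` is a hypothetical interface, `Int` is a wrapper class standing in for an `int` that has been given API points (a real `int` cannot implement an interface today), and `Complex` stands in for a custom value type.

```java
import java.util.Arrays;
import java.util.List;

public class NumericSketch {
    // Hypothetical interface that both legacy and custom value types might implement.
    interface Additive<T> {
        T plus(T other);
    }

    // Stand-in for "int with imputed API points".
    static final class Int implements Additive<Int> {
        final int v;

        Int(int v) {
            this.v = v;
        }

        @Override
        public Int plus(Int o) {
            return new Int(v + o.v);
        }
    }

    // Stand-in for a custom value type.
    static final class Complex implements Additive<Complex> {
        final double re, im;

        Complex(double re, double im) {
            this.re = re;
            this.im = im;
        }

        @Override
        public Complex plus(Complex o) {
            return new Complex(re + o.re, im + o.im);
        }
    }

    // One algebraic algorithm, usable with either stand-in.
    static <T extends Additive<T>> T sum(T zero, List<T> xs) {
        T acc = zero;
        for (T x : xs) {
            acc = acc.plus(x);
        }
        return acc;
    }

    public static void main(String[] args) {
        Int i = sum(new Int(0), Arrays.asList(new Int(1), new Int(2), new Int(3)));
        Complex c = sum(new Complex(0, 0), Arrays.asList(new Complex(1, 2), new Complex(3, 4)));
        System.out.println(i.v);                       // prints 6
        System.out.println(c.re + " + " + c.im + "i"); // prints 4.0 + 6.0i
    }
}
```

The sketch still pays for a wrapper around each `int`; removing that cost is what the proposal is after, and nothing here shows how that would be done.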
— John
Independently of all of the above, and responding to a previous point you made, we are moving parametric polymorphism from the source type system to the runtime type system, by reifying all val bindings of type variables at runtime. This seems necessary in order to extend genericity uniformly to vals, starting with primitives.
That introduces a different set of cross-cutting complexities, because now there are runtime subtypes which are not subclasses, just like in the static types of the language. (See JLS 4.10.2, generic type subtyping as affected by type variable containment, 4.5.1.) This is being surfaced reflectively as species, a distinct and necessary refinement of class.
Given that a similar kind of subtype relation was already in the language, we can expect that users will learn what to do when they see one at runtime.
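For readers who want the existing static relation spelled out, here is a small sketch in today's Java of subtyping without subclassing (JLS 4.10.2, via containment in 4.5.1). The class and method names are hypothetical. It shows only the current, erased behavior, where the distinction exists in the static types but is invisible at runtime; it does not model species or any reflective API for them.

```java
import java.util.ArrayList;
import java.util.List;

public class SubtypeSketch {
    // Accepts any List<X> where X is contained by "? extends Number".
    static double total(List<? extends Number> xs) {
        double t = 0;
        for (Number n : xs) {
            t += n.doubleValue();
        }
        return t;
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<>();
        ints.add(1);
        ints.add(2);

        List<Double> doubles = new ArrayList<>();
        doubles.add(0.5);

        // List<Integer> and List<Double> are both subtypes of List<? extends Number>,
        // yet neither is a subclass of anything new: same class, different type.
        System.out.println(total(ints));    // prints 3.0
        System.out.println(total(doubles)); // prints 0.5

        // With erasure, the two parameterizations share a single runtime class.
        System.out.println(ints.getClass() == doubles.getClass()); // prints true
    }
}
```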
More information about the valhalla-spec-observers
mailing list