minimal value types proposal

Tue Aug 30 17:56:45 UTC 2016

> On Aug 29, 2016, at 6:04 PM, John Rose <john.r.rose at oracle.com> wrote:
> 
> On Aug 29, 2016, at 4:17 PM, Dan Smith <daniel.smith at oracle.com> wrote:
>> 
>> Get rid of boxes, and you can get rid of interfaces, default methods, automatic conversions, constructors, …
> 
> It's worth thinking about, and Brian has encouraged me to think about it also.
> 
> Boxes (and the other stuff you mention) are so useful that removing them may well cause more trouble than supporting them up front.  Inside the JVM, we need a boxed representation for some data flows (unless we make all data flows radically value-safe up front).

By all means, let your encoding of a Q-typed instance be an object pointer. I'm just talking about the user's interface with the declaration.

> For the user, a boxed representation is needed for basic debuggability.  What does println or JVMTI do unless there's a box?

One option is that JVMTI knows about value types, as it does primitives, and provides a printout of the fields. Or maybe ValueTypeSupport has a debugString operation that does this.

Or we use a naming convention -- value types are expected to provide 'toString', and/or 'box'. (I don't mind the boxes themselves, just the automatic aspect of them.)

> I do like the idea of requiring the user to set up both classes manually, at first.  It has the advantage of making very clear (all too clear) the distinction between the Q-type and the L-type:  No source code defines both; the Val guy would (presumably) disable its L-type so people could not use it.

Yes, that's what I have in mind.

>> 2) Instance methods also add tons of complexity.
> 
> I disagree; I think the incremental complexity is comparable to trying to do everything with statics, which is why I'm recommending this in the minimal model.
> 
> The only invocation path for instance methods (and instance fields) on Q-types is through method handles.  Method handles treat all arguments (including 'this') symmetrically, so any effort applied to have them work on Q-types *at all* will apply to 'this' parameters for Q-types.

It's a given that you have statics (see your Int128 declaration, for example). If instance methods are practically free after that, fine. But if not, there's no particular reason to support them. We don't have polymorphism (for Q types, anyway -- I'm assuming no automatic boxes, per (1)), and it's just as easy -- easier, in fact -- for a client to do "invokestatic Val.m(QVal;)I" as it is to do "invokedynamic [vinvoke ...]".
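
To make the symmetry point concrete, here is a rough sketch using today's java.lang.invoke API. ValSketch and its methods are hypothetical stand-ins (an ordinary final class, since Q-types can't be written in current Java). It only illustrates that a method handle for an instance method carries the receiver as an ordinary leading argument, so it has the same shape as the static form:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical stand-in for a value-capable class; an ordinary final class
// is used because Q-types are not expressible in current Java.
final class ValSketch {
    final int x;
    ValSketch(int x) { this.x = x; }

    // Instance form: 'this' is the implicit receiver.
    int m() { return x + 1; }

    // Static form: the receiver is an ordinary leading parameter.
    static int mStatic(ValSketch self) { return self.x + 1; }
}

final class ReceiverSymmetrySketch {
    // Looks up both forms; index 0 is the instance method, index 1 the static.
    static MethodHandle[] lookupBoth() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle virtual = lookup.findVirtual(
                    ValSketch.class, "m", MethodType.methodType(int.class));
            MethodHandle statik = lookup.findStatic(
                    ValSketch.class, "mStatic",
                    MethodType.methodType(int.class, ValSketch.class));
            return new MethodHandle[] { virtual, statik };
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    // Invokes both handles the same way: the receiver is just argument 0.
    static int[] callBoth(int x) {
        MethodHandle[] mhs = lookupBoth();
        try {
            int viaVirtual = (int) mhs[0].invokeExact(new ValSketch(x));
            int viaStatic = (int) mhs[1].invokeExact(new ValSketch(x));
            return new int[] { viaVirtual, viaStatic };
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        MethodHandle[] mhs = lookupBoth();
        // Both handles report the same method type.
        System.out.println(mhs[0].type() + " vs " + mhs[1].type());
    }
}
```

The sketch says nothing about the cost of a boxed 'this' inside the JVM; it only shows that the client-side calling convention is identical either way.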

> Perhaps you are objecting to the inefficiency of operating on 'this' in the boxed L-type form, when the operation starts as a MH-based invocation of a Q-type?  That's only a startup transient; there are several tactics we can use to remove it.  For example, box elision (already in the JITs, though not value-friendly yet) would remove boxing overheads without requiring any manual recoding at all.

Yeah, this is one of the big things that jumps out at me. Our end goal is to define instance methods with a Q-typed 'this'. Having an intermediate step of instance methods with an L-typed 'this' doesn't seem productive. Yeah, there's some engineering we may want to do anyway to get default methods with L-typed 'this' to be efficient, but I'd prefer to keep that engineering off of the critical path. Write your bytecode with a Q-typed 'this' (or static, with no 'this' at all), and we don't have to hope that the JIT will optimize.

> Convenience and migration cannot be driven to zero; that optimizes for "minimal" at the expense of "viable".  To preserve viability, there are at least a few really basic conventions, like Object.toString, that would have to be re-encoded using such statics.

As I commented above, I'm not opposed to naming conventions that ensure a 'toString' or 'box' method exists. And if you're invoking Object.toString, you're first going to have to box, anyway. It's just as easy to do "invokestatic Val.box(QVal;)LObject;" as it is to do "invokedynamic [asType ...]".

>  Re-building virtuals (at least some of them) on top of statics has its own cost, in wasted motion and confusion.

I'd like to understand this better. You're talking about the confusion involved with training people to invoke static methods, only to tell them later that they can use instance methods, too?

> We can and should work towards real Q-typed 'this'.  The simplest way is what I'm proposing with the method handle hack.  In addition, I suggest experimentally modifying javac to emit two copies of non-static methods in value-capable classes, one with the standard bytecodes, and one as a static (with mangled name) which takes a Q-typed 'this' in local 0.  Then teach the method handle resolver to find these guys and bind them, in preference to the boxed-this dance.  Users can get on with their business, unaware of all of this.

javac doesn't generate value classes at all (at least in the first cut). That aside, yes, any scheme in which a Q-typed 'this' is expressed directly is an improvement in my book.
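
For illustration only, the "two copies plus a resolver preference" idea in the quoted paragraph could look roughly like this. Everything here is an assumption: PointSketch is a hypothetical class, the "$value" suffix is an invented mangling scheme, and the static twin takes an ordinary reference where the real thing would take a Q-typed 'this':

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical value-capable class carrying both copies of one method.
final class PointSketch {
    final int x, y;
    PointSketch(int x, int y) { this.x = x; this.y = y; }

    // Standard copy: boxed (L-typed) 'this'.
    int sum() { return x + y; }

    // Static twin; the "$value" suffix is an invented mangling scheme.
    // In the real proposal the first parameter would be Q-typed.
    static int sum$value(PointSketch self) { return self.x + self.y; }

    // No static twin exists for this one, so the resolver must fall back.
    int diff() { return x - y; }
}

final class TwinResolverSketch {
    // Prefers the static twin if present, else binds the instance method.
    // 'type' excludes the receiver, as with Lookup.findVirtual.
    static MethodHandle resolve(MethodHandles.Lookup lookup, Class<?> owner,
                                String name, MethodType type) {
        try {
            try {
                return lookup.findStatic(owner, name + "$value",
                        type.insertParameterTypes(0, owner));
            } catch (NoSuchMethodException noTwin) {
                return lookup.findVirtual(owner, name, type);
            }
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    // Reports which underlying method a lookup of 'name' on PointSketch binds to.
    static String boundName(String name) {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodHandle mh = resolve(lookup, PointSketch.class, name,
                MethodType.methodType(int.class));
        return lookup.revealDirect(mh).getName();
    }

    // Invokes the resolved handle; callers cannot tell which copy was bound.
    static int call(String name, PointSketch p) {
        try {
            MethodHandle mh = resolve(MethodHandles.lookup(), PointSketch.class,
                    name, MethodType.methodType(int.class));
            return (int) mh.invoke(p);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

Either way the caller ends up with a handle of the same type, which is the sense in which users could stay "unaware of all of this".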

>> 3) The minimal feature set for basic operations -- field getters, default value, withers, comparison, arrays -- is a class (e.g., ValueTypeSupport) with bootstrap methods that can be called via invokedynamic. No need to touch MethodHandles.Lookup, etc.
> 
> I don't think the cost of touching MH.Lookup is great, especially given that the MH runtime will have to be able to work with Q-types more or less pervasively.  I agree that all the extended lookup functionality could be placed on a new class (alongside findWither etc.), but I don't see any benefit to doing that.

Okay, cool. I suppose my main discomfort is that, if we embed behavior in existing APIs, it's easier to overlook that change later and forget to put the proper design & specification effort into it. Things get baked in just because they're already there. But if we can avoid that problem, great.
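
For reference, the kind of bootstrap-method entry point I had in mind in (3) would be shaped something like the sketch below. The class and method names are hypothetical (nothing here is a proposed API), and an ordinary class stands in for a value type. The bootstrap is called directly to simulate what linking an invokedynamic call site would do:

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical stand-in for a value class with one field.
final class CelsiusSketch {
    final double degrees;
    CelsiusSketch(double degrees) { this.degrees = degrees; }
}

// Hypothetical support class; the name echoes "ValueTypeSupport" from the
// thread, but the method below is invented for illustration.
final class ValueTypeSupportSketch {
    // Bootstrap method with the standard (Lookup, String, MethodType) prefix.
    // The call site type is (Owner)FieldType and 'name' is the field name.
    public static CallSite getter(MethodHandles.Lookup caller, String name,
                                  MethodType type)
            throws ReflectiveOperationException {
        MethodHandle mh = caller.findGetter(
                type.parameterType(0), name, type.returnType());
        return new ConstantCallSite(mh);
    }

    // Simulates linking an invokedynamic call site by calling the bootstrap
    // directly, then invoking through the resulting call site.
    static double readDegrees(CelsiusSketch c) {
        try {
            CallSite site = getter(MethodHandles.lookup(), "degrees",
                    MethodType.methodType(double.class, CelsiusSketch.class));
            return (double) site.dynamicInvoker().invokeExact(c);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

Note that this sketch still delegates to Lookup.findGetter, so it doesn't settle whether the functionality belongs on Lookup or on a separate class; it only shows the client-facing shape.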

>> More generally, why so much attention given to reflection? Sure, you need class objects to represent all the JVM's types. But member lookup? Fields, Methods, Constructors? These do not seem necessary.
> 
> Because method handles are where the functionality comes from; you need basic reflection in order to mention the method handles you want. Bytecode spinning is not enough, since that would require us to invent a full bytecode set and implement it.  The MH runtime is more malleable than the JVM's interpreter, so we are starting with MHs.  Hence the need for MHs.

I'm unfairly lumping two things together.

java.lang.reflect: We need java.lang.Class objects that represent value types. Beyond that, I don't see any point in touching this API. Eventually, sure. Not in the first cut. (For example: Class.getMethod can behave just like it does for primitives, throwing NSME. Or just operating on statics. And we certainly don't need to put any effort into a special-purpose Constructor.newInstance method.)
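
The primitive precedent is easy to check with the current API. A small sketch (the helper name is mine):

```java
final class PrimitiveLookupSketch {
    // True if Class.getMethod finds a public no-arg method with this name;
    // false if the lookup throws NoSuchMethodException.
    static boolean hasPublicNoArgMethod(Class<?> c, String name) {
        try {
            c.getMethod(name);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Primitive class object: no members are reported.
        System.out.println(hasPublicNoArgMethod(int.class, "toString"));
        // The box class: Object.toString is found as usual.
        System.out.println(hasPublicNoArgMethod(Integer.class, "toString"));
    }
}
```

A Class object for a Q-type could behave like int.class here in the first cut, without anyone needing to touch member lookup.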

java.lang.invoke: Accept Q-typed values as inputs/outputs? Yes. Most of the rest of your proposed changes make sense to me, subject to the discussion above (maybe no box/unbox conversions; maybe no instance methods). I wouldn't bother with findConstructor on a Q type.

>> If I squint, I can kind of see how the idea is that somebody might want to write reflective code to operate on values, since they don't have language support.
> 
>> If this is the use case, I think a better use of resources would be to surface Q types in the language.
> 
> Yes, surface them, but don't require a full set of bytecodes to operate on them.  That's the slow way to do it.

Sure, absolutely. I definitely buy into the idea that we need enough support in java.lang.invoke to provide library-defined operations, rather than having to introduce new opcodes.

>> I don't think it's necessary to support Q types as the receivers of CONSTANT_Fieldrefs and CONSTANT_Methodrefs. The receiver can be a vanilla CONSTANT_Class, and the client (in this case, the 'vgetfield' API point) can figure out what to do with the resolved reference.
> 
> Yes, that's one way to go.  But representing Q-types as java.lang.Class objects will be a sunk cost, so passing the L/Q distinction through existing data flows (on "overloaded" API points) is a reasonable design pattern, for a prototype.

My thinking is that, for example, I'd rather not touch method resolution at all. Maybe that saves us some work. (And as I've thought about the ultimate bytecode design, I'm leaning that way as the ultimate solution, meaning maybe less work to undo things later.)

> I also think (in this case) the Lookup API will, in the long term, look something like the current sketch; there won't be a separate Lookup.findValueGetter any more than there is a separate Lookup.findInterfaceVirtual.

Yeah, that makes sense. The inputs to these methods are Class objects, so you'll have the extra type information you need.

—Dan