RFR:JDK-8198749 Translation of value constructors in classic constructor notation

Thu Jul 12 00:18:05 UTC 2018

On Jul 11, 2018, at 7:07 AM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
> 
> On 11/07/18 14:35, Srikanth wrote:
>> 
>> 
>> On Wednesday 11 July 2018 06:38 PM, Maurizio Cimadamore wrote:
>>> Hi Srikanth,
>>> very good piece of work. Some comments below:
>> 
>> [...]
>>> 
>>> - I don't fully get the dance around JCNewClass when you want to create a default value. I think a JCNewClass tree is too heavy for a default value creation. What you need is an AST node with argument expression and a type (or a symbol, if you want). You don't need constructor symbols, I think - I get that this code has probably changed over time from a world with slightly different invariants, but looking at it now feels overly complex.
>> 
>> Agree. This is just a side effect of __MakeDefault piggyacking on JCNewClass AST node with the
>> creationMode field serving as a discriminator - NEW for references and DEFAULT_VALUE for value
>> types.
>> 
>> We don't even need an argument expression - default value creation cannot take any parameters.
> Right - I missed that!
>> Again, agree about the irrelevance of constructor symbol and type there - It is just an attempt to
>> preserve internal consistency after having made many moons ago the decision to overload the JCNewClass tree.
> Sure, I was suspecting that was the case - consider everything I raised as followup-worthy - doesn't need to be addressed Right Now.

I suggest rendering __MakeDefault by using a static variable reference tree.
I think that's very close to the actual implementation:  The JVM creates the
canonical default instance at class preparation time, and thereafter hands
it out to anybody who needs it.  The reference to this canonical default
instance is very likely stored in a manner similar to that of public static final
reference fields.  In any case, the semantics are similar enough to merge
into one kind of tree node.

I agree that JCNewClass is overkill.  It dates from an earlier model of
value type construction, where every value type construction would
always specify all fields, to an omnibus vnew instruction.  The current
model for construction is a defaultvalue followed by a chain of withfields.

>> 
>> I will look at simplifying the AST in a follow up ticket - though with the canonical constructor syntax, the continued need for __MakeDefault would be questionable.
> Well, we have internal AST node - such as LetExpr, so, it's ok to have internal nodes (WithField and Default) for value types too.

I think we will end up with some user-level syntax for referring to a VT's
default value by name.  Something like `` VT.default ``.  It would compile
directly to a defaultvalue instruction.  Perhaps via a tree that strongly
resembles a global constant reference in VT.  Hmm, nice how all that
hangs together…

>>> 
>>> 
>>> - on whether you need a return or not, is the problem that you don't know whether the code is alive or not after the last statement of the constructor has been evaluated? If that's the case, you *could* in principle feed the new method to Flow.aliveAnalyzer, and see whether it completes normally or not. That's the same logic we also use for lambda checking. E.g. you could create a new AliveAnalyzer visitor, run it on the body you have, and then check whether its 'alive' parameter is true or not. That will allow you to generate the body in full, w/o splitting logic between TransValues and Gen (which is of course fine as an interim workaround)
>> 
>> Perfect. Thanks. Will fix.
>> 
>>> 
>>> - shouldn't we add the new factory to the scope of the class? How is even ClassWriter picking it up if it's not in the scope? Ah, I see, you add it into the scope in getValueFactory, and you do it only once because of the init2factory map. I suggest to move the scope entering near to where the factory method is created so that, longer term, we can be guarateed that we won't attempt to add it to the same scope twice.
>>> 
>> Sounds good, will do.
>>> - replacing old constructor body with super call is odd. Maybe we should throw, to make sure nobody is calling that?
>> 
>> I invested some good bit of time in the throw - but discarded it after learning that the verifier rejects the class file if there ever is a new opcode that targets a value <init>
> Ok, sounds good then.

In the longer term, I want to make <init> methods (in their current void-returning
form) valid *only* in object classes, not value classes.  So they will go from
unreachable to non-existent, in value classes.  Maybe we will resurrect them
(in a new VT-returning factory method form) in value classes only.  For that
usage, I'd favor coining a new name, <make>, and work out a way to endow
object classes (and interfaces also) with <make>-named factory methods.
Then <init> become a legacy from the days of new/dup/invokespecial<init>,
but new code shifts to invokestatic<make>, uniformly across all types.
The *only* remaining use of <init> would be to make the handshake from
a subclass constructor to a superclass constructor, in an object class.

>>> 
>>> - don't you need also to override visitApply in case the constructor is calling  an instance method on the value being constructed?
>> There is some differing opinions on this - Brian says reject all uses of this (implicit or explicit) other than chained ctor calls, field read/writes.
> if you can *read* fields, then I'd expect to be able to call methods?

You'd expect that, yes.  But (a) the value as a whole isn't fully constructed
yet, and (b) some of its fields may be DU.  So grabbing the whole value
means implicitly grabbing a DU field, which should be a no-no, even if
grabbing a single field is OK—as long as it is DA, not DU.

Brian's idea is to forbid materializing the value as a whole until the
constructor exits.  Mine is slightly more permissive, to allow access to
the value inside the constructor, as soon as all fields are DA (not DU).
Thus, after the dirty work of filling in fields, if the constructor code needs
to somehow wash its hands, say by logging the new value, it has access
to the value instance as a whole.

I think a value type constructor is nothing more than a factory method
with a funny invocation notation, plus a funny implementation notation
that lets you write the fields, unlike any other kind of method.  The funny
notations are legacies from Java.  ("Codes like a class…")

(Along those lines, I *also* suspect that *value* class constructors should be
allowed to *return a value*, which is the short-circuited result of the constructor.
Thus, if the constructor decides that the client is really asking for a pre-existing
value X, it can just say "return X" and that's what the client will get.  If such a
constructor says plain "return" with no value, or drops off of the end of the block,
then of course the value implied by the current field settings—which must by DA
before the return—is what the client gets.  But I'm not pressing this thought too
hard yet.  I'm concentrating on "codes like class", more than on "look how values
are so different from objects".)

> If we only allowed field writes (but I can understand that might be too restrictive), then I'd understand.

I'm glad you are thinking in terms of "codes like a class" here.

I filed https://bugs.openjdk.java.net/browse/JDK-8205910 so that we could preserve
"codes like a class" by making VBCs "code" more like values.  The ability to call
virtual methods on unfinished objects is a recurrent source of bugs, which it would
be good to close down on as much as we can.

> 
> In any case, this seems more a case of a design discussion, outside the scope of this code review.

Yep.  That hasn't made us shy of talking about it though.  :-)

— John

P.S. I'm wondering if we are almost ready to try "VT.__DefaulatValue" as a syntax,
as opposed to new __DefaultValue VT().  Or even "T.default", which would compile
to [ilff]const_0, aconst_null, or defaultvalue T.