From john.r.rose at oracle.com Wed May 10 10:20:41 2017
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 10 May 2017 03:20:41 -0700
Subject: Valhalla Minimal Value Types review invitation
In-Reply-To: <46245130-70A5-421A-A018-39FF788FB500@oracle.com>
References: <5466821F-6EB5-41DA-8AFC-8AD8B563E89E@oracle.com> <46245130-70A5-421A-A018-39FF788FB500@oracle.com>
Message-ID:

(Note to Oracle people: This is a duplicate of my message on an internal list!)

I have rolled most of the effect of these comments into the Shady doc also.

(Below I say that vunbox doesn't belong on the same list with vdefault, but I changed the presentation of vunbox again in the Shady doc. It might trigger DVT derivation, just like vdefault might.)

On Apr 26, 2017, at 7:47 AM, Karen Kinnear wrote:
>
>> Next meeting: Wednesday April 26, 9am PT:
>>
>> NEW DIAL-IN: https://oracle.zoom.us/j/251372518
>
> Rough initial cut at load/link/init proposal - one potential topic for today's agenda.
>
> MVT Assumptions:
> VCC can not have a nullary constructor.
> DVT does not have a <clinit> nor an <init> method.

DVT has no code at all. To pass verification maybe it has a nullary constructor, but that constructor can be empty and the JVM will swallow it. Actually, it could *throw an exception*, if there is any chance that random user code could make a call to it, although that is unlikely. (If we ever have a DVT node in the heap, it needs to be created using privileged operations, not a user-written "new DVT" or "DVT.class.newInstance()".) We want to keep the DVT, in its "L-type" form, under the woodwork as much as possible.

>
> Behavior Goals for contained value types for load, link, init:
>
> I. Resolution of a VCC or DVT, i.e. classfile contains an LFoo or QFoo:
> Resolve a VCC: (LFoo)
> 1. load VCC
>    annotation based: derive DVT class with an internal name - eagerly load DVT

As Dan points out, the DVT derivation could be delayed until the first resolution of the DVT per se.
But since they will eventually be one class, this question is moot: There will be only one class to load/link/init. How closely do we model this with our DVT/VCC pairs of classes in MVT? I don't have a strong opinion: We could treat them as separate (although one is half-invisible), or we could try to synchronize their bootstrap process as much as possible, to simulate a single class using a pair of closely coupled classes. Either is fine for now, and I would even tolerate simplifications in the JVM and spec. which led to distinct behaviors: At worst it is a bug (spec. vs. impl.) in a temporary prototype.

> 2. link VCC
>    does not trigger linking of DVT

+1

> 3. initialization of VCC
>    triggered by: new, static bytecodes
>    does not trigger initialization of DVT

+1 (in any case DVT <clinit> is missing, so init is a nop)

>
> Resolve a DVT: (QFoo)
> 1. load DVT
>    first load VCC, which derives and loads DVT

Or: First load VCC (as if it were a superclass, which is kind of true), and then "load" the DVT by deriving it from the loaded VCC. That's the way I prefer to think about it, as long as they are separate. (Alternatively, loading one loads the other; you can't do them separately.)

> 2. link DVT
>    first link VCC

Again, it's as if the VCC were a "super" of the DVT. (Just as the JVM loads supers before subs, it *also* links supers before subs.) Or, again, just say that they are always linked together, as if they were one class.

>
> 3. initialize DVT
>    first initialize VCC
>    what triggers initialization of DVT?
>    normally: new, static bytecodes - these are invalid for DVT
>    vdefault
>    vunbox
>    anewarray/multianewarray on a DVT element type

The vunbox call does not trigger initialization in the final system, since there is only one class present, and the value fed to the vunbox op is already evidence of initialization. In the MVT world, the DVT has no <clinit>, so again we are free to dispense with initialization.
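The "VCC behaves like a super of the DVT" ordering discussed above can be sketched as a toy model. This is only an illustration under stated assumptions: `Stage`, `ToyClass`, `advance_to`, and `fresh_pair` are hypothetical names, not JVM or HotSpot structures, and the model takes the "derive the DVT lazily from an already loaded VCC" option rather than eager derivation.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Class lifecycle stages, in the order they are reached."""
    UNLOADED = 0
    LOADED = 1
    LINKED = 2
    INITIALIZED = 3


class ToyClass:
    """Hypothetical stand-in for a class; not a real JVM data structure."""

    def __init__(self, name, prerequisite=None):
        self.name = name
        self.prerequisite = prerequisite  # "super-like" class handled first
        self.stage = Stage.UNLOADED

    def advance_to(self, target, log):
        """Bring this class up to `target`, one stage at a time.

        At every stage the prerequisite is brought to that stage first,
        mirroring "load/link/initialize the VCC before the DVT".
        """
        for stage in (Stage.LOADED, Stage.LINKED, Stage.INITIALIZED):
            if stage > target:
                break
            if self.stage >= stage:
                continue  # already done, nothing to repeat
            if self.prerequisite is not None:
                self.prerequisite.advance_to(stage, log)
            self.stage = stage
            log.append((stage.name, self.name))


def fresh_pair():
    """Build an untouched VCC/DVT pair where the DVT depends on the VCC."""
    vcc = ToyClass("VCC")
    dvt = ToyClass("DVT", prerequisite=vcc)
    return vcc, dvt
```

In this model, driving only the VCC to `INITIALIZED` leaves the DVT untouched, while driving the DVT pulls the VCC through each stage just ahead of it.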
Bottom line: "vunbox" doesn't seem to belong on the list with vdefault and newarray-of-value (and eventual getstatic/putstatic/invokestatic).

Same argument for "vbox", in the other direction.

>
> Open for Discussion:
> The proposal is that you must not only load a DVT element of an array, you must also link and
> initialize the DVT element.

Yes. We cannot have values running around on the JVM stack or heap until *after* the value class (DVT or eventual full VT) is loaded. It would be a disaster to try to process values of some type "Foo" before we have decided what is the size and layout of Foo instances. (It's easier with object reference types, since null is always a valid reference value of any type, including an unlinked type, or even an unloaded type.)

A while ago we decided to load embedded value types when loading the containing object or value class. It is as if the embedded values are a kind of "super" to the embedding. What is common to both supers and embedded values is you cannot size and lay out the container until those prior dependencies are sized and laid out.

As for linkage, that does not (AFAIK) contribute to the layout of the value types. What linkage contributes is the "vetting" of method structure (verification and override analysis). If we were to allow values to run around on the JVM stack before linkage, we would know how big they are and what parts they have but we would not know if they had valid methods we could call on them. This edge case is clearly wrong enough to exclude completely.

Finally, as for initialization, what the <clinit> contributes is the static state that methods inside the value are assuming is true. Again, but more subtly, if you allow values to run around before initialization is complete, the methods can fail if they assume that static state is correctly spun up. (While the <clinit> code is running, there are necessarily some incomplete states potentially exposed, but only for a short time and confined to one thread.)
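The three paragraphs above assign one guarantee to each stage. A minimal sketch of that mapping follows; the names `Stage`, `ESTABLISHES`, and `still_missing` are hypothetical, and the wording of each guarantee is a paraphrase of the text, not spec language.

```python
from enum import IntEnum


class Stage(IntEnum):
    UNLOADED = 0
    LOADED = 1
    LINKED = 2
    INITIALIZED = 3


# What each completed stage establishes, paraphrasing the discussion above.
ESTABLISHES = {
    Stage.LOADED: "size and layout of instances are known",
    Stage.LINKED: "methods are verified and overrides are checked",
    Stage.INITIALIZED: "static state assumed by methods is set up",
}


def still_missing(stage):
    """List the guarantees a value of this class could not yet rely on."""
    return [fact for needed, fact in ESTABLISHES.items() if needed > stage]
```

Read this way, a value of a class that is merely loaded can be sized but not safely operated on, which is the argument for requiring all three stages before any value of the type exists.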
Bottom line: +100 on this invariant: If a value-type (DVT) object is anywhere on the JVM stack (or in locals), then either (1) the value type class is fully initialized (the VCC, in MVT), or (2) the value type class is in the process of being initialized, and the value type occurrence is in a stack frame in the same thread as is running the <clinit>.

The various rules about arrays and vdefault (and get/put/invokestatic) prevent values from leaking onto the JVM stack without enforcing that invariant.

> Otherwise you would need to link and initialize the DVT element on the first vaload,
> in case you did not perform a prior vastore.

Yep. Loading an uninitialized element of a value array is indistinguishable from doing vdefault. (This is one reason why vdefault is not as privileged as vwithfield.)

> The verifier could ensure that you perform a prior vastore, in which case you would only
> need to load the DVT element of an array, not link and initialize it.

(I don't believe this. The verifier cannot possibly track separate type-states for heap variables, and especially not distinct elements of one array.)

In any case, if we don't push element-type initialization into array creation, we must "poll" for it when loading elements from the array, which will add useless expense to that (very common) operation. Again, when you don't have "nulls" as a sort of loose glue to tie things together, you have to be careful about containers and embedded values.

Specifically: Most of what we say about array elements of value types is going to apply also to fields of value types, and vice versa. When you create a blank object which contains values, you should already have run the initializers of those value types to completion. Sort of as if the value types were supers of the object type containing those fields.

Going back to arrays: It is as if the value-type element of an array is sort of like a super to that array type. It has to be initialized before you can use an instance of the array.
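The trade-off between initializing the element type at array creation and polling on every element load can be sketched with a toy flattened array. Everything here is an assumption for illustration: `ValueClass`, `ValueArray`, `vaload`, `vastore`, and `vdefault` are hypothetical Python stand-ins for the bytecodes of the same name, fields are modeled as plain integers (so "all 0s" stands in for 0, 0.0, or null), and `ensure_initialized` collapses load, link, and initialize into one step.

```python
class ValueClass:
    """Hypothetical value class with a fixed number of integer fields."""

    def __init__(self, name, field_count):
        self.name = name
        self.field_count = field_count
        self.initialized = False
        self.init_checks = 0  # how often the initialization state was polled

    def ensure_initialized(self):
        """Stand-in for load + link + initialize of the value class."""
        self.init_checks += 1
        if not self.initialized:
            self.initialized = True


class ValueArray:
    """Flattened array: storage is zero-filled and no constructor runs."""

    def __init__(self, element_class, length):
        # The element type acts like a "super" of the array type:
        # it is initialized once, here, before any element can be read.
        element_class.ensure_initialized()
        self.element_class = element_class
        self.slots = [0] * (length * element_class.field_count)

    def vaload(self, index):
        # No initialization poll on this hot path; creation already did it.
        n = self.element_class.field_count
        return tuple(self.slots[index * n:(index + 1) * n])

    def vastore(self, index, value):
        n = self.element_class.field_count
        if len(value) != n:
            raise ValueError("value has the wrong number of fields")
        self.slots[index * n:(index + 1) * n] = value


def vdefault(value_class):
    """Create the all-zero default value; this does trigger initialization."""
    value_class.ensure_initialized()
    return (0,) * value_class.field_count
```

In this sketch an element that was never stored reads back as the same all-zero tuple that `vdefault` produces, and repeated `vaload` calls never touch the initialization check.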
(And this "generalized super" mentality gives a framework for dealing with vicious cycles: We must detect and reject dependency cycles through value-type components, in the load phase, just as we do with regular supers today.)

> II. Instance creation of a DVT, DVT has no <init>
> Creation of a default value type instance: which is all 0s in memory to represent
> the 0 or 0.0 or null value for fields of the DVT

Yes. (Can we get away with no <init>? Yay, that's the best!)

> Triggered by:
> 1) vdefault
> 2) anewarray/multianewarray on a derived value type
>    - creates a value array which is all 0s in memory representing the flattened
>      elements of the array
>    - which does not entail invoking any constructor on the VCC
> 3) vunbox

(Also if the value occurs as a field. I just remembered that this is temporarily excluded in MVT, which is fine. I'd put it on the list anyway, with an asterisk saying "we don't do this yet but here's how it would be treated if we did".)

> 4) internal implementation details such as copying a DVT - all of which imply that
>    the DVT is already initialized

(So #4 is not really a trigger; we could have a separate list of non-triggering operations, notably by-value copy from any place to any other place.)

> It is required that the DVT is in the initialized state prior to the creation of a
> default value type instance.

Yes. And we remind ourselves (by making the above lists) that creating a default value type instance can happen explicitly via a value-bearing bytecode (vdefault), or implicitly as part of creating an object (array only in MVT) that contains a variable of that value type.

>
> III. "uninitialized" value type/ partially initialized value type
> There is no such thing as an uninitialized value type.
> vdefault and anewarray/multianewarray can be invoked from anywhere.

+1

> QUESTION: vwithfield: is this restricted to invocation within the DVT?
> (For MVT, this would also be within the VCC)

Yes, please.
Also privileged code, so said code can act responsibly on behalf of the VCC/DVT.

> That would allow the wither or instance factory to decide whether
> a partially initialized value type would be returned if there were
> an exception.

Precisely. That's the user model for full value types as well.

>
> IV. DVT in a Container: other object, other DVT
> Class contains a DVT field
> - QUESTION: is this supported via bytecodes?

I think we said it's OK not to support this except for arrays. But if it's easy to do, we should do it. In any case, we need to keep this case in mind, and put on the asterisk that says "in the next version", if we decide not to deliver it.

>
> For MVT, since we do not flatten DVT fields in objects or in other DVTs, then we
> do not require preloading of DVT classes used to define fields.

Yes; that's the asterisk. But as soon as we allow "QFoo;" to occur in a classfile field definition (even if javac didn't generate it) then we take off the asterisks.

Thanks for laying this out, Karen.

To recap: Let's lean on the concept of "generalized supers", where a class (or array type) can have the following dependencies which are all treated on a similar footing:
 - any class depends on its super class
 - any class depends on its implemented interfaces
 - any class with embedded value-type fields (or array elements) depends on their types
 - the DVT (Q-type projection of VCC) depends on the VCC (principal L-type class)

For X in {load,link,initialize}, before a class can be ${X}ed, it must first ${X} each of its "super-like" dependencies.

If you buy all that, then I think the only thing left to do is force vdefault to trigger initialization of the DVT. The reason vdefault is a special case is that it creates a value type value out of thin air, rather than loading it from memory. When you load a value type from memory, you can rely on the above load/link/initialize rules to have spun up the value properly.
If we make other bytecodes that create values out of thin air, they will have to trigger initialization like vdefault does. (I'm thinking vaguely of a2b type instructions, but probably it can't happen.)

The unbox instruction has to "spin up" the DVT, but since it takes the VCC as input, the only action left to do is initialize the DVT, and since the DVT is a pure projection from the VCC, with none of its own baggage (no <clinit>) then we are free to opine either way, and in the one-class world the problem will be moot. I'm inclined to say that initializing the VCC automatically, implicitly initializes the DVT also. That will be true in the one-class world also, where initializing a class is all the initialization you need for any of its projections.

Also note that if you work *only* with the VCC, the above rules do not imply spinning up the DVT (unless you say it was done invisibly).

When we go to the one-class world, we can make the projections depend (in the "super-like" manner) on the principal types. (There are choices there we don't need to make yet, mainly deciding who or what is really principal.)

(Specialization may have non-trivial projection initialization. If projections have their own <clinit>s, then operations which form *those* projections *will* require initialization triggers.)

- John

From john.r.rose at oracle.com Wed May 10 11:32:26 2017
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 10 May 2017 04:32:26 -0700
Subject: updates for May 10 meeting
In-Reply-To: <4C018C64-E773-499C-831D-804529B8EB88@oracle.com>
References: <5466821F-6EB5-41DA-8AFC-8AD8B563E89E@oracle.com> <46245130-70A5-421A-A018-39FF788FB500@oracle.com> <4C018C64-E773-499C-831D-804529B8EB88@oracle.com>
Message-ID: <64269954-EE4A-4D49-B844-69EEE3D8837E@oracle.com>

Responses to AIs in Karen's previous minutes are embedded.
On Apr 26, 2017, at 12:49 PM, Karen Kinnear wrote:
>
> Next meeting May 10th: dial-in: https://oracle.zoom.us/j/251372518
>
> Notes from April 26:
> Attendees: Remi, Dan H, Bjorn, Doug, John, Vladimir, Frederic, Karen
>
> AIs:
> 1. John - please update Shady

Updated, backlog is a couple of weeks of comments.

> 2. All - please review email rough proposal for load/link/init behavior for MVT

Response sent.

> John: Even if we moved to typed bytecodes today, we would still expect changes
> note: values are inherently polymorphic post MVT - e.g. interface support
> looking at a 6th BasicType. Interpreter will need to have full type information with the type,

Firmed up this position in Shady, a little. I note that in our prototype the vfoo bytecodes (Q-mode instructions) are mercifully free of references to their types. I want to continue this. This means that under the hood there is a true polymorphic "Q-buffer" type that all these guys operate on. This type is Q-Object or "Qjava/lang/__Value;" or some such. I think this type is too useful to keep hidden inside the JVM; it will show up in the future as the bound type for polymorphic any-vars, and should be used right now to erase lambda-forms.

>
> 3. Karen: Do we have a need for a QFoo in a field descriptor?
> We support QFoo in a method descriptor as well as [QFoo in a field descriptor.
> Concern: reflection and tooling exposure to value types
> Lesser concern: implementation complexity
>
> Responses:
> Remi: If play with optional, want to store in a field (note: not supported today)
> Remi: monad - not want to pay abstraction
>
> Frederic: workaround is to use an array of 1 element
> given we are not flattening, there are no density or lack of indirection benefits of supporting
> Doug: Ask Scala folks or Martin Odersky
> John: propose as side project - i.e. target of opportunity, but not planned for MVT
> Bjorn, Karen: have experimented with it, ok as side project

Shady waffles on this. Still needs work to clarify.

>
> 4. new root type for values
> We needed this internally for LambdaForms to avoid specializing per value type.
> Initially we are calling this java/lang/__Value
> At this point this is not a global root, but a root for value types only
> Exposing this so that MethodHandles can use this is a big step.

Little discussion of this in Shady. Some discussion of top types as U-types. I owe a separate position paper on carrier types and {U,Q,L}-{Object,interface,class}.

> 5. question from last time about verifier checking for complete instances of values before letting them go.
> John's answer: No - verification is not needed here. See next update of Shady.
>
> [ed note: see load/link/init email proposal - I think we are in agreement that there is no such thing
> as an uninitialized value type]

Updated in Shady.

http://cr.openjdk.java.net/~jrose/values/shady-values.html
http://cr.openjdk.java.net/~jrose/values/shady-values.md

Summary of changes:
- make relations between VCC and DVT much more definite
- VCC is just a POJO; DVT derivation is decoupled from VCC loading
- push DVT down beneath the woodwork: it has no separate name, mainly a view of the VCC
- ditch the much-unloved ";QFoo;" syntax for CONSTANT_Class (use context instead)
- link to more focused draft JEP on CONSTANT_Dynamic (forthcoming)
- distinguish between primary (proper) mirrors and secondary (improper) mirrors
- discuss DVT initialization and its triggers
- clarify that the JVM does not try to enforce complete initialization of values
- update bytecode descriptions to better match Valhalla prototype
- simplify reflection: no sourceClass, VCC does everything as a POJO
- reminder: all this will change
- under "more bytecodes" add guidance on carriers and U-types
John

From forax at univ-mlv.fr Wed May 10 16:49:46 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 10 May 2017 16:49:46 +0000
Subject: Valhalla Minimal Value Types review invitation
In-Reply-To:
References: <5466821F-6EB5-41DA-8AFC-8AD8B563E89E@oracle.com> <46245130-70A5-421A-A018-39FF788FB500@oracle.com>
Message-ID:

Sorry, I had hoped to join the discussion during the connection between two flights, but the first one was delayed.

Reading the new version of Shady on the plane, I still do not see why a Qtype is needed; it seems that the VM can infer whether a Qtype or an Ltype should be used.

Rémi

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

From karen.kinnear at oracle.com Mon May 22 15:36:08 2017
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Mon, 22 May 2017 11:36:08 -0400
Subject: Minimal Value Types notes May 10, 2017
Message-ID:

Notes from Minimal Value Types review May 10, 2017

attendees: Bjorn, John, Maurizio, Frederic, Vladimir, Karen

AIs:
All - please review Shady 0.4 update from John - http://cr.openjdk.java.net/~jrose/values/shady-values.html
John - create JEP for Minimal Value Types, linked to Shady
VM implementors: check impact of removing CONSTANT_Class ";QFoo;" and necessity of adding vmultianewarray instead of overloading anewarray and multianewarray

1. Summary of updates from John: (reordered to match conversation)

1a. VCC/DVT relationship:
- make relations between VCC and DVT much more definite
- VCC is just a POJO; DVT derivation is decoupled from VCC loading
- push DVT down beneath the woodwork: it has no separate name, mainly a view of the VCC
- distinguish between primary (proper) mirrors and secondary (improper) mirrors
- simplify reflection: no sourceClass, VCC does everything as a POJO
  note: reflection only supported on the VCC

New concept of a primary mirror (VCC) and a secondary mirror (DVT), which could help in future with specialization views. Not clear where this is exposed.

1b.
- ditch the much-unloved ";QFoo;" syntax for CONSTANT_Class (use context instead)
- link to more focused draft JEP on CONSTANT_Dynamic (forthcoming JEP in progress in a few weeks)
Editor note: to clarify timing of CONSTANT_Dynamic (Condy) - certainly not prior to Early Access of MVT

Meeting explored in detail requirements for verification on byte codes vs. constant pool

ldc
- this always gets a CONSTANT_Class, so always gets the primary mirror
- if you want the secondary mirror
  - not allowed for Early Access
  - option 2: investigate if ldc with CONSTANT_Dynamic works ok for MVT
note: piggybacking on BootStrapMethod
- one concern: BSM allows up to 65,000 bootstrap args, today limited to 251
- e.g. with matching switch - if each case needs a different arg and we use Condy, will need a way to support > 251 cases

vgetfield vs. overload of getfield
- leave as vgetfield since verifier needs to know the type of the receiver without a constant pool change

anewarray, multianewarray
- currently overloaded byte codes dynamically checking type of element. Verifier needs either constant pool or specific byte code to distinguish.
Meeting proposal - add a new byte code - vmultianewarray (see AIs to double-check verifier requirements and update Shady)
ed. note: internal explorations of verifier changes without a constant pool entry tagged as a value type are bringing up issues, so this will need further discussion.

- discuss DVT initialization and its triggers (see below for load/link/init review)
- clarify that the JVM does not try to enforce complete initialization of values
- update bytecode descriptions to better match Valhalla prototype
- reminder: all this will change

Meeting also explored vdefault, vwithfield restrictions:
vdefault and vmultianewarray are unrestricted
vwithfield restricted to VCC - this implies that creation of a value type other than default must use a MethodHandle wither
note from John about vwithfield byte code allowing nestmates, e.g.
if you have a Lookup object with private access mode you could see private members.

2. under "more bytecodes" Shady added guidance on carriers and U-types
Exploring "sorts", e.g. extend IJFDL to add Q and U types
Type might be a QFoo or an LFoo
note: if LFoo, preserves identity, but does not assume identity
note: some instructions deal with references or null, some do/will not
John: we may be able to get optimizations from having the same payload field alignment and layout for a UType whether it is an LFoo or a QFoo

3. review of load/link/init proposal from Karen

Concepts:
Think of embedded value type fields as pre-loaded classes, analogous to super type handling today
Think of VCC as a pre-loaded class for the DVT.
We discuss linking and initialization of the DVT here so that the class can maintain class state requirements, even though at this time the DVT has no static fields and no methods including no <init> or <clinit>.

Resolution of a VCC (LFoo)
load VCC - today we eagerly derive the DVT class based on the annotation
link VCC: no impact on DVT
initialization of VCC: no impact on DVT
triggers for VCC initialization: new, static byte codes

Resolution of a DVT (QFoo)
resolve the VCC, i.e. lookup Foo, classloader and load VCC if not already loaded
link DVT: first link VCC
initialize DVT: first initialize VCC
triggers for DVT initialization: vdefault, vunbox, vmultianewarray

Agreed:
For an array with a DVT component - you must load, link and init the DVT
You do not however need to create a DVT instance - the array can be filled with 0's since we now know the element size.
Vaload must create a DVT instance and can count on
Uninitialized and partially initialized DVT: No such thing as

3a. Request from Dan Smith (JVMS) - Allow lazy DVT derivation and loading - i.e. specify at first use of the DVT and load before linking, rather than at VCC load time
- note: MVT may grant leeway here since this will be a non-issue long-term when there is only one classfile

3b. ed.
note - John's review notes comment about having a value type field embedded in an object or value type. I explicitly took those out of the initial proposal to simplify it.

Follow-on note in case we were to add this back
Object or DVT with an embedded DVT instance field:
embedded DVTs (including any nested embedded DVTs) are included in the pre-loaded classes list so we can correctly handle field layout
Load of container: pre-load embedded DVT
Link of container: pre-link embedded DVT
Init of container: pre-init embedded DVT

Meeting note: embedded DVT in a static field:
- must be loaded at preparation time, so prior to linking the container
- this is explicitly different than the instance field, to avoid potential circularity errors

4. exposure of secondary mirror?
Need to investigate how much of the java.lang.reflect API we can avoid for MVT
Note: value type has NO methods. So you can not perform getClass on the secondary mirror returned by ValueType.forClass. You have to box to invoke getClass, and therefore you always get the primary mirror for the VCC.
Maurizio: do we need an is_value predicate? And might we want to make this easier for the client if for example reflection APIs such as newInstance were to throw an exception if given a secondary mirror?
Explore what might be appropriate for MVT for reflection to provide. Proposed that Early Access throw errors to get feedback. (e.g. do not allow getting the superclass of a value type)

p.s. correction on earlier minutes - Doug's reference was probably to Scala's Martin Odersky

From dl at cs.oswego.edu Wed May 24 12:03:53 2017
From: dl at cs.oswego.edu (Doug Lea)
Date: Wed, 24 May 2017 08:03:53 -0400
Subject: Minimal Value Types and VarHandles
Message-ID: <6c6faa2b-0a82-2631-a8b0-5cdefc5e5dcc@cs.oswego.edu>

Even the most minimal versions of Minimal Value Types need to work with VarHandles. We should check if anything about this would force adjustments in VarHandle spec that might be possible before jdk9 ships.
Current version of spec is at:
http://download.java.net/java/jdk9/docs/api/java/lang/invoke/VarHandle.html

After walking through the issues, I think specs are (barely) OK as they are. But further sanity checks would be welcome.

First, there's nothing in the spec that says that VarHandles for value fields of other objects/arrays must be allowed to be constructed, but it would be unexpected and hostile not to allow them.

Second, even if VarHandles can be constructed, most (but not all) methods are allowed to throw UnsupportedOperationException. This includes all non-Plain mode accesses (setAcquire, getVolatile, compareAndSet, and so on). But that would also be unexpected and hostile.

Notice that atomic VarHandle method specs are independent of whether fields are declared as "volatile". So, the VM does not necessarily get any advance warning that a value field/element may be accessed atomically. This was a pragmatically forced decision because array elements cannot be syntactically declared as volatile. It is good practice to do so in other cases, but not enforced. Although because of looseness of VarHandle specs, it seems legal to throw UnsupportedOperationException if not on a volatile field (thus, always for array elements). But again, this would be unexpected and hostile.

If the decision is made to always allow atomic operations (which seems most desirable), there are a few implementation choices that might differ across value types, platforms, and JVMs. Options that appear to be legal wrt the current VarHandle spec include:

1. For small (<= 64bit) types, mapping to existing scalar intrinsics.

2. Wrapping V operations within a possibly-global lock. Note that it is not possible in general to use the builtin "synchronized" monitor lock of the enclosing object, because that could interfere with other uses, leading to liveness failures.
However, implementations could use other non-global schemes, for example address-range or hashed locks (almost equivalent to displaced headers). Of course, locking an entire large array to access one element would disable parallelism for array processing. One might expect users to notice this :-)

3. Using transactional memory (on recent x86 and Power), and/or a partial emulation of it using variants of (2).

It might be possible to further refine such techniques to cover nested composite values (for example Polygons composed of Lines composed of Points). Some techniques for doing so for IBM packed objects were explored in a few papers including http://dl.acm.org/citation.cfm?id=2972213 (Multi-tier Data Synchronization Based on an Optimized Concurrent Linked-list. Bing Yang et al, PPPJ 2017).

-Doug

From john.r.rose at oracle.com Wed May 24 19:16:52 2017
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 24 May 2017 12:16:52 -0700
Subject: cleaning up the CP in JDK 999 (and thinking clearly about the CP in JDK 10/11)
Message-ID: <28301C73-218C-4FA6-B214-071741C31761@oracle.com>

This message needs an "Impractical Content" warning, but I want to log an interesting line of thought raised today in our Valhalla meeting with IBM. It actually is practical to think about, as a mental model.

We could, if we chose to do it at some far-future date, clarify the roles of CP entries by splitting them as thoroughly as possible into functionally distinct types.

Here's the idea: Revamp CP to clearly distinguish (a) artifact references from (b) usage requests.

Artifact references are named class-files and named parts thereof: Class[ref], Fieldref, Methodref.

Usage requests are specific operations which may be performed on those named entities: InvokeStatic, InvokeSpecial, InvokeVirtual, InvokeInterface, InvokeOnValue, GetField, GetStatic, PutField, PutStatic, WithField.

Linking an artifact reference loads the artifact, which implements some usage requests but not all.
Linking a usage request verifies that the usage is well-formed and caches (on the usage CP entry, not the artifact CP entry) whatever bits help the instruction go fast after linkage.

(Indy/Condy don't work directly on named artifacts, so they are off to the side.)

It's safe to say we will never do all of this, but it helps, I think, as a mental framework when considering all the funky overloading inside today's CP. (But, those request types do look a lot like MethodHandle constants. Funny?)

The worst overloading in the CP is the need to store double resolution information on an [Interface]Methodref in case it has to handle both invokespecial and invoke[virtual,interface]. But there are also lots of little status bits to support dynamic checks of {get,put}{static,field}. The J9 guys commented that the double-resolution thing is familiar to them. Lots of that implementation noise would drop away if there were enough CP entries to go around for each distinct type of reference.

Why would we even consider such a change? Because right now we have to add more usage request types to cover value types, and eventually templates/generics. In today's EG meeting I was advocating pushing harder on the current model of fewer, more overloaded constants for Valhalla, because that's what we do today. Then we collectively realized that constant overloading has always been such a royal pain that nobody wants to keep doing it. So, for the purpose of argument, we pivoted toward the other extreme, in a brief discussion of a hyper-split CP with basically one constant per instruction type.

(I think, as a matter of design esthetic, Java tends to lump more than it splits. The original decision in CP design, to lump more functions onto fewer CP entries, made a superficially simpler constant pool. It has been a burden on JVM implementors, who agonize even to this day on how to make CP a random access data structure with element types of widely varying size.)
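As a rough illustration of the split between artifact references and usage requests described earlier in this message, here is a minimal Java sketch. Every name in it (CpSplitSketch, ArtifactRef, UsageRequest, linkage, the "Foo.bar:I" string) is hypothetical and chosen only for illustration; it does not correspond to any real class-file structure or JVM API. The one point it tries to show is that linkage state lives on the usage entry, so two usages of the same artifact link independently.

```java
// Hypothetical model for illustration only; it does not mirror the real
// class-file format or any JVM API.
final class CpSplitSketch {

    /** Kinds of named artifact, as listed in the message. */
    enum ArtifactKind { CLASS, FIELD, METHOD }

    /** Usage requests from the message, each tied to the artifact kind it operates on. */
    enum Usage {
        INVOKE_STATIC(ArtifactKind.METHOD),
        INVOKE_SPECIAL(ArtifactKind.METHOD),
        INVOKE_VIRTUAL(ArtifactKind.METHOD),
        INVOKE_INTERFACE(ArtifactKind.METHOD),
        INVOKE_ON_VALUE(ArtifactKind.METHOD),
        GET_FIELD(ArtifactKind.FIELD),
        GET_STATIC(ArtifactKind.FIELD),
        PUT_FIELD(ArtifactKind.FIELD),
        PUT_STATIC(ArtifactKind.FIELD),
        WITH_FIELD(ArtifactKind.FIELD);

        final ArtifactKind appliesTo;

        Usage(ArtifactKind appliesTo) { this.appliesTo = appliesTo; }
    }

    /** Artifact reference: a purely symbolic name with no per-instruction state. */
    static final class ArtifactRef {
        final ArtifactKind kind;
        final String symbolicName; // e.g. "Foo.bar:I" (made-up notation)

        ArtifactRef(ArtifactKind kind, String symbolicName) {
            this.kind = kind;
            this.symbolicName = symbolicName;
        }
    }

    /** Usage request: one operation on one artifact, carrying its own linkage cache. */
    static final class UsageRequest {
        final Usage usage;
        final ArtifactRef target;
        private Object linkage; // null until linked; stands in for "bits that help the instruction go fast"

        UsageRequest(Usage usage, ArtifactRef target) {
            // Well-formedness check: the operation must match the kind of artifact.
            if (usage.appliesTo != target.kind) {
                throw new IllegalArgumentException(usage + " cannot apply to a " + target.kind);
            }
            this.usage = usage;
            this.target = target;
        }

        boolean isLinked() { return linkage != null; }

        void link(Object resolved) { linkage = resolved; }
    }

    public static void main(String[] args) {
        ArtifactRef bar = new ArtifactRef(ArtifactKind.FIELD, "Foo.bar:I");
        UsageRequest get = new UsageRequest(Usage.GET_FIELD, bar);
        UsageRequest put = new UsageRequest(Usage.PUT_FIELD, bar);
        get.link("offset=12"); // placeholder linkage data
        System.out.println(get.isLinked() + " " + put.isLinked());
    }
}
```

In this model, a single entry never has to hold resolution state for more than one kind of operation, which is the "implementation noise" the message suggests would drop away.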
This idea of CP splitting is food for thought which (I think) can help us settle more confidently on a fair compromise in a real release.

This design approach prompts us to consider, in the nearer term, a few new CP types, incrementally added to the current design. For example, QType[?] which derives the Q-mode version from a class artifact:

    ldc[ CONSTANT_QType[Class["Foo"]] ]
    getfield[ CONSTANT_Fieldref[QType[Class["Foo"]], NameAndType["bar", "I"]] ]
    invokespecial[ CONSTANT_Methodref[QType[Class["Foo"]], NameAndType["baz", "()F"]] ]

Or maybe the Q-mode-ness goes into the field or method reference:

    ldc[ CONSTANT_Dynamic[[get the Q-type from the L-type], Class["Foo"]] ]
    getfield[ CONSTANT_VFieldref[Class["Foo"], NameAndType["bar", "I"]] ]
    invokespecial[ CONSTANT_VMethodref[Class["Foo"], NameAndType["baz", "()F"]] ]

In any case, mode information (Q vs. L vs. ?) is incompressible. What I mean is that the Q/L distinction has to go somewhere, either the instruction or the symbolic reference stored in the constant pool. (Symbolic, not resolved, is an important distinction here. The instruction can always resolve the CP reference and dip into the runtime bits, but the verifier greatly prefers to operate on the pre-resolved symbolic references.)

To avoid the extra CP types, we could squeeze all the mode-ish bits up into the instructions as follows:

    vldc[ Class["Foo"] ]
    vgetfield[ CONSTANT_Fieldref[QType[Class["Foo"]], NameAndType["bar", "I"]] ]
    vinvoke[ CONSTANT_Methodref[QType[Class["Foo"]], NameAndType["baz", "()F"]] ]

But the problem with modal-instructions-plus-nonmodal-constants is that each constant has to be prepared to be resolved in several modes. (Lumping constants means more resolution information per constant.) Splitting the constants allows (though does not require) nonmodal constants.
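The claim that the Q/L mode has to live somewhere, on the instruction or on the symbolic constant, can be sketched in a few lines of Java. This is only a toy model under stated assumptions: ModeSketch, Opcode, ClassConst, and modeOf are hypothetical names, and the boolean qType field merely stands in for wrapping a class constant as QType[Class[...]]. The point is that the mode is computable from symbolic information alone, which is what a verifier would want.

```java
// Hypothetical model for illustration only; opcode and constant names follow
// the notation used in this message, not any shipped class-file format.
final class ModeSketch {

    enum Mode { L, Q }

    /** Instructions, flagged by whether the opcode itself carries the Q mode. */
    enum Opcode {
        GETFIELD(false),
        VGETFIELD(true);

        final boolean modal;

        Opcode(boolean modal) { this.modal = modal; }
    }

    /** Symbolic class constant; qType models wrapping it as QType[Class[name]]. */
    static final class ClassConst {
        final String name;
        final boolean qType;

        ClassConst(String name, boolean qType) {
            this.name = name;
            this.qType = qType;
        }
    }

    /**
     * Determine the mode from symbolic information only (no resolution):
     * it must come from the opcode, from the constant, or from both.
     */
    static Mode modeOf(Opcode op, ClassConst owner) {
        return (op.modal || owner.qType) ? Mode.Q : Mode.L;
    }

    public static void main(String[] args) {
        ClassConst foo = new ClassConst("Foo", false);
        ClassConst qFoo = new ClassConst("Foo", true);
        System.out.println(modeOf(Opcode.GETFIELD, foo));   // plain L-mode access
        System.out.println(modeOf(Opcode.GETFIELD, qFoo));  // mode carried by the constant
        System.out.println(modeOf(Opcode.VGETFIELD, foo));  // mode carried by the opcode
    }
}
```

If neither the opcode nor the constant carries the bit, the sketch can only answer L, which matches the observation that CONSTANT_Class on its own denotes an L-mode type.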
As noted in the meeting, a possible simplification of Minimal Value Types is we don't need to overload Class since we could easily have different names running around: Class["Foo"] vs. Class["Foo$DVT"] or the like. That means that we could, for the moment, continue to overload Fieldref and Methodref, as long as each is only used in exactly one mode (to be hashed out at link-time). But that's only a short-term help, not a long term design.

- John

P.S. We could also duplicate the mode information in both CP and instruction:

    vgetfield[ CONSTANT_VFieldref[Class["Foo"], NameAndType["bar", "I"]] ]
    vinvoke[ CONSTANT_VMethodref[Class["Foo"], NameAndType["baz", "()F"]] ]

...Thus starting down Overkill Road, which leads to Crazytown:

    vgetfield[ CONSTANT_VFieldref[ QType[Class[";QFoo;"]], VNameAndType["bar", "I"]] ]

P.P.S. If we go with modey CP constants, I think we need to admit, as a concession to the legacies of history, that the CONSTANT_Class guy will forever denote an L-mode type (unless we do LType[Class["foo"]]?) and we will need a different CP constant (and maybe even condy for ldc) to refer to its Q-type or U-type.

P.P.P.S. If we tried to do all of the above CP splitting for real we'd be breaking so much glass that we'd feel compelled to address other design points, such as heterogeneous CPs (another not-so-good legacy), a limit of two components per CP entry, and of course the 16-bit limit. Dealing with all of that at once will be a tarpit, and we're already too busy doing important stuff. So file this note also under "Hard decisions to make when our grandchildren revamp the whole class-file format."