From karen.kinnear at oracle.com Mon Jun 5 23:21:59 2017
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Mon, 5 Jun 2017 19:21:59 -0400
Subject: notes from Valhalla meeting 5/24/17
Message-ID: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com>

Corrections welcome - I confess I had a fever while taking notes - so particularly at the end, my notes are less coherent.

For the meeting Wednesday June 7, we need to discuss short-term MVT constant pool representation for value types - CONSTANT_Class_info with ";QFoo;" vs. CONSTANT_Q/CONSTANT_Value. So please bring raw data on conceptual and implementation trade-offs between the two proposals.

Valhalla EG 5/24/17

Attendees: Bjorn, Dan H, John, Dan Smith, Frederic, Harold, Maurizio, Brian, Vlad, Lois, Karen

Proposed presentations on MVT:
J1: Bjorn and David Simms
JVMLS: Bjorn and Karen: overview, Bjorn and Frederic: Implementation Deep Dive

Schedules:
JDK9 - EG working out open issues, expect to delay a small amount, new schedule not final until updated vote
future release cadence - goal ~ 6 months between releases
Early Access: approval to make EA binaries available on java.net using common EA license, working out details
MVT Early Access: need to work out implementation schedule with IBM and Oracle teams - in part dependent on JVMS draft of changes

Condy (ConstantDynamic) update: John & Brian
goal 1. raising BootstrapMethod static argument limit to s to the 64-1 (from 251)
variable arity - package as an array which is unlimited
if short arg list: invoke, if long, invoke with args
both behaviors aligned, so specification doesn't have to specify line between short and long
javadoc almost ready
Dan H: note: java language today limits to 251
goal 2.
support lazy resolution - resolve in BSM
new BSM mode: allows catching exceptions from subsidiary constant resolution
today we push arguments, proposing pushing the number of arguments, so the BSM can pull the arguments lazily
Expose a way for the BSM to get the arguments
Dan H: is this a BSM attribute index or a CP index?
Karen: we hope this is a BSM attribute index - to reduce constraints on the runtime implementation
Dan H: Need to work out requirements across redefinitions
John: not specify if return old or new values
Dan H: useful for expression trees in constant pool - mostly constant except for 1 or 2 parameters
goal 3. Allow writing BSM more like writing ordinary methods
John: Also want groups of constants to not consume CP indices

====
Constant Pool handling for Minimal Value Types (John)

short-term goals:
1. value type opcodes must be able to disambiguate value type operands for verifier
2. no vm implicit conversion between Lmode and Qmode (or IJFD modes)

editor's note: we need to revisit constant pool handling for MVT at our next meeting - and focus on the short-term requirements. Need to discuss in the context of a JVMS draft with Dan Smith. This decision is the long pole on being able to deliver an early access binary.

note: short-term we have 2 separate classfiles: with a primary and secondary mirrors

long-term goals:
1. "Modes" - single class with two "modes" - Lmode and Qmode, so implied relationship
7 total modes: IJFDLQU, explicit at bytecode level
verifier tracks modes
2. Descriptor: specify Q vs. L mode
3. There is also a union type U mode
Dan H: This implies that a QType must have a header so that we can dynamically determine if we have a QType or an LType
John: The header can be virtualized - e.g. in an array header or container or stored elsewhere in compiled code
key point we agree on: we must always have a typed value
4. UObject would be a top type of "any", QObject would be a top type for any QType (ed. note: is this accurate?
Or is UObject top type for any type that has both Lmode and Qmode?) Bjorn: ok with "any" on stack, but not ok in heap 5. Still no implicit conversions between LFoo and QFoo. Could be an implicit conversion from QFoo to UFoo, but not between any of the other types and not UFoo to QFoo 6. Belief that on the stack QFoo has the same representation as UFoo so only 1 carrier type (ed. note: need to understand more here) implementation note: The Qmode and Lmode must both use a single stack slot to allow a union type. You can ask a Qmode for size and layout (ed. note: need to make a chart of LFoo, QFoo, UFoo behaviors) Maurizio: Can UFoo do things LFoo can not? (ed. note: sorry didn't track the answer here) UFoo is a tagged union of QFoo and LFoo Brian: Still exploring: QInterface (ed. note - I thought it was UInterface so implementable by LFoo or QFoo. Does that mean implementable by any LBar even if it is not a boxed value type? I assumed yes?) also: QObject, QComparable John: need for UTypes: 1. interfaces: need to handle receiver of value or non-value 2. type variables - vs. bytecode splitting - goal of sharing bytecodes with a type parameter that is sometimes L and sometimes Q 3. top type for UObject 4. LambdaForm - need for value type top object like QObject today, with type variables will need UObject longer term Maurizio: how far should JVMS go for MVT? MVT issue 1: hide QObject? 
Internally today we are using java/lang/___Value - we don't need a common parent - we could use a wild card marker - current uses are for vreturn in LambdaForms so we don't need to specialize for each value type, and for method signature parameters
note: if verify LambdaForm - we need to be able to do special handling for this limited top type for value types
Dan H: note IBM not using the top type in implementation - not yet needed - different approach to LambdaForm implementation - not yet 292 templates for MethodHandles - so need to find out implementation options before we decide here
John: with Valhalla Value Types - top types will exist (although they may be interfaces not classes)
Frederic: today - top type needs restrictions - not used as a field or array element, more like a wildcard

MVT issue 2: CONSTANT_Value vs. CONSTANT_Type vs. CONSTANT_Class extensions?
John: bytecode must know modes
requirements: - verifier must be able to determine value type vs. reference without additional eager class loading (ed. note: this won't be true in future for embedded fields)
Bjorn: always resolve to LType today and if QType, go to the secondary mirror
trade-offs: option 1: different bytecodes - e.g. vgetfield vs. getfield, multivnewarray vs. overload anewarray/multianewarray
Dan H: note that get and set resolution storage is short on bits so overloading would be cleaner if we had two different constant pool slots, with no risk of trying to share (ed. note: this implies to me that for a given BCI, in MVT at least, we want different constant pool slots, not a model in which we have for instance a mode indirection to a shared constant pool slot)
Today: using naming convention ($Q)
hotspot proposed: CONSTANT_Value_info
J9 proposed: CONSTANT_Q - I think the name is different, but the concept is the same
John: longer term: UTypes will have class Foo, mode U, with a single classfile deriving 2 types
Dan Smith: constant pool entry sharing?
Maurizio: long-term goal: if in future a class changes declaration to be a value class, want this to still work. Easier if bytecodes are common and constant pool reflects type Karen: longer term typed bytecodes note: instanceof, checkcast - for L only for MVT: LDC is only LTypes also From daniel.smith at oracle.com Wed Jun 7 06:43:28 2017 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 7 Jun 2017 00:43:28 -0600 Subject: Draft of spec for Minimal Value Types Message-ID: <48EE3EAE-AB12-4742-9B1E-6E0D97B104B7@oracle.com> Please see the following for a set of changes to JVMS to support our value types prototyping efforts. http://cr.openjdk.java.net/~dlsmith/values.html The intent is to reference this document in an umbrella JSR as a set of features that may be optionally implemented by a JVM without any compatibility promises for future versions. (This is in the spirit of Incubator Modules, JEP 11.) Some details are still being ironed out, but I wanted to share with a broader audience. Feedback is welcome! Thanks, Dan From john.r.rose at oracle.com Wed Jun 7 19:53:19 2017 From: john.r.rose at oracle.com (John Rose) Date: Wed, 7 Jun 2017 12:53:19 -0700 Subject: What's in a CONSTANT_Class? Message-ID: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> (From today's discussion internally and with IBM.) Dan Smith and Maurizio point out that a C_Class CP entry has many uses. Some of them are type-like, and some are file-like. Example file-like uses are this_class, InnerClasses, EnclosingClass (refer to a class-file). Example type-like uses are ldc (for an arbitrary jl.Class mirror), Fieldref (for now, assume no distinct vgetfield) Definitions: A _verifier type_ is the ordered pair of a class name and a usage mode (or kind). The class name is a C_Utf8 such as underlies a C_Class. The mode is one of {L,Q,U} aka {ref,val,any}. A _class-file_ is a singular body of bytecodes and metadata that translates (a portion of) a source file. 
Two or three types with the same name might be loaded from one file, because of mode distinctions. (Also in the future, many param-type species from a single template file.) (Also, we could refactor array types and/or method descriptors as derived from nested types. See Pack200.)

More file-like uses: The head (template name) of a param-type species.

More type-like uses: super_class/interfaces (extending a param-type species), annotation, catch_type, Exceptions, new, instanceof/checkcast.

Many of the type-like uses only make sense with L-mode (reference). Obviously, since today's JVM does not support Q/U modes, the question doesn't even come up. In L-only cases, we *could* say that the type-like use can refer to a file-like CP constant node, with a proviso that the CP node has a default mode of L-mode. That's not clean but might be desirable to ease adoption and backward compatibility.

Which type-like uses extend to other modes? The poster child is getfield[FieldRef[recv,NT[?]] where the mode of the getfield (Q/L/U) determines the type of the receiver on the stack, so must be fully explicit (long before any class-file is loaded). The "recv" substructure of this getfield must carry the mode information.

(Quick aside: We could carry the mode information in the bytecode only. There are two objections to this: First, bytecode points are scarce and so we prefer overloading existing code points. Second, CP nodes are important cache points for information which quickens bytecode execution. If the mode information is *only* in the bytecode, it follows that the quickening resources on the Fieldref node must serve *all modes*, which is potentially an implementation challenge. This is true even if we end up aligning the per-mode layouts as much as possible, which for other reasons seems desirable.)

Another poster child for multi-mode type-like uses is ldc-of-jl.Class. Plan of record is to have one jl.Class mirror per mode, even though that means more than one per file.
(See forthcoming note about "secondary class mirrors".) Given that jl.Class has this type-oriented structure (not file-oriented), shouldn't CONSTANT_Class have the same structure? Maybe. But this might also be the tail wagging the dog: ldc-of-class is less fundamental to JVM operation than Fieldref; it is a relatively recent introduction.

Going back to mode representation, there are several ways to make the mode information available to the JVM bytecode that performs a getfield:

1. Wrap a new CP node (a "mode node") around the file-oriented C_Class node - Q[Class["Foo"]]
2. Insert a new CP node inside the type-oriented C_Class node - Class[Q["Foo"]] or Class[Q[File["Foo"]]]
3. Use a different C_Class node per mode, distinguished by name mangling - Class[";QFoo;"]
4. Use a different modal bytecode with the same CP node - vgetfield(some F) vs. getfield(the same F)
5. Use a different file per mode, with the mode available after the file is loaded

These are approximately in order of preference. The last one is a no-go because it requires the verifier to load class files before verifying, which leads to vicious bootstrapping loops. Option 4 burns more code-points and not enough CP nodes. Options 1-3 are the options where the CP structure contains the modal information.

There are two problems with option 3, using mangled names. First, it means that a single class file might have two CP nodes that equally refer to it, which leads to potential resolution bugs (and extra resolution work). As a principle of CP design, VM engineers would prefer that each resolvable reference to a named class file reside in a unique CP node, which then provides the cache point for resolution that other nodes derive their needed information from. This is a desirable property of the current JVM design we wish to keep.
Second, requiring the system to demangle strings (";L?") to derive mode information will make it a little slower and buggier; CP tags are a more central way of carrying mode information. (There are two major counter-examples to the "one resolution site" principle: An array type constant of the form Class["[LFoo;"] resolves the class name "Foo". Something like ArrayClass[Class["Foo"]] would be closer to the one-site principle. Even worse, MethodType["(LFoo;)LBar;"] can have many class references. Again, it could be something like MethodType["(L)L", Class[Foo], Class[Bar]]. Maybe we can get closer to this ideal later, as Pack200 does. For now it is enough to note that precedent against one-site resolution exists but need not drive future design.) (Arrays are a counterexample to the principle of "use tags not mangling". Again, that choice need not drive the future design. Arguably it caused bugs; take a look at the toString method on an array or the getName method on an array class, and VM engineers could tell stories of struggling with arrays in the early days.) The remaining question is whether 1 or 2 is better: Should we wrap a mode node around a CONSTANT_Class, or should the Utf8 string of a Class be replaced with a different node type that carries mode information? From a CP-centric point of view, the first option (Q[Class["Foo"]]) seems more natural. But this pushes C_Class to the "file" role rather than the "type" role, which causes problems for "ldc" and perhaps other use cases. If someone has "dual citizenship" between the CP world and the reflective world (in the JDK) then surely the cognitive dissonance between CONSTANT_Class and java.lang.Class will grow. There are two reasons this is not a primary design-driving consideration: First, only a few folks are aware of both "worlds". Second, we are dealing with the original choice to use the word "class" for many concepts that in hindsight are distinct. 
(As I like to say, "lumping" is a more Java-like design move than "splitting".) Bringing in new modes at the reflective level has a natural fix in terms of pseudo-classes like int.class (vs. Integer.class, a real class), which is already out of phase with CONSTANT_Class. So the option 1 proposal looks something like this: 1. Add a new CP node type to wrap around Class[Utf8["Foo"]] to denote Q-Foo. Straw men: CONSTANT_Value[Class], CONSTANT_QMode[Class], CONSTANT_Mode['Q', Class], CONSTANT_Type[Utf8["Q"], Class]. 2. Anticipate U-mode and param-type as likely siblings to this design. 3. Anticipate the possibility of an L-mode sibling for symmetry. 4. Use a QMode node where a naked Class would otherwise imply L-mode. 5. Continue to use a naked Class where L-mode is unambiguous. This seems like a reasonable short-term experiment. It is likely there are downsides to it which we will encounter as we experiment with it. The implication is that a naked Class node means mainly the file, but if you press it into service as a type, it sprouts the L-mode. What about option 2, where Class nodes are *always* types, and only secondarily refer to files? It is more tricky than in option 1 to preserve the "one file resolution site" design feature, since there would be several Class nodes for one file. We could address this straight-on by adding a CONSTANT_ClassFile node, and deprecating the CONSTANT_Class node as a carrier for a file-only reference. (This would impact this_class, InnerClasses, and other file-only uses of constants.) In the general non-legacy case, the substructure of CONSTANT_Class would have to include both a ClassFile and some mode information. A proposal for that might look like this: 1. Add a new CP node type File[Utf8] to be the resolution of a class file. 2. Add a new binary CP node ModeAndFile (like NameAndType) to carry both a mode and a file reference. 3. Allow (eventually require?) 
the Utf8 of a Class node to be replaced by a ModeAndFile: Class[ModeAndFile[Utf8["Q"],File[Utf8[""]]].
4. Anticipate a variety of modes (Q/L/U) in the ModeAndFile structure, including perhaps array and param-type species syntaxes.
5. Use Class nodes wherever types are required.
6. For compatibility, allow an abbreviated form Class[Utf8], at least when it is the only reference to a class file in a CP.
(7. Bonus: Maybe File is also useful as a reference to a resource file? We do need a way to import blocks of bit-data into CPs. Current thinking is that inlining the bits like Utf8 is good enough.)

Comparing these options in detail makes me comfortable with declaring that a CONSTANT_Class is *mainly* a file reference, and *also* an L-mode type. That is, it seems OK to go with option 1 in the Minimal Value Type time frame, and even the long term, until or unless we realize that option 2 (or some undiscovered option) is better.

I should also say that Dan Smith, in writing the JVM spec. for this, is producing additional evidence that points toward option 2, "Class is a type not a file". So we may well pivot in that direction after our MVT experience is done.

- John

From karen.kinnear at oracle.com Wed Jun 7 20:28:02 2017
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 7 Jun 2017 16:28:02 -0400
Subject: minutes Valhalla EG June 07, 2017
Message-ID:

Valhalla EG Minutes June 07, 2017

attendees: Bjorn, Dan H, Dan S, John, Vlad, Frederic, Lois, Brian, Maurizio, Karen

AI ALL: Dan Smith sent an initial draft of a JVMS with experimental support for MVT for review. Feedback in email requested - sooner rather than later please.
AI ALL: Review embedded proposal for issue 1 - John javadoc to avoid exposing internal derived value type name
Review embedded proposal for EA for handling CONSTANT_Class

Timing note: Value type exploration is following three timeframes:
Minimal Value Types Early Access (EA) - goal: ASAP so we can get feedback from initial users
Minimal Value Types (MVT) - goal: w/JDK10 for much broader feedback
Valhalla Value Types - "real" vs. shady values - much richer feature set

Some of the issues we are exploring - such as type vs. class will need to evolve, so we need to reach decisions on our initial EA stake in the ground ASAP. For that - review of and conclusions to JVMS and other open issues is needed.

Issue 1: Exposure of mirror and mirror name for the value class
Bjorn: (please correct any inaccuracies)
IBM implementation does NOT expose the value type mirror name
ValueType.valueClass is the only way to get the value type mirror
getClassName returns the same answer
2 java objects, same underlying data
no internal derived value type name is exposed
John: proposal for breaking the link to the secondary mirror
Model is that there is one primary mirror and multiple secondary mirrors
Brian: one nominal class and multiple derived classes
analogous to a DirectMethodHandle and derived MethodHandles
Later reflection could add APIs at the java level to get the secondary mirrors - has an initial proposal in which you pass in head class, user data (e.g. value type descriptor), user-chosen name
name is not resolvable, doesn't work for findClass, but visible when reflecting
Dan H: do we need to ensure user name/user data consistent? That has been an issue in related APIs?
John: no
Karen: assume we can not use this name to look up a class (forName)? just for reflection to print?
John: not for lookup
Maurizio: this could be useful today (i.e.
for EA) for a value class Issue: Reflection behavior for EA Karen: we already agreed reflection will not work - will throw an exception Maurizio: it could be actually easier to use John's factory than to throw an exception Timing: AI: John - send out javadoc to EG derived class := Class.derivedClassFactory(Class mainClass, T userData, String name) All: evaluate proposal both for doability also evaluate for timing: EA vs. MVT? Issue 2: Constant Pool representation for derived value type (JVMS term: value class) Goals: 1. cache point for usage - need separate storage for DVT and VCC 2. prefer not to do string parsing over and over to get the mode 3. verifier ensure type safety without additional eager class loading 4. ensure single resolution of underlying value-capable-class (longer-term want single resolution of underlying source classfile) 5. allow implementations to support older classfiles 6. tool support - make sure this works for a mix of constant pool changes e.g. tools that do not know about new versions still instrument new classfiles - need to make sure these still work as much as possible - so for these folks we need to not change the meaning of CONSTANT_Class 7. future - make sure the model works for future derivation from more than one type - e.g. Foo 7a. request that for a Parameterized Type: this_class (name and CONSTANT_Class today) allows lazy resolution of the list (ed. note: need to discuss details of "lazy" here - loading the class file perhaps, but instantiating a type from it will need the parameterizations, so far we have conceptually recorded the loaded class file under the "head" type, with default/erased parameterizations) 8. upside opportunity: Constable and pattern matching - helpful if all class objects were represented the same way when generating bytecode e.g. int.class vs. Integer.class require different handling today 9. 
migration: a class should be able to migrate to being a value type
approach: will require boxing to access, but if you pass for example a boxed value type the current client should continue to work
10. migration: value type to reference? Open question
11. ed. note: we did not mention that for MVT we actually have multiple source classfiles and at least one potential prototype for parameterized types also generates separate classfiles. While we strongly do not want to build in the concept of multiple separate classfiles, it would be valuable if the constant pool representation was able to support that. This might help extend to nested classes as well.

John: two views of CONSTANT_Class:
1. is it a type? then need nested CONSTANT_ClassFile: Class[ClassFile]
2. is it a loaded class file? then need surrounding decoration Type[Class]
bad third choice:
3. use separately resolved peers Class["name.1"], Class["name.2"] where name is mangled but refers to same loaded class file

Today: CONSTANT_Class represents both the type and the loaded class file.

Dan S: Prefers option 1: type with a reference to a raw classfile
(ed. note - Dan S - I didn't get any details on why you prefer this for longer-term, it would help to understand)
(ed. note - Dan S - can you add some notes on how to represent goal #11?)

Bjorn: Are there any cases in which a classfile does not represent a reference type?
Valhalla - goal is that the classfile represents both value type and the boxed value type

Proposal from John/Karen:
CONSTANT_Class_info: used for both the "head" LType and the classfile - conflate for backward compatibility
CONSTANT_Value_info: mode information and references the underlying CONSTANT_Class_info
Parameterized type: has auxiliary info and references at least one underlying (head) CONSTANT_Class_info
e.g. List<Foo> would reference List class, and Foo would be linked to in auxiliary info

AI: All Explore this potential model:
Would this make sense from a JVMS perspective?
Would this work for JVM implementations?
Would this work for bytecode generation etc?
- please check if this is feasible short-term, i.e. for EA
- also explore if we could "flop" to #1 later

Feedback:
John: short term: Try the proposal above with CONSTANT_Value_info referring to an underlying CONSTANT_Class_info
Spec direction is likely to be CONSTANT_Class references a Classfile
Lois: ok with verifier
Frederic: ok with bytecodes
Bjorn: prefer #3 ;Q - for MVT - it has minimal impact
ok with #2, but the change wouldn't slow us down much
Frederic: today - handle by using separate opcodes and all have CONSTANT_Class (ed. note - and we haven't implemented the verifier which would like to sanity check that the CONSTANT_Class matches the expected type in the opcode)

--- 3. Ways to phase out current classfile capabilities such as CONSTANT_Class? Or LDC CONSTANT_Class?
Brian: jigsaw added the (not so popular) concept of runtime warnings
Maurizio - jsr/ret deprecated (John: ACC_SUPER) - not used much - javac was primary client at the time
StackMapTable added - lots of complaints
What if long-term we were to derive arrays using multi-level constant pool entries
What if we were to support Q[ as well as L arrays? - immutable, non-nullable, identityless
John: What if we were to evolve CONSTANT_MethodType - currently a flat string to use tree structured approach?
Dan H: limited numbers of places we parse descriptors
Karen: don't slow down our resolution method and field lookup
Dan H: limited to resolution time
Dan S: might be faster to look up using tree comparison vs. extra long symbols
ed. note: worth exploring - just be really sure there is no potential for ambiguity, i.e. at any resolution step: do you have a match for X or Y

--- 4. Box and value implementation relationships
goal: reduce costs when boxing/unboxing - e.g. same layout alignment for fields - what if we could just change 1 bit in the carrier

--- 5.
Opcode proposal: drop vgetfield, overload getfield instead
Bjorn: concern - not want performance impact on existing opcodes
John: propose discard the extra defined opcode and leave room for quickening
overload: anewarray, multianewarray

Chat room notes on details:
John: uses of C_Class as a class-file: this_class, PType head-type (template)
uses of C_Class as a type: ldc, ?Field/Methodref?
Dan S:
Arbitrary types allowed: anewarray, multianewarray, structural descriptor, ldc, bootstrap argument, verification_type_info, maybe field/method refs
Reference types allowed: checkcast, instanceof
Reference class/species allowed: super_class, interfaces, new, annotation, catch_type, Exceptions
Plain class required: this_class, InnerClasses, EnclosingMethod, maybe field/method refs
Maurizio: was thinking recently that InnerClasses is probably another 'classfile' use
John: Agree ldc is not arbitrary type, b/c ldc int.class not possible; ldc is L-type only

From daniel.smith at oracle.com Fri Jun 9 00:00:34 2017
From: daniel.smith at oracle.com (Dan Smith)
Date: Thu, 8 Jun 2017 18:00:34 -0600
Subject: What's in a CONSTANT_Class?
In-Reply-To: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com>
References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com>
Message-ID: <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com>

Some initial notes below attempting to flesh out what our two long-term options look like.

> On Jun 7, 2017, at 1:53 PM, John Rose wrote:
> Comparing these options in detail makes me comfortable with
> declaring that a CONSTANT_Class is *mainly* a file reference,
> and *also* an L-mode type.

Let me highlight this as the source of all these problems. Trying to make a single constant pool entry represent two different things is painful. It leads to confusion about the model, tortured language explaining basic things like what gets "returned" from resolution, attempts to explain away cases that don't follow the rules, bugs, etc.
That said, we must live with the legacy of years ago and make the best of it. Looking at the two viable strategies:

> 1. Wrap a new CP node (a "mode node") around the file-oriented C_Class node - Q[Class["Foo"]]

Here's the syntax I would use, more or less:

CONSTANT_Class_info {
    u1 tag; // 7
    u2 name_index; // Utf8
}

CONSTANT_PrimitiveType_info {
    u1 tag; // 19
    u1 type_code; // 'Z'=90 or 4, 'C'=67 or 5, 'B'=66 or 8, 'S'=83 or 9
                  // 'I'=73 or 10, 'J'=74 or 11, 'F'=70 or 6, 'D'=68 or 7
}

CONSTANT_ClassType_info {
    u1 tag; // 20
    u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13
    u2 class_index; // Class
}

CONSTANT_ArrayType_info {
    u1 tag; // 21
    u2 component_index; // PrimitiveType, ClassType, ArrayType, or SpeciesType
}

CONSTANT_SpeciesType_info {
    u1 tag; // 22
    u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13
    u2 class_index; // Class
    u2 enclosing_index; // ClassType or SpeciesType
    u2 typearg_count;
    u2 typeargs[typearg_count]; // PrimitiveType, ClassType, ArrayType, or SpeciesType
}

CONSTANT_MethodDescriptor_info {
    u1 tag; // 23
    u2 parameter_count;
    u2 parameter_descriptors[parameter_count]; // PrimitiveType, ClassType, ArrayType, or SpeciesType
    u2 return_descriptor; // PrimitiveType, ClassType, ArrayType, SpeciesType, or 0 (void)
}

CONSTANT_FieldDescriptor_info { // is this wrapper useful?
    u1 tag; // 24
    u2 type_index; // PrimitiveType, ClassType, ArrayType, or SpeciesType
}

(I thought about a CONSTANT_Type_info union rather than all these flavors of type constants, but it's not great because 1) constant pool entries already form a tagged union, so we don't need another union layer, and 2) CONSTANT_Class_info can also be used to represent types; once you've got 2 flavors, might as well have 5+.)

> 2. 
Insert a new CP node inside the type-oriented C_Class node - Class[Q["Foo"]] or Class[Q[File["Foo"]]]

Possible syntax for this:

CONSTANT_Class_info {
    u1 tag; // 7
    u2 name_index; // Utf8, PrimitiveDescriptor, ClassDescriptor, ArrayDescriptor, SpeciesDescriptor
}

CONSTANT_PrimitiveDescriptor_info {
    u1 tag; // 19
    u1 type_code; // 'Z'=90 or 4, 'C'=67 or 5, 'B'=66 or 8, 'S'=83 or 9
                  // 'I'=73 or 10, 'J'=74 or 11, 'F'=70 or 6, 'D'=68 or 7
}

CONSTANT_ClassDescriptor_info {
    u1 tag; // 20
    u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13
    u2 class_index; // ClassFile
}

CONSTANT_ClassFile_info {
    u1 tag; // 25
    u2 class_index; // Utf8
}

CONSTANT_ArrayDescriptor_info {
    u1 tag; // 21
    u2 component_index; // PrimitiveDescriptor, ClassDescriptor, ArrayDescriptor, or SpeciesDescriptor
}

CONSTANT_SpeciesDescriptor_info {
    u1 tag; // 22
    u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13
    u2 class_index; // ClassFile
    u2 enclosing_index; // ClassDescriptor or SpeciesDescriptor
    u2 typearg_count;
    u2 typeargs[typearg_count]; // PrimitiveDescriptor, ClassDescriptor, ArrayDescriptor, or SpeciesDescriptor
}

CONSTANT_MethodDescriptor_info {
    u1 tag; // 23
    u2 parameter_count;
    u2 parameter_descriptors[parameter_count]; // PrimitiveDescriptor, ClassDescriptor, ArrayDescriptor, or SpeciesDescriptor
    u2 return_descriptor; // PrimitiveDescriptor, ClassDescriptor, ArrayDescriptor, SpeciesDescriptor, or 0 (void)
}

CONSTANT_FieldDescriptor_info { // is this wrapper useful?
    u1 tag; // 24
    u2 type_index; // PrimitiveDescriptor, ClassDescriptor, ArrayDescriptor, or SpeciesDescriptor
}

--------

Here's an overview of spec changes, assuming one of the sets of syntactic changes above.

As I look at this, both approaches seem mostly fine. Option (1) has messier rules for resolution, because it has to deal with the duality of CONSTANT_Class. Option (2) has messier treatment of this_class, in exchange for eliminating the duality of CONSTANT_Class.
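(ed. sketch, not part of the original message: the nesting in the option (1) syntax can be modeled in memory to see how a structured type constant flattens to a descriptor string without any loading or resolution. The Java below is a minimal illustration only. The names CpSketch, ClassConst, descriptor and methodDescriptor are hypothetical, SpeciesType is left out for brevity, and rendering Q-mode as "QFoo;" is an assumption modeled on the "LFoo;" form used elsewhere in this thread.)

```java
import java.util.List;

// Hypothetical in-memory model of the option (1) constants.
// This is NOT the class-file encoding, only an illustration of the nesting.
final class CpSketch {
    interface TypeConstant {}

    // CONSTANT_Class_info: a plain class name (the Utf8 is inlined here).
    record ClassConst(String name) {}
    // CONSTANT_PrimitiveType_info: type_code such as 'I' or 'J'.
    record PrimitiveType(char typeCode) implements TypeConstant {}
    // CONSTANT_ClassType_info: a mode ('L' or 'Q') wrapped around a Class.
    record ClassType(char mode, ClassConst cls) implements TypeConstant {}
    // CONSTANT_ArrayType_info: refers to its component type.
    record ArrayType(TypeConstant component) implements TypeConstant {}
    // CONSTANT_MethodDescriptor_info: a null return type stands for 0 (void).
    record MethodDescriptor(List<TypeConstant> params, TypeConstant ret) {}

    // Flatten a structured type to a descriptor string, purely syntactically.
    static String descriptor(TypeConstant t) {
        if (t instanceof PrimitiveType p) return String.valueOf(p.typeCode());
        if (t instanceof ClassType c) return c.mode() + c.cls().name() + ";";
        if (t instanceof ArrayType a) return "[" + descriptor(a.component());
        throw new IllegalArgumentException("unknown type constant: " + t);
    }

    // Flatten a structured method descriptor the same way.
    static String methodDescriptor(MethodDescriptor m) {
        StringBuilder sb = new StringBuilder("(");
        for (TypeConstant p : m.params()) sb.append(descriptor(p));
        sb.append(")");
        sb.append(m.ret() == null ? "V" : descriptor(m.ret()));
        return sb.toString();
    }

    public static void main(String[] args) {
        ClassType qFoo = new ClassType('Q', new ClassConst("Foo"));
        ClassType lFoo = new ClassType('L', new ClassConst("Foo"));
        ClassType lBar = new ClassType('L', new ClassConst("Bar"));
        System.out.println(descriptor(qFoo));
        System.out.println(descriptor(new ArrayType(new PrimitiveType('I'))));
        System.out.println(methodDescriptor(new MethodDescriptor(List.of(lFoo), lBar)));
    }
}
```

(Because the flattening is purely structural, the same walk could back a comparison between a legacy string descriptor and a structured one, with no class loading involved.)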
The rules about where types can appear can be additive (new constants allowed in certain places) or negative (certain kinds of CONSTANT_Class disallowed in certain places), but either way, you've *mostly* got to touch all of the same places. Syntax Need to describe where certain kinds of types or class references can appear. In option (1), some of this can be enforced to some extent by limiting the types of constants allowed in certain places. But, generally, both option (1) and option (2) will need informal format or static constraints (4.8, 4.9.1) that disallow certain structures that encode certain kinds of types. Descriptors of fields/methods can be expressed as strings or MethodDescriptor/FieldDescriptor structures. CONSTANT_NameAndType, CONSTANT_MethodType, LocalVariableTable, and annotations allow descriptor_index to point to any of these (prohibiting method or field descriptors as appropriate). "The same descriptor" is defined as a recursive comparison of the parts. It does not involve resolution or loading. It allows a string descriptor to possibly match a structured MethodDescriptor/FieldDescriptor. (This definition applies, among other things, to the prohibition of duplicate field/method declarations.) A (maybe) comprehensive list of where classes/types can appear: - Simple class references (CONSTANT_Class with a simple class name for (1), CONSTANT_Class representing a class type for (2)): ClassFile.this_type InnerClasses EnclosingMethod (All we want is the class, but for compatibility a CONSTANT_Class must be allowed here, so (2) takes the position that these are encoded as types.) 
- Any class type (CONSTANT_Class with a simple class name or CONSTANT_ClassType/CONSTANT_SpeciesType for (1), CONSTANT_Class representing a class type for (2)):
ClassFile.super_class
Fieldref.class_index
Methodref.class_index
InterfaceMethodref.class_index

- Reference class type (CONSTANT_Class or CONSTANT_ClassType/CONSTANT_SpeciesType representing a reference class type for (1), CONSTANT_Class representing a reference class type for (2)):
new
Code.exception_table.catch_type
Exceptions.exception_index_table

- Array type (CONSTANT_Class representing an array type or CONSTANT_ArrayType for (1), CONSTANT_Class representing an array type for (2)):
multianewarray

- Reference type (CONSTANT_Class, CONSTANT_ArrayType, or CONSTANT_ClassType/CONSTANT_SpeciesType representing a reference class type for (1), CONSTANT_Class representing a reference type for (2)):
instanceof
checkcast

- Any type (CONSTANT_Class, CONSTANT_ArrayType, CONSTANT_ClassType, CONSTANT_SpeciesType, or CONSTANT_PrimitiveType for (1), CONSTANT_Class for (2)):
anewarray
ldc
verification_type_info.Object_variable_info
BootstrapMethods.bootstrap_arguments

Verification

- Types and descriptors of all forms can be parsed to verification types without any resolution or loading. (Many of the changes in the current value classes spec are there to support this.)

Resolution

For (1), a CONSTANT_Class can be "resolved" or "resolved as a type". Plain resolution is only allowed where we've asserted that the name is not an array type descriptor. It produces a loaded class. In contexts where type structures can appear, if a CONSTANT_Class is also allowed, resolving the type implicitly means the CONSTANT_Class is "resolved as a type", which will treat it as a ClassType with mode 'L'. Resolution of a type produces a java.lang.Class (or some equivalent internal representation).

For (2), a CONSTANT_ClassFile is always resolved to a loaded class. A CONSTANT_Class is always resolved to a type.
In either case, descriptors are not resolved. (This includes all the type-related structures called "descriptors" in (2), though the implementation might choose to lazily cache some resolved types with them.)

Semantics

- Various cleanups to ensure that, downstream from resolution, we're talking about "types" rather than "classes and interfaces". (Again, much of this is already in the value classes spec.)

-Dan

From john.r.rose at oracle.com Fri Jun 9 02:09:25 2017
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 8 Jun 2017 19:09:25 -0700
Subject: What's in a CONSTANT_Class?
In-Reply-To: <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com>
References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com>
Message-ID: <562470C2-E737-45B9-81EA-B9A14607527D@oracle.com>

On Jun 8, 2017, at 5:00 PM, Dan Smith wrote:
>
> Some initial notes below attempting to flesh out what our two long-term options look like.

I like both of your sketches.

I think we should also try this variation: CONSTANT_Class is legacy only. That way folks won't encounter CONSTANT_Class as a False Friend, as they encounter it in new CP structures. It is *neither* mainly a file nor mainly a type, but only a legacy abbreviation.

For loading a class file we have CONSTANT_ClassFile and for naming a class type we have CONSTANT_ClassType (and the other types of 1).

The legacy meaning of CONSTANT_Class is retained, but the preferred translation of "String.class" is ldc[CONSTANT_ClassType['L',CFS]] where CFS is CONSTANT_ClassFile[Utf8["java/lang/String"]].

An object string field is getfield[CONSTANT_Fieldref[ClassType['L',...], NameAndType["myStr", CONSTANT_ClassType['L',CFS]]]].

The stringy type descriptors are tucked away inside C_NameAndType. I guess it's sufficient to say that the second component of C_NAT preferentially points at a C_XType (for X in Primitive, Class, Array, Species) but may also point at legacy Utf8.
The Utf8 semantics could be defined by expansion to hypothetical CP entries (a point you already made).

- John

From john.r.rose at oracle.com Fri Jun 9 02:44:25 2017
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 8 Jun 2017 19:44:25 -0700
Subject: What's in a CONSTANT_Class?
In-Reply-To: <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com>
References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com>
Message-ID:

(more comments)

On Jun 8, 2017, at 5:00 PM, Dan Smith wrote:
>
> CONSTANT_Class_info {
>     u1 tag; // 7
>     u2 name_index; // Utf8
> }

If we decide to sideline the previous guy as a False Friend, then this is the place where resolution really happens:

CONSTANT_ClassFile_info {
    u1 tag; // 25
    u2 name_index; // Utf8
}

> CONSTANT_PrimitiveType_info {
>     u1 tag; // 19
>     u1 type_code; // 'Z'=90 or 4, 'C'=67 or 5, 'B'=66 or 8, 'S'=83 or 9
>                   // 'I'=73 or 10, 'J'=74 or 11, 'F'=70 or 6, 'D'=68 or 7
> }

Alternative encoding: Assign a compact range of tags 32..39, one per primitive.

Another alternative: Hardwire the top 8 CP indexes (starting at 2^16-9).

But these alternatives just remove a minor eyesore from class files; instead of lots of UTF8 encodings there will be a little dance at the beginning of every CP that recalls to mind the perennial favorites 'int', 'boolean', etc.

For the CP type system, one type for primitives is better, I guess.

I slightly prefer the smaller code points, because they are easier to decode with a short array. But a perfect hash code would be a clever alternative for either encoding. If we use Utf8 strings for types (in non-legacy CP structure) then the actual ASCII code points would be more appealing.
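As a rough illustration of the "decode with a short array" remark, here is a minimal Java sketch of both candidate type_code encodings. The class and method names are hypothetical and not part of any proposal; the numeric values are the ones listed in the quoted CONSTANT_PrimitiveType_info comment.

```java
final class PrimitiveTypeCodes {
    // Index = small type_code (4..11); value = descriptor character.
    // The first four slots are unused padding so the code can index directly.
    private static final char[] BY_SMALL_CODE = {
        0, 0, 0, 0, 'Z', 'C', 'F', 'D', 'B', 'S', 'I', 'J'
    };

    /** Decodes the small-integer encoding with a single array lookup. */
    static char fromSmallCode(int code) {
        if (code < 4 || code >= BY_SMALL_CODE.length) {
            throw new IllegalArgumentException("bad type_code: " + code);
        }
        return BY_SMALL_CODE[code];
    }

    /** Decodes the ASCII encoding; the code already is the descriptor character. */
    static char fromAsciiCode(int code) {
        switch (code) {
            case 'Z': case 'C': case 'B': case 'S':
            case 'I': case 'J': case 'F': case 'D':
                return (char) code;
            default:
                throw new IllegalArgumentException("bad type_code: " + code);
        }
    }
}
```

The small codes need only a 12-entry table, while the ASCII codes are sparse and so call for a switch (or a hash), which is the trade-off described above.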
> CONSTANT_ClassType_info {
>     u1 tag; // 20
>     u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13
>     u2 class_index; // Class

s/Class/ClassFile/

> }
>
> CONSTANT_ArrayType_info {
>     u1 tag; // 21
>     u2 component_index; // PrimitiveType, ClassType, ArrayType, or SpeciesType
> }
>
> CONSTANT_SpeciesType_info {
>     u1 tag; // 22
>     u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13
>     u2 class_index; // Class
>     u2 enclosing_index; // ClassType or SpeciesType
>     u2 typearg_count;
>     u2 typeargs[typearg_count]; // PrimitiveType, ClassType, ArrayType, or SpeciesType
> }

s/Class/ClassFile/, which raises the question of whether the species is type-like or file-like. The mode_code also raises this question. Why must a mode also be assigned when a template is expanded? When a class file is loaded, a mode is not assigned. Perhaps both class files and species are "pre-types", things with names and typed members, but which are not yet themselves types.

> CONSTANT_MethodDescriptor_info {
>     u1 tag; // 23
>     u2 parameter_count;
>     u2 parameter_descriptors[parameter_count]; // PrimitiveType, ClassType, ArrayType, or SpeciesType
>     u2 return_descriptor; // PrimitiveType, ClassType, ArrayType, SpeciesType, or 0 (void)
> }

The void quasi-type should be lumped into PrimitiveType, for the sake of ldc (void.class).

> CONSTANT_FieldDescriptor_info { // is this wrapper useful?
>     u1 tag; // 24
>     u2 type_index; // PrimitiveType, ClassType, ArrayType, or SpeciesType
> }

I don't think this wrapper is useful. Instead we have the lopsided distinction between the star in FieldRef[,NameAndType[,*]] and the star in MethodRef[,NameAndType[,*]]. In the case of FieldRef, it is any of the types (but not PT-void), and in the case of MethodRef, it is a MethodDescriptor.

MethodDescriptor is an extra tricky nut to crack here, I think, because it has an unlimited arity. That makes logical sense, but major JVMs (IBM, ours) have baked in an assumption that CP entries are fixed in size except for Utf8 strings.
In JSR 292 we pushed the BSM specifiers into a side table for this reason. We could put method descriptor lists into a similar side table.

I don't have a good suggestion here. For method types the flat Utf8 strings are seductive, at least until you have 100 repetitions of the substring "Ljava/lang/Object;".

If we break the arity limit of 2, then we should also consider merging NameAndType into FieldRef and MethodRef, at which point the genericity of NameAndType becomes moot. The three components of a FieldRef would be (holder:ClassType, name:Utf8, type:XType) and the components of a MethodRef could be (holder:ClassType, name:Utf8, descr:MethodDescriptor). At that point the MethodDescriptor could be unfolded into the MethodRef, right? Then the only high-arity node would be MethodRef. (Except for C_MethodType. But that could be made a legacy guy also, since he is built on top of flat strings, and condy can materialize him easily enough.)

> (I thought about a CONSTANT_Type_info union rather than all these flavors of type constants, but it's not great because 1) constant pool entries already form a tagged union, so we don't need another union layer, and 2) CONSTANT_Class_info can also be used to represent types; once you've got 2 flavors, might as well have 5+.)

Yep. And you could push that a little farther by giving each PrimitiveType its own tag.

The PTs are the odd thing here. There are no constants except them that have a payload of less than a byte. Just as constants seem to have a maximum size (arity 2) they also seem to have a minimum size (32 bits or so). Note that very small integer constants (which would correspond to PT sub-tags) are *not* usually stored in the CP; they are loaded with short instructions like "bipush", not "ldc".

- John

From john.r.rose at oracle.com Fri Jun 9 09:40:39 2017
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 9 Jun 2017 02:40:39 -0700
Subject: What's in a CONSTANT_Class?
In-Reply-To: <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> Message-ID: <0639DCD0-74A2-4AFB-8E92-E07FE3027A51@oracle.com> Whatever tag numbering scheme we do will have to move up by two, since CONSTANT_Module_info.tag = 19 and CONSTANT_Package_info.tag = 20. I think we can take our next node at tag 21, for CONSTANT_Q or CONSTANT_Value or whatever we were calling it on Wednesday. I particularly like your ClassType_info; I'd like to use that for MVT, with T_VALUETYPE (14) as the only mode_code that is valid, at first. Later T_OBJECT (12) for symmetry, and T_UNION (3? 16?). (Much much later T_INT or another primitive, associated with a Class, is a very interesting thing to contemplate.) ? John On Jun 8, 2017, at 5:00 PM, Dan Smith wrote: > > CONSTANT_PrimitiveType_info { > u1 tag; // 19 > u1 type_code; // 'Z'=90 or 4, 'C'=67 or 5, 'B'=66 or 8, 'S'=83 or 9 > // 'I'=73 or 10, 'J'=74 or 11, 'F'=70 or 6, 'D'=68 or 7 > } > > CONSTANT_ClassType_info { > u1 tag; // 20 > u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13 > u2 class_index; // Class > } From daniel.smith at oracle.com Fri Jun 9 15:43:08 2017 From: daniel.smith at oracle.com (Dan Smith) Date: Fri, 9 Jun 2017 09:43:08 -0600 Subject: What's in a CONSTANT_Class? In-Reply-To: References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> Message-ID: <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com> > On Jun 8, 2017, at 8:44 PM, John Rose wrote: > > The void quasi-type should be lumped into PrimitiveType, for the sake > of ldc (void.class). 
I see the appeal, though it also, unfortunately, expands the set of "primitive types" and means we have to restrict that set at the use sites: CONSTANT_ArrayType_info { u1 tag; // 21 u2 component_index; // PrimitiveType **but not void**, ClassType, ArrayType, or SpeciesType } CONSTANT_SpeciesType_info { u1 tag; //22 u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13 u2 class_index; // Class u2 enclosing_index; // ClassType or SpeciesType u2 typearg_count; u2 typeargs[typearg_count]; // PrimitiveType **but not void**, ClassType, ArrayType, or SpeciesType } CONSTANT_MethodDescriptor_info { u1 tag; // 23 u2 parameter_count; u2 parameter_descriptors[parameter_count]; // PrimitiveType **but not void**, ClassType, ArrayType, or SpeciesType u2 return_descriptor; // PrimitiveType, ClassType, ArrayType, SpeciesType, or 0 (void) } CONSTANT_FieldDescriptor_info { // is this wrapper useful? u1 tag; // 24 u2 type_index; // PrimitiveType **but not void**, ClassType, ArrayType, or SpeciesType } - Any non-void type (CONSTANT_Class, CONSTANT_ArrayType, CONSTANT_ClassType, CONSTANT_SpeciesType, or CONSTANT_PrimitiveType **that isn't void**): anewarray verification_type_info.Object_variable_info - Any type or void (CONSTANT_Class, CONSTANT_ArrayType, CONSTANT_ClassType, CONSTANT_SpeciesType, or CONSTANT_PrimitiveType): ldc BootstrapMethods.bootstrap_arguments I prefer the discipline of making 'void' a separate entity (CONSTANT_Void?) that we don't necessarily call a "type", although not sure that carries its weight. > Note that very small integer constants > (which would correspond to PT sub-tags) are *not* usually > stored in the CP; they are loaded with short instructions > like "bipush", not "ldc". Yes, and that works fine for instructions (see also newarray). The new requirement here is for another constant pool entry to need to talk about one of these very small things, and in a polymorphic way (e.g., the component type of an array may be a primitive or some other type). 
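To make the use-site restriction concrete, here is a minimal Java sketch of the check a class file parser might apply if void were folded into PrimitiveType. Everything in it is an assumption for illustration only: the 'V' type_code, the TypeConst stand-in, and the method names do not come from the proposal.

```java
import java.util.List;

final class VoidUseSites {
    /** Hypothetical type_code for void, assuming void is lumped into PrimitiveType. */
    static final char VOID = 'V';

    /** Hypothetical stand-in for a parsed type constant. */
    static final class TypeConst {
        final String kind;   // "Primitive", "Class", "Array", or "Species"
        final char typeCode; // meaningful only when kind is "Primitive"

        TypeConst(String kind, char typeCode) {
            this.kind = kind;
            this.typeCode = typeCode;
        }

        boolean isVoid() {
            return kind.equals("Primitive") && typeCode == VOID;
        }
    }

    /** Array components, type arguments, parameters, and fields: any type except void. */
    static void requireNonVoid(TypeConst t, String site) {
        if (t.isVoid()) {
            throw new IllegalArgumentException("void not allowed as " + site);
        }
    }

    /** Parameters must be non-void; the return position may be any type or void. */
    static void checkMethodDescriptor(List<TypeConst> params, TypeConst ret) {
        for (TypeConst p : params) {
            requireNonVoid(p, "parameter type");
        }
        // ret is deliberately left unchecked: void is legal there.
    }
}
```

Each "**but not void**" comment in the structures above corresponds to one requireNonVoid call; a separate CONSTANT_Void entity would let the constant's tag carry that distinction instead.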
-Dan

From john.r.rose at oracle.com Fri Jun 9 20:45:46 2017
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 9 Jun 2017 13:45:46 -0700
Subject: What's in a CONSTANT_Class?
In-Reply-To: <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com>
References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com>
Message-ID:

On Jun 9, 2017, at 8:43 AM, Dan Smith wrote:
>
> I prefer the discipline of making 'void' a separate entity (CONSTANT_Void?) that we don't necessarily call a "type", although not sure that carries its weight.

I think on balance the JLS would be cleaner if we admitted void is a type, with some funny restrictions. (IIRC Alex tilts this way too.)

Allowing in return position to assume will be attractive with new generics.

From john.r.rose at oracle.com Fri Jun 9 20:51:44 2017
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 9 Jun 2017 13:51:44 -0700
Subject: notes from Valhalla meeting 5/24/17
In-Reply-To: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com>
References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com>
Message-ID: <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com>

We have talked about "condy" (a constant-pool friend for indy). Enclosed is the javadoc portion of a draft spec. The JVM spec. is not ready and will be sent separately. This javadoc reflects code checked into the condy branch of Amber.

On Jun 5, 2017, at 4:21 PM, Karen Kinnear wrote:
> From: Karen Kinnear
> Subject: notes from Valhalla meeting 5/24/17
> Date: June 5, 2017 at 4:21:59 PM PDT
> To: valhalla-spec-experts at openjdk.java.net
> ...
> Condy (ConstantDynamic) update: John & Brian
> goal 1.
raising BootstrapMethod static argument limit to s to the 64-1 (from 251) > variable arity - package as an array which is unlimited > if short arg list: invoke, if long, invoke with args > both behaviors aligned, so specification doesn't have to specify line between short and long > javadoc almost ready > Dan H: note: java language today limits to 251 > > goal 2. support lazy resolution - resolve in BSM > new BSM mode: allows catching exceptions from subsidiary constant resolution > today we push arguments, proposing pushing the number of arguments, so the BSM can pull the arguments lazily > Expose a way for the BSM to get the arguments > Dan H: is this a BSM attribute index or a CP index? > Karen: we hope this is a BSM attribute index - to reduce constraints on the runtime implementation > Dan H: Need to work out requirements across redefinitions > John: not specify if return old or new values > Dan H: useful for expression trees in constant pool - mostly constant except for 1 or 2 parameters > > goal 3. Allow writing BSM more like writing ordinary methods > > John: Also want groups of constants to not consume CP indices http://cr.openjdk.java.net/~jrose/jvm/specdiff-condy-2017-0609.zip From john.r.rose at oracle.com Fri Jun 9 20:53:14 2017 From: john.r.rose at oracle.com (John Rose) Date: Fri, 9 Jun 2017 13:53:14 -0700 Subject: What's in a CONSTANT_Class? In-Reply-To: References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com> Message-ID: On Jun 9, 2017, at 1:45 PM, John Rose wrote: > > Allowing in return position to assume will be attractive with new generics. Allowing Map to assume V=void with no layout footprint derives Set. This is a trick Rust plays successfully. I've always wanted to pull that trick. 
From forax at univ-mlv.fr Sat Jun 10 10:21:19 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 10 Jun 2017 10:21:19 +0000
Subject: What's in a CONSTANT_Class?
In-Reply-To:
References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com>
Message-ID: <1BC8C6DE-494F-4F36-A56A-4304F33F136E@univ-mlv.fr>

You're not alone :)
I want the result of an async procedure call to be a CompletableFuture too.

Rémi

On June 9, 2017 10:53:14 PM GMT+02:00, John Rose wrote:
>On Jun 9, 2017, at 1:45 PM, John Rose wrote:
>>
>> Allowing in return position to assume will be attractive
>with new generics.
>
>Allowing Map to assume V=void with no layout footprint
>derives Set.
>This is a trick Rust plays successfully. I've always wanted to pull
>that trick.

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

From forax at univ-mlv.fr Sat Jun 10 11:19:10 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 10 Jun 2017 13:19:10 +0200 (CEST)
Subject: UFoo; ?? Was: notes from Valhalla meeting 5/24/17
In-Reply-To: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com>
References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com>
Message-ID: <536325962.2663132.1497093550468.JavaMail.zimbra@u-pem.fr>

The need for a UFoo; is not clear to me.

So correct me if I'm wrong: u-bytecodes are bytecodes for representing something similar to T and Foo in Java, but for the VM, with the supplementary constraint that T can be either a primitive, a value type or a class.

I do not see how UFoo; fits in this scheme.
Is it to represent Foo<?>? I do not think so; Foo<?> can still be represented by its erasure, Foo, and we just need a way to represent '?', which is any (java.lang.Any ?).
Is it to represent Foo<? extends Bar>? I do not think so; again, it can be represented by its raw type, with an explicit cast from any to Bar when we need the value represented by '? extends Bar'.

So why do we need UFoo; ?
Rémi

From karen.kinnear at oracle.com Mon Jun 12 19:15:09 2017
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Mon, 12 Jun 2017 15:15:09 -0400
Subject: What's in a CONSTANT_Class?
In-Reply-To: <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com>
References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com>
Message-ID:

Dan,

I am really glad we are exploring the longer term picture of how to handle the constant pool. (note: not to be confused with the Minimal Value Types exercise)

I would like to add a couple of constraints/questions/concerns please:

1) No change in meaning of any existing constant pool entries
Dan Heidinga correctly pointed out the challenge, that while we may have a classfile version on the classfile as generated originally, tools will be injecting byte codes assuming the meaning of existing constant pool entries, and will be adding constant pool entries prior to having any knowledge of classfile version changes.

2) impact on APIs
I need a better understanding on how a user is going to represent the difference between a QFoo and an LFoo in source?
And whether we are going to be changing/augmenting APIs that currently take a class name if we want them to extend to support value types rather than always requiring boxing. Today we have a name/loader unique runtime type guarantee and the loader can be determined from context.

3) impact on tools
We need feedback from tool developers. Maurizio has mentioned concerns relative to ASM. Please look at the JNI and JVMTI type signatures - they expose for instance the JVMS BasicTypes so changes here break tools.
http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/types.html#wp16432

4) prototype support
For value types and for specialization, let's make sure that any proposal could have a prototype/implementation that allows generation of separate classfiles, so two UTF8s. (e.g. what I was proposing was that any derived type would have both its own name and a link to the "root" type from which it derived. I think that could apply to species as well as to value types but you have probably thought this through more than I have).

5) Adding Type information rather than Class information has a ripple effect on the JVM implementation - we need to study in more detail how this changes other class file structures such as StackMapTable etc.

6) Descriptor ambiguity
We need to make sure that we design descriptors after we have figured out what a UType is. Descriptor matching (nominal or structural) works with exact matches. If you introduce a polymorphism that allows for multiple potential correct matches, you have to work out resolution, overriding and selection rules in great detail (and pay the performance cost).

thanks,
Karen

> On Jun 9, 2017, at 11:43 AM, Dan Smith wrote:
>
>> On Jun 8, 2017, at 8:44 PM, John Rose wrote:
>>
>> The void quasi-type should be lumped into PrimitiveType, for the sake
>> of ldc (void.class).
> > I see the appeal, though it also, unfortunately, expands the set of "primitive types" and means we have to restrict that set at the use sites: > > CONSTANT_ArrayType_info { > u1 tag; // 21 > u2 component_index; // PrimitiveType **but not void**, ClassType, ArrayType, or SpeciesType > } > > CONSTANT_SpeciesType_info { > u1 tag; //22 > u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13 > u2 class_index; // Class > u2 enclosing_index; // ClassType or SpeciesType > u2 typearg_count; > u2 typeargs[typearg_count]; // PrimitiveType **but not void**, ClassType, ArrayType, or SpeciesType > } > > CONSTANT_MethodDescriptor_info { > u1 tag; // 23 > u2 parameter_count; > u2 parameter_descriptors[parameter_count]; // PrimitiveType **but not void**, ClassType, ArrayType, or SpeciesType > u2 return_descriptor; // PrimitiveType, ClassType, ArrayType, SpeciesType, or 0 (void) > } > > CONSTANT_FieldDescriptor_info { // is this wrapper useful? > u1 tag; // 24 > u2 type_index; // PrimitiveType **but not void**, ClassType, ArrayType, or SpeciesType > } > > - Any non-void type (CONSTANT_Class, CONSTANT_ArrayType, CONSTANT_ClassType, CONSTANT_SpeciesType, or CONSTANT_PrimitiveType **that isn't void**): > anewarray > verification_type_info.Object_variable_info > > - Any type or void (CONSTANT_Class, CONSTANT_ArrayType, CONSTANT_ClassType, CONSTANT_SpeciesType, or CONSTANT_PrimitiveType): > ldc > BootstrapMethods.bootstrap_arguments > > I prefer the discipline of making 'void' a separate entity (CONSTANT_Void?) that we don't necessarily call a "type", although not sure that carries its weight. > >> Note that very small integer constants >> (which would correspond to PT sub-tags) are *not* usually >> stored in the CP; they are loaded with short instructions >> like "bipush", not "ldc". > > Yes, and that works fine for instructions (see also newarray). 
> The new requirement here is for another constant pool entry to need to talk about one of these very small things, and in a polymorphic way (e.g., the component type of an array may be a primitive or some other type).
>
> -Dan

From karen.kinnear at oracle.com Tue Jun 13 21:26:34 2017
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Tue, 13 Jun 2017 17:26:34 -0400
Subject: Draft of spec for Minimal Value Types
In-Reply-To: <48EE3EAE-AB12-4742-9B1E-6E0D97B104B7@oracle.com>
References: <48EE3EAE-AB12-4742-9B1E-6E0D97B104B7@oracle.com>
Message-ID:

Dan,

Many thanks for writing this optional JVMS draft so early so we can iron out issues together.

I wanted to follow up specifically on the load/link/init relationships for the Value Capable Class (VCC) and the derived Value Class (DVC) to use the terms in this JVMS draft. (Note: direct value class is the longer term directly defined value class which I have been calling a Valhalla Value Type VVT)

I think we are all in agreement that a reference to a DVC must first pre-load the VCC just as it has to pre-load supertypes.

The question arises about linking and initialization.

So to clarify, the DVC does not have any methods, including today, and does not have any statics. So linking of the DVC itself does nothing. So initialization of the DVC itself does nothing.

I think there are two models we would use here.

Option 1: super-type model for root class relative to derived class: pre-link and pre-init VCC when linking or initializing the DVC

Conceptually I think of a DVC and VCC as sharing one set of statics, and a value class instance today as a "copy" of the instance fields of a VCC instance. So I think people would expect that the statics of the root class were initialized before any instance was created or operated on. And you would expect to link (e.g. verify) the root class before you let someone create or play with a derived class.
If we did this in a different order, we could be operating on value types derived from invalid class files.

Longer-term it is not clear if we will have a root class and a derived class, or conceptually one class file with two derived classes, but I believe the expectation is that there will continue to be one set of statics, so I would expect the statics to need to be initialized before either derived class created an instance.

Longer-term it is expected that we will have a single source file with methods that must be verified before either class can be used.

Option 2: lazy initialization, lazy linking

Alternatively we could not initialize the VCC until any of the current instructions either reference a static or create an instance.
Note: I would expect vbox to be added to the instructions requiring initialization in this case since it creates an instance of the VCC

Even in this case I would continue to require initialization and linking according to the rules you state, e.g. adding initialization based on vdefault, anewarray, multianewarray even if they do nothing other than a state change.

I do not know in this case how to handle verification errors in the VCC - i.e. are you still free to operate on the DVC?
What happens when you try to vbox?

Detailed question:
In JVMS 5.5 Initialization in your draft - is it intentional that for anewarray and multianewarray you mention a direct value class type - which in your terminology I believe is the future "valhalla value type" which is directly defined but not derived from a VCC? So that you would not trigger initialization for these instructions for a derived value class?
Would that be the same for vdefault also then?

thanks,
Karen

> On Jun 7, 2017, at 2:43 AM, Dan Smith wrote:
>
> Please see the following for a set of changes to JVMS to support our value types prototyping efforts.
>
> http://cr.openjdk.java.net/~dlsmith/values.html
>
> The intent is to reference this document in an umbrella JSR as a set of features that may be optionally implemented by a JVM without any compatibility promises for future versions. (This is in the spirit of Incubator Modules, JEP 11.)
>
> Some details are still being ironed out, but I wanted to share with a broader audience. Feedback is welcome!
>
> Thanks,
> Dan

From daniel.smith at oracle.com Tue Jun 13 22:51:27 2017
From: daniel.smith at oracle.com (Dan Smith)
Date: Tue, 13 Jun 2017 16:51:27 -0600
Subject: Draft of spec for Minimal Value Types
In-Reply-To:
References: <48EE3EAE-AB12-4742-9B1E-6E0D97B104B7@oracle.com>
Message-ID: <10B5E48C-EE0C-4F73-AB64-AD9C907C2258@oracle.com>

> On Jun 13, 2017, at 3:26 PM, Karen Kinnear wrote:
>
> I wanted to follow up specifically on the load/link/init relationships for the Value Capable Class (VCC) and the derived Value Class (DVC) to use the terms in this JVMS draft. (Note: direct value class is the longer term directly defined value class which I have been calling a Valhalla Value Type VVT)

> Detailed question:
> In JVMS 5.5 Initialization in your draft - is it intentional that for anewarray and multianewarray that you mention a direct value class type
> - which in your terminology I believe is the future "valhalla value type" which is directly defined but not derived from a VCC. So that
> you would not trigger initialization for these instructions for a derived value class?
> Would that be the same for vdefault also then?

I think you're misunderstanding my use of "direct" -- I mean "non-reference" (as opposed to "reference value", which is a pointer). A "value class" is a class with ACC_VALUE set. The only way to get one of those, per current spec, is by deriving it from a VCC, and, sure, "derived value class" is an appropriate term.
A "direct value class type" is the Q type of a value class, whether that class is derived or otherwise (were "otherwise" a possibility).

> I think we are all in agreement that a reference to a DVC must first pre-load the VCC just as it has to pre-load supertypes.
>
> The question arises about linking and initialization.
>
> So to clarify, the DVC does not have any methods, including today, and does not have any statics.
> So linking of the DVC itself does nothing.
> So initialization of the DVC itself does nothing.
>
> I think there are two models we would use here.
>
> Option 1: super-type model for root class relative to derived class: pre-link and pre-init VCC when linking or initializing the DVC
>
> Conceptually I think of a DVC and VCC as sharing one set of statics, and a value class instance today as a "copy" of the instance fields of a VCC instance.

> Longer-term it is not clear if we will have a root class and a derived class, or conceptually one class file with two derived
> classes, but I believe the expectation is that there will continue to be one set of statics, so I would expect the statics to need
> to be initialized before either derived class created an instance.
>
> Longer-term it is expected that we will have a single source file with methods that must be verified before either class can
> be used.

If there's one set of statics, I would say there is one class. This approach seems consistent with a model in which we eliminate "Foo$Value" as a class name and just have reference and value flavors of "Foo" instead. At that point, I would say that we only have one thing to load, link, and initialize, and resolution of Q types should trigger that just like resolution of L types.

> Option 2: lazy initialization, lazy linking
>
> Alternatively we could not initialize the VCC until any of the current instructions either reference a static or create an instance.
> Note: I would expect vbox to be added to the instructions requiring initialization in this case since it creates an instance of the VCC
>
> Even in this case I would continue to require initialization and linking according to the rules you state, e.g. adding initialization
> based on vdefault, anewarray, multianewarray even if they do nothing other than a state change.
>
> I do not know in this case how to handle verification errors in the VCC - i.e. are you still free to operate on the DVC?
> What happens when you try to vbox?

I think this describes the approach I've tried to specify. You have to load the VCC before defining the DVC, but otherwise we're talking about two independent classes.

Good point, 'vbox' should be on the list of instructions that require initialization of the VCC.

Linking of the VCC would be subject to the general rules for linking (5.8): must happen sometime after loading, sometime before initialization. Errors occur at a point in the program that "might, directly or indirectly, require linkage". 'vbox' would be one such point in a program.

--Dan

From karen.kinnear at oracle.com Wed Jun 14 15:44:55 2017
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 14 Jun 2017 11:44:55 -0400
Subject: Draft of spec for Minimal Value Types
In-Reply-To: <10B5E48C-EE0C-4F73-AB64-AD9C907C2258@oracle.com>
References: <48EE3EAE-AB12-4742-9B1E-6E0D97B104B7@oracle.com> <10B5E48C-EE0C-4F73-AB64-AD9C907C2258@oracle.com>
Message-ID: <9D0A9F1F-2E20-40D4-A90B-C7A267D26CE0@oracle.com>

Dan,

Thank you for the responses.

Summary - we are good with the current JVMS description of decoupling VCC and DVC initialization and linking as long as you add vbox to require initialization of the VCC.
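
Editor's sketch (not part of the original message): the decoupled rule summarized above can be modeled as a small state machine. This is a toy illustration under stated assumptions, not the JVMS draft's wording. The class name MvtInitSketch, the Klass record of state, and the log are all hypothetical. The only facts taken from the thread are that the VCC is loaded before the DVC is defined, that vdefault, anewarray and multianewarray initialize the value class (a state change only), and that vbox requires initialization of the VCC.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every name here is hypothetical, not taken from the JVMS draft.
final class MvtInitSketch {
    // Instructions that name the value class in this sketch.
    enum Insn { VDEFAULT, ANEWARRAY, MULTIANEWARRAY, VBOX }

    static final class Klass {
        final String name;
        boolean loaded;
        boolean initialized;
        Klass(String name) { this.name = name; }
    }

    final Klass vcc = new Klass("Foo");        // value capable class
    final Klass dvc = new Klass("Foo$Value");  // derived value class
    final List<String> log = new ArrayList<>();

    // Defining the DVC requires the VCC to be loaded first.
    void loadDvc() {
        if (!vcc.loaded) { vcc.loaded = true; log.add("load " + vcc.name); }
        if (!dvc.loaded) { dvc.loaded = true; log.add("load " + dvc.name); }
    }

    private void initialize(Klass k) {
        if (!k.initialized) { k.initialized = true; log.add("init " + k.name); }
    }

    void execute(Insn insn) {
        loadDvc();
        switch (insn) {
            case VDEFAULT:
            case ANEWARRAY:
            case MULTIANEWARRAY:
                // Initializing the DVC is only a state change; the VCC is left alone.
                initialize(dvc);
                break;
            case VBOX:
                // vbox creates an instance of the VCC, so the VCC must be initialized.
                initialize(vcc);
                break;
        }
    }
}
```

In this model, executing VDEFAULT leaves the VCC loaded but uninitialized, and a later VBOX is the first point at which the VCC is initialized (and therefore the latest point at which a VCC linkage error could surface).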
> On Jun 13, 2017, at 6:51 PM, Dan Smith wrote:
>
>> On Jun 13, 2017, at 3:26 PM, Karen Kinnear wrote:
>>
>> I wanted to follow up specifically on the load/link/init relationships for the Value Capable Class (VCC) and the derived Value Class (DVC) to use the terms in this JVMS draft. (Note: direct value class is the longer term directly defined value class which I have been calling a Valhalla Value Type VVT)
>
>> Detailed question:
>> In JVMS 5.5 Initialization in your draft - is it intentional that for anewarray and multianewarray that you mention a direct value class type
>> - which in your terminology I believe is the future "valhalla value type" which is directly defined but not derived from a VCC. So that
>> you would not trigger initialization for these instructions for a derived value class?
>> Would that be the same for vdefault also then?
>
> I think you're misunderstanding my use of "direct" -- I mean "non-reference" (as opposed to "reference value", which is a pointer). A "value class" is a class with ACC_VALUE set. The only way to get one of those, per current spec, is by deriving it from a VCC, and, sure, "derived value class" is an appropriate term. A "direct value class type" is the Q type of a value class, whether that class is derived or otherwise (were "otherwise" a possibility).

My misunderstanding. Thank you for clearing up the terminology - I will make a new cheat sheet :-)

>
>> I think we are all in agreement that a reference to a DVC must first pre-load the VCC just as it has to pre-load supertypes.
>>
>> The question arises about linking and initialization.
>>
>> So to clarify, the DVC does not have any methods, including today, and does not have any statics.
>> So linking of the DVC itself does nothing.
>> So initialization of the DVC itself does nothing.
>>
>> I think there are two models we would use here.
>>
>> Option 1: super-type model for root class relative to derived class: pre-link and pre-init VCC when linking or initializing the DVC
>>
>> Conceptually I think of a DVC and VCC as sharing one set of statics, and a value class instance today as a "copy" of the instance fields of a VCC instance.
>
>> Longer-term it is not clear if we will have a root class and a derived class, or conceptually one class file with two derived
>> classes, but I believe the expectation is that there will continue to be one set of statics, so I would expect the statics to need
>> to be initialized before either derived class created an instance.
>>
>> Longer-term it is expected that we will have a single source file with methods that must be verified before either class can
>> be used.
>
> If there's one set of statics, I would say there is one class. This approach seems consistent with a model in which we eliminate "Foo$Value" as a class name and just have reference and value flavors of "Foo" instead. At that point, I would say that we only have one thing to load, link, and initialize, and resolution of Q types should trigger that just like resolution of L types.

There is one set of statics.

From an implementation standpoint, the hotspot JVM would like to keep Foo$Value for Early Access, since there would be too many internal changes to handle the transition from the name/class loader pair to the triple of name/mode/class loader.

We can explore this again after Early Access.

>
>> Option 2: lazy initialization, lazy linking
>>
>> Alternatively we could not initialize the VCC until any of the current instructions either reference a static or create an instance.
>> Note: I would expect vbox to be added to the instructions requiring initialization in this case since it creates an instance of the VCC
>>
>> Even in this case I would continue to require initialization and linking according to the rules you state, e.g.
adding initialization
>> based on vdefault, anewarray, multianewarray even if they do nothing other than a state change.
>>
>> I do not know in this case how to handle verification errors in the VCC - i.e. are you still free to operate on the DVC?
>> What happens when you try to vbox?
>
> I think this describes the approach I've tried to specify. You have to load the VCC before defining the DVC, but otherwise we're talking about two independent classes.
>
> Good point, 'vbox' should be on the list of instructions that require initialization of the VCC.
>
> Linking of the VCC would be subject to the general rules for linking (5.8): must happen sometime after loading, sometime before initialization. Errors occur at a point in the program that "might, directly or indirectly, require linkage". 'vbox' would be one such point in a program.

thanks,
Karen

>
> --Dan

From karen.kinnear at oracle.com Wed Jun 14 15:54:07 2017
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 14 Jun 2017 11:54:07 -0400
Subject: What's in a CONSTANT_Class?
In-Reply-To:
References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com>
Message-ID: <2788A8A0-DCB4-4AD0-A69B-EC1B08F4B68F@oracle.com>

Update from hotspot implementation:

We would like to request that for the MVT Early Access we keep the TEMPORARY CONSTANT_Class_info ";Q".

This is far easier for us to implement (we have a prototype in progress) and we believe that it will be easier for bytecode generators to adopt - which will allow us to get more people trying MVT so we get more feedback.

We would also like to keep the explicit separate name for the derived value class, so that from an implementation standpoint we are able to continue to use the name, class loader pair as a unique lookup.
So the JVMS as proposed explicitly calls out in 5.3 Creation and Loading that the derived value class has the name ClassName$Value.
For Early Access we would like to keep this naming convention, stable across reboots, so people can generate byte codes that reference value types by name distinctly from their value capable class.

thanks,
Karen

p.s. this will allow us time to do the longer-term exploration of where the class/type/constant pool forms should evolve

From forax at univ-mlv.fr Wed Jun 14 16:22:37 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 14 Jun 2017 18:22:37 +0200 (CEST)
Subject: What's in a CONSTANT_Class?
In-Reply-To: <2788A8A0-DCB4-4AD0-A69B-EC1B08F4B68F@oracle.com>
References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com> <2788A8A0-DCB4-4AD0-A69B-EC1B08F4B68F@oracle.com>
Message-ID: <1608503280.1910604.1497457357802.JavaMail.zimbra@u-pem.fr>

Hi Karen,

With my ASM Hat,
both CONSTANT_Class_info ";Q" and CONSTANT_ValueType_info that references a UTF8 are Ok for me.

Weirdly, having a CONSTANT_Value_info that references a CONSTANT_Class_info is a little harder to implement because the implementation of ASM is sensitive to the number of levels of indirection (it's hardcoded to be 4, a constant method handle has 4 levels).

On the longer term, I think that the spec of CONSTANT_Class should be changed to accept a class descriptor and not a class name (which it is not, BTW, because arrays are accepted in order to encode a method call to an array clone()).
It will allow more sharing and unlike a class name, a class descriptor is an extensible format.

From the VM point of view, it's easy to know if a CONSTANT_Class is a descriptor or not, if it's a descriptor, the last character is a ';'.

I also think that the bytecode version corresponding to 10 should require that all CONSTANT_Class entries are encoded as class descriptors.
regards,
Rémi

----- Original Message -----
> From: "Karen Kinnear"
> To: "Dan Smith"
> Cc: valhalla-spec-experts at openjdk.java.net
> Sent: Wednesday, 14 June 2017 17:54:07
> Subject: Re: What's in a CONSTANT_Class?

> Update from hotspot implementation:
>
> We would like to request that for the MVT Early Access we keep the TEMPORARY
> CONSTANT_Class_info ";Q".
>
> This is far easier for us to implement (we have a prototype in progress) and we
> believe that it will be easier
> for bytecode generators to adopt - which will allow us to get more people trying
> MVT so we get more feedback.
>
> We would also like to keep the explicit separate name for the derived value
> class, so that from an implementation
> standpoint we are able to continue to use the name, class loader pair as a
> unique lookup.
> So the JVMS as proposed explicitly calls out 5.3 Creation and Loading that the
> derived value class has the name ClassName$Value.
>
> For Early Access we would like to keep this naming convention, stable across
> reboots, so people can generate byte codes
> that reference value types by name distinctly from their value capable class.
>
> thanks,
> Karen
>
> p.s. this will allow us time to do the longer-term exploration of where the
> class/type/constant pool forms should evolve

From john.r.rose at oracle.com Wed Jun 14 21:55:23 2017
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 14 Jun 2017 14:55:23 -0700
Subject: What's in a CONSTANT_Class?
In-Reply-To: <1608503280.1910604.1497457357802.JavaMail.zimbra@u-pem.fr>
References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com> <2788A8A0-DCB4-4AD0-A69B-EC1B08F4B68F@oracle.com> <1608503280.1910604.1497457357802.JavaMail.zimbra@u-pem.fr>
Message-ID: <13674CD0-5DA9-4F06-A76A-CDCC3EE52042@oracle.com>

On Jun 14, 2017, at 9:22 AM, Remi Forax wrote:
>
> With my ASM Hat,
> both CONSTANT_Class_info ";Q"
and CONSTANT_ValueType_info that references a UTF8 are Ok for me.

Between those two I prefer the first since it doesn't require a new CP tag.

> Weirdly, having a CONSTANT_Value_info that references a CONSTANT_Class_info is a little harder to implement because the implementation of ASM is sensitive to the number of levels of indirection (it's hardcoded to be 4, a constant method handle has 4 levels).

Interesting fact. Won't that have to change with condy?
That allows bootstrap specifications to be recursive.

> On the longer term, I think that the spec of CONSTANT_Class should be changed to accept a class descriptor and not a class name (which it is not, BTW, because arrays are accepted in order to encode a method call to an array clone()).
> It will allow more sharing and unlike a class name, a class descriptor is an extensible format.

[Flat strings won't take us there]

Remi, flat strings don't go far enough. They are moderately extensible, and certainly accommodate new ground types like QFoo; and UFoo;, but there are two big problems. First, they suffer from combinatorial explosion (*less* sharing in flat strings) and second they incompletely support expression-holes which are required when we get to generics.

We live with the combinatorial problems of method type descriptors, but I think that's a place we want to retreat from. (Look at the encoding of (Object,Object,Object)Object: The flatness requires repetition of the whole qualified name four times, just in this one descriptor.) When we go to parameterized types, ground types will have multiple levels of nesting, which turns the problem from quadratic to exponential. At that point it's more than today's irritant. You can patch this with repeat operators, but the natural format is a tree, which represents all subparts uniformly, rather than some as a defining use, and others as repeated uses.

[String-tagged shallow trees]

For non-ground generic types, a type string could be something like a format string.
(The format "hello, %s" has a string-typed hole.) In that case, the string doesn't give you everything you need; it must be joined by a vector of operands. At that point you've invented trees, and then the real question is whether tree nodes should be tagged by format strings (an infinite number of them) or by a handful of simple CP-style tags.

I handled both these issues in Pack200 with the CONSTANT_Signature CP type (present only in Pack200 archives), whose content is a format string (with N>=0 holes) plus an implicitly counted vector of (N) CP refs of type CONSTANT_Class. (Primitives are inlined.) For technical reasons the hole syntax, if any, must be different from either string format notations and Pack200 with future JVMs; I think it should be a simple period '.'. (For discussion of signature meta-characters see my "Symbolic Freedom" manifesto ca. 2008.)

For values+generics we'll probably want to look at an experimental design like this that uses string-tagged tree nodes. They are very compact (hence their use in Pack200).

[Byte-tagged deep trees]

But I think for ease of tooling we will end up with the other option, which is *more* tree nodes tagged by a very small finite set of CP-style tags. This is why I support designs like the ones Dan has been sketching.

In that style of tree, a format string like "hello, %s" breaks down into nested AST (Append[Literal["hello, "],Param[]]). Instead of parsing the string to find holes, the holes are directly represented, along with every other part, in a strongly-typed AST tree.

An advantage of Dan-style trees is they are more strongly normalizing. With the format-based trees you always have small types sliding inline into the format strings, or out as explicit nodes (for uses like ldc). The programmer's educated instincts prefer one way to say one thing, rather than many ways to say the same thing. Stronger normalization leads to better compactness and fewer bugs.

[Constant inlining?]
Dan-style trees *could* be made much more compact, comparable to format strings, by extending the CP to support inlining of constant expressions into other expressions. This weakens the strong normalization of constants, but at a lower level where it can be hidden; constants presented via tools like ASM can be normalized easily, with a single clever rule ("unwind the inlining by making temporary CP nodes"). ASM does stuff like this in reverse already, by interning ("normalizing") constants.

We probably need something like this anyway, for the future CONSTANT_Group syntax, which doesn't pay for itself if it has to burn its way through the limited (u2) index space of the CP; so it needs some form of inlining, for constants that occur only inside the group and don't need global sharing.

> From the VM point of view, it's easy to know if a CONSTANT_Class is a descriptor or not, if it's a descriptor, the last character is a ';'.

Actually, for the proposed extension, you look at the *first* character to see if it is a ';'. It's a different place (already existing) in the system where you check to see whether the name is of the form Foo or "LFoo;", and strip the decorations in the latter case. You *could* get away with Class["QFoo;"] but I don't recommend it, because it's a little harder to decode for both human readers and parsers.

> I also think that the bytecode version corresponding to 10 should require that all CONSTANT_Class entries are encoded as class descriptors.

If I understand what you are saying, that's not MVT at all, since it would force a revolution in tools. So we won't do that. It's overwhelmingly likely that legacy uses of CONSTANT_Class will coexist with new CP forms for multiple releases, even if this gives up the advantages of normal forms.
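
Editor's sketch (not part of the original message): the first-character check described above might look roughly like the following. This is a minimal illustration under stated assumptions. The class ClassConstantForm, its Kind enum, and both method names are hypothetical, and the sketch only recognizes the temporary ";" + Q-descriptor form discussed in this thread alongside the existing plain-name and array forms.

```java
// Hypothetical helper: classifies the Utf8 string behind a CONSTANT_Class entry.
final class ClassConstantForm {
    enum Kind { PLAIN_NAME, ARRAY_DESCRIPTOR, VALUE_DESCRIPTOR }

    static Kind classify(String utf8) {
        if (utf8.isEmpty()) {
            throw new IllegalArgumentException("empty class name");
        }
        char first = utf8.charAt(0);
        if (first == ';') {
            // Temporary MVT form: ";" followed by a field descriptor such as "QFoo;".
            boolean wellFormed = utf8.length() >= 4
                    && utf8.charAt(1) == 'Q'
                    && utf8.endsWith(";");
            if (!wellFormed) {
                throw new IllegalArgumentException("malformed value descriptor: " + utf8);
            }
            return Kind.VALUE_DESCRIPTOR;
        }
        if (first == '[') {
            return Kind.ARRAY_DESCRIPTOR;   // existing array form, e.g. "[LFoo;"
        }
        return Kind.PLAIN_NAME;             // existing plain form, e.g. "Foo"
    }

    // Strips the leading ";Q" and the trailing ";" from a value descriptor.
    static String valueClassName(String utf8) {
        if (classify(utf8) != Kind.VALUE_DESCRIPTOR) {
            throw new IllegalArgumentException("not a value descriptor: " + utf8);
        }
        return utf8.substring(2, utf8.length() - 1);
    }
}
```

Because ';' cannot begin a legal class name, a single look at the first character is enough to tell the new form apart from legacy entries, which is the decoding convenience argued for above.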
[In the crystal ball]

Beyond MVT, the CONSTANT_Class[";QFoo;"] wants to become either a Pack200 style thing like this:
  CONSTANT_Type[format="Q.;", args={ClassFile["Foo"]}]
or (preferentially) a Dan-tree-like thing like this:
  CONSTANT_ClassType[mode=Q, class=ClassFile["Foo"]]

-- John

P.S. [Side notes on CP-ology]

When I write CP AST I like to omit "CONSTANT_" from nested nodes, and elide Utf8 nodes completely around strings. Makes it almost like a real notation.

Format-strings for CP nodes would use a previously-unusable character for a hole; the period '.' fits nicely and looks like a hole. The arity of the node is determined Pack200-style by counting the holes, or else with an explicit u2 field. Pack200 simply counts the 'L' characters, but this breaks down with 'Q' and 'U'.

Note that both format-strings and proper trees supply a solution for the quadratic length of method type descriptors. Pack200 uses CONSTANT_Signature for both method and field types.

CONSTANT_Signature["(L;L;L;)L;", Class("Object"), Class("Object"), Class("Object"), Class("Object")] # one child node mentioned 4x

From john.r.rose at oracle.com Wed Jun 14 21:57:54 2017
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 14 Jun 2017 14:57:54 -0700
Subject: What's in a CONSTANT_Class?
In-Reply-To: <2788A8A0-DCB4-4AD0-A69B-EC1B08F4B68F@oracle.com>
References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com> <2788A8A0-DCB4-4AD0-A69B-EC1B08F4B68F@oracle.com>
Message-ID: <918628DD-F215-43CA-A3B0-91CC0494F2C3@oracle.com>

On Jun 14, 2017, at 8:54 AM, Karen Kinnear wrote:
>
> We would like to request that for the MVT Early Access we keep the TEMPORARY CONSTANT_Class_info ";Q".

Nit: For uniformity, the syntax wants to be ";" + field_signature, which implies ";Q;". Without that uniformity you need to specify a third syntax (neither field nor method signature), which is not good spec.
economy, even for a temporary feature.

From forax at univ-mlv.fr Thu Jun 15 22:09:00 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 16 Jun 2017 00:09:00 +0200 (CEST)
Subject: What's in a CONSTANT_Class?
In-Reply-To: <13674CD0-5DA9-4F06-A76A-CDCC3EE52042@oracle.com>
References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com> <2788A8A0-DCB4-4AD0-A69B-EC1B08F4B68F@oracle.com> <1608503280.1910604.1497457357802.JavaMail.zimbra@u-pem.fr> <13674CD0-5DA9-4F06-A76A-CDCC3EE52042@oracle.com>
Message-ID: <2095284512.2618584.1497564540123.JavaMail.zimbra@u-pem.fr>

> From: "John Rose"
> To: "Rémi Forax"
> Cc: "Karen Kinnear", valhalla-spec-experts at openjdk.java.net
> Sent: Wednesday, 14 June 2017 23:55:23
> Subject: Re: What's in a CONSTANT_Class?

> On Jun 14, 2017, at 9:22 AM, Remi Forax < forax at univ-mlv.fr > wrote:

>> With my ASM Hat,
>> both CONSTANT_Class_info ";Q" and CONSTANT_ValueType_info that references
>> a UTF8 are Ok for me.

> Between those two I prefer the first since it doesn't require a new CP tag.

>> Weirdly, having a CONSTANT_Value_info that references a CONSTANT_Class_info is
>> a little harder to implement because the implementation of ASM is sensitive to
>> the number of levels of indirection (it's hardcoded to be 4, a constant method
>> handle has 4 levels).

> Interesting fact. Won't that have to change with condy?
> That allows bootstrap specifications to be recursive.

I have not implemented condy yet, but I believe it's not an issue if the way to look up a value is lazy: the user code will be recursive and not the ASM internals.

>> On the longer term, I think that the spec of CONSTANT_Class should be changed to
>> accept a class descriptor and not a class name (which it is not, BTW, because arrays
>> are accepted in order to encode a method call to an array clone()).
>> It will allow more sharing and unlike a class name, a class descriptor is an
>> extensible format.

> [Flat strings won't take us there]

> Remi, flat strings don't go far enough. They are moderately
> extensible, and certainly accommodate new ground types like
> QFoo; and UFoo;, but there are two big problems. First, they
> suffer from combinatorial explosion (*less* sharing in flat strings)
> and second they incompletely support expression-holes which
> are required when we get to generics.

(BTW, neither you nor Karen answered my mail asking why we need UFoo; )

I agree that we may need a tree of constants, but only if the interpreter needs that in order to interpret the code.
In my opinion, we should only use a tree of constants if it makes sense for the interpreter; otherwise, the constant should be flattened as a String.

For example, an interpreter of Java 9 does not need to extract the types from a method descriptor in order to run (the verifier does but not the interpreter), so for me a method descriptor does not need to be a tree of constants, at least for Java 9.

> We live with the combinatorial problems of method type descriptors,
> but I think that's a place we want to retreat from. (Look at the encoding
> of (Object,Object,Object)Object: The flatness requires repetition of
> the whole qualified name four times, just in this one descriptor.)
> When we go to parameterized types, ground types will have multiple
> levels of nesting, which turns the problem from quadratic to
> exponential. At that point it's more than today's irritant.
> You can patch this with repeat operators, but the natural format
> is a tree, which represents all subparts uniformly, rather than some
> as a defining use, and others as repeated uses.

I fully agree.
Specializing the code should not require patching constants of the constant pool.
The patchable content should be represented by an index inside a tree and the interpreter should maintain an array (in fact two arrays because you have method parameter types and class parameter types) of the corresponding type arguments.

> [String-tagged shallow trees]

> For non-ground generic types, a type string could be something
> like a format string. (The format "hello, %s" has a string-typed hole.)
> In that case, the string doesn't give you everything you need;
> it must be joined by a vector of operands. At that point you've
> invented trees, and then the real question is whether tree nodes
> should be tagged by format strings (an infinite number of them)
> or by a handful of simple CP-style tags.

> I handled both these issues in Pack200 with the CONSTANT_Signature
> CP type (present only in Pack200 archives), whose content is a format
> string (with N>=0 holes) plus an implicitly counted vector of (N) CP refs
> of type CONSTANT_Class. (Primitives are inlined.) For technical
> reasons the hole syntax, if any, must be different from either string
> format notations and Pack200 with future JVMs; I think it should be
> a simple period '.'. (For discussion of signature meta-characters see
> my "Symbolic Freedom" manifesto ca. 2008.)

> For values+generics we'll probably want to look at an experimental design
> like this that uses string-tagged tree nodes. They are very compact (hence
> their use in Pack200).

The problem with String tagging is that you often need to allocate a new string when you do the replacement, or you have to be clever (and play with stack allocated iterators), but having a real tree may be worse if the interpreter data structure and the tree of constants in the constant pool do not match.

As for compactness, being easier to interpret is, in my opinion, more important than being more compact.
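
Editor's sketch (not part of the original message): to make the string-tagged form concrete, here is a minimal expansion of a format string with '.' holes plus a vector of class names into a flat descriptor. The class SignatureFormat and its expand method are hypothetical, and the '.' hole syntax is the one proposed in this thread rather than the existing Pack200 encoding. Note that the expansion builds a fresh string each time, which is the allocation cost mentioned above.

```java
import java.util.List;

// Hypothetical helper: fills '.' holes in a signature format with class names.
final class SignatureFormat {
    static String expand(String format, List<String> classNames) {
        StringBuilder out = new StringBuilder();   // a new string is built on every expansion
        int next = 0;
        for (int i = 0; i < format.length(); i++) {
            char c = format.charAt(i);
            if (c == '.') {
                if (next == classNames.size()) {
                    throw new IllegalArgumentException("more holes than operands");
                }
                out.append(classNames.get(next++));
            } else {
                out.append(c);
            }
        }
        if (next != classNames.size()) {
            throw new IllegalArgumentException("more operands than holes");
        }
        return out.toString();
    }
}
```

For instance, the format "(L.;L.;L.;)L.;" with four references to java/lang/Object expands to the flat descriptor that repeats the qualified name four times, while the format itself mentions the name zero times and the operand vector can share one constant.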
> [Byte-tagged deep trees]

> But I think for ease of tooling we will end up with the other option,
> which is *more* tree nodes tagged by a very small finite set of
> CP-style tags. This is why I support designs like the ones
> Dan has been sketching.

> In that style of tree, a format string like "hello, %s" breaks down into
> nested AST (Append[Literal["hello, "],Param[]]). Instead of parsing
> the string to find holes, the holes are directly represented, along
> with every other part, in a strongly-typed AST tree.

> An advantage of Dan-style trees is they are more strongly normalizing.
> With the format-based trees you always have small types sliding inline
> into the format strings, or out as explicit nodes (for uses like ldc).
> The programmer's educated instincts prefer one way to say one
> thing, rather than many ways to say the same thing. Stronger
> normalization leads to better compactness and fewer bugs.

yes

> [Constant inlining?]

> Dan-style trees *could* be made much more compact, comparable
> to format strings, by extending the CP to support inlining of constant
> expressions into other expressions. This weakens the strong normalization
> of constants, but at a lower level where it can be hidden; constants
> presented via tools like ASM can be normalized easily, with a single
> clever rule ("unwind the inlining by making temporary CP nodes").
> ASM does stuff like this in reverse already, by interning ("normalizing")
> constants.

> We probably need something like this anyway, for the future
> CONSTANT_Group syntax, which doesn't pay for itself if it has to
> burn its way through the limited (u2) index space of the CP; so it
> needs some form of inlining, for constants that occur only inside
> the group and don't need global sharing.

>> From the VM point of view, it's easy to know if a CONSTANT_Class is a descriptor
>> or not, if it's a descriptor, the last character is a ';'.
> Actually, for the proposed extension, you look at the *first* character to see
> if it is a ';'. It's a different place (already existing) in the system where you
> check to see whether the name is of the form Foo or "LFoo;", and strip
> the decorations in the latter case. You *could* get away with Class["QFoo;"]
> but I don't recommend it, because it's a little harder to decode for both
> human readers and parsers.

I do not understand why?

>> I also think that the bytecode version corresponding to 10 should require that
>> all CONSTANT_Class are encoded as class descriptors.

> If I understand what you are saying, that's not MVT at all, since it
> would force a revolution in tools. So we won't do that. It's overwhelmingly
> likely that legacy uses of CONSTANT_Class will coexist with new
> CP forms for multiple releases, even if this gives up the advantages
> of normal forms.

Yes, it's post-MVT, and given there will be other changes in the constant pool, tools will need to be updated, so we can also mandate that CONSTANT_Class use only the descriptor format at that time.

> [In the crystal ball]
> Beyond MVT, the CONSTANT_Class[";QFoo;"] wants to become either
> a Pack200 style thing like this:
> CONSTANT_Type[format="Q.;", args={ClassFile["Foo"]}]
> or (preferentially) a Dan-tree-like thing like this:
> CONSTANT_ClassType[mode=Q, class=ClassFile["Foo"]]

I disagree. I think we should limit ourselves to using a tree of constants only when we need type substitution. Constant_Class("QFoo;") gets you enough information, and if the format is normalized to always be a descriptor, you can easily disambiguate using the first character.

> - John

Rémi

> P.S. [Side notes on CP-ology]
> When I write CP AST I like to omit "CONSTANT_" from
> nested nodes, and elide Utf8 nodes completely around strings.
> Makes it almost like a real notation.
> Format-strings for CP nodes would use a previously-unusable
> character for a hole; the period '.' fits nicely and looks like a hole.
> The arity of the node is determined Pack200-style by counting
> the holes, or else with an explicit u2 field. Pack200 simply counts
> the 'L' characters, but this breaks down with 'Q' and 'U'.
> Note that both format-strings and proper trees supply a solution
> for the quadratic length of method type descriptors. Pack200
> uses CONSTANT_Signature for both method and field types.
> CONSTANT_Signature["(L;L;L;)L;", Class("Object"), Class("Object"),
> Class("Object"), Class("Object")] # one child node mentioned 4x

From john.r.rose at oracle.com Fri Jun 16 02:18:31 2017
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 15 Jun 2017 19:18:31 -0700
Subject: What's in a CONSTANT_Class?
In-Reply-To: <2095284512.2618584.1497564540123.JavaMail.zimbra@u-pem.fr>
References: <3BE97418-0B14-4322-9D74-75D95F6C19AB@oracle.com> <03B61FDE-7B70-4ABE-A320-56F20374BF8B@oracle.com> <559248EF-F86D-4DA9-8BBE-AC737C5AD792@oracle.com> <2788A8A0-DCB4-4AD0-A69B-EC1B08F4B68F@oracle.com> <1608503280.1910604.1497457357802.JavaMail.zimbra@u-pem.fr> <13674CD0-5DA9-4F06-A76A-CDCC3EE52042@oracle.com> <2095284512.2618584.1497564540123.JavaMail.zimbra@u-pem.fr>
Message-ID: <493FD5BE-E29E-4A87-94AC-81351FC051C4@oracle.com>

On Jun 15, 2017, at 3:09 PM, forax at univ-mlv.fr wrote:
>
> (BTW, neither you nor Karen answered my mail asking why we need UFoo;)

I'm working on a manifesto about this. Short answer is Q-Foo and L-Foo need to be disjoint types, so that interpreter processing of L-Foo is undisturbed and Q-Foo can use non-heap buffering. But some language features require a *union* of Q-Foo and L-Foo. That could be Q-MaybeRef, but it is so fundamental to translation strategies that it seems to merit a new type-kind. It can be the final one, since it is a disjoint union containing all values of the other type-kinds (L,Q,I,J,F,D).
To soften the blow, we can then align Q and U type-kinds to use exactly the same representation, so that in practice verified Q-values are carried using the same format as verified U-values (of the same class) but with a little less exercise of the power of the carrier (refs and nulls don't appear).

The motivating uses of U-types are any-kinded type parameters and interfaces which are implemented by values. Once accepted, there are many serendipitous uses of U-types that arise, including use cases where you want "Q-Foo or null" (that's U-Foo) or "Foo and I don't want to know if it was boxed or not" (U-Foo again). Finally, putting U-types in the heap, if we get that far, gives us frozen arrays and frozen objects "for free".

Trust me, we spent many months trying to find a way to implement type-vars and interfaces without U-types and the alternatives were all worse. (Always box for interface calls, but no heisenboxes? No. Always box any-type vars? Also no. Always specialize any-generic code at bytecode level? No, no, no. Use U-types for operands in generic algos and interface defaults? Yes!)

>
> I agree that we may need a tree of constants but only if the interpreter needs that in order to interpret the code.
> In my opinion, we should only use a tree of constants if it makes sense for the interpreter, otherwise, the constant should be flattened as a String.

There are other reasons to avoid strings, notably better type checking. Also footprint, if the complexity exponent goes above 2 (as I already argued).

>
> You can patch this with repeat operators, but the natural format
> is a tree, which represents all subparts uniformly, rather than some
> as a defining use, and others as repeated uses.
>
> I fully agree. Specializing the code should not require patching constants of the constant pool.
> The patchable content should be represented by an index inside a tree and the interpreter should maintain an array (in fact two arrays because you have method parameter types and class parameter types) of the corresponding type arguments.
>

(Another manifesto I'm working on!) IMO, enhancing constant pools so they can be smoothly parameterized is Job One for VM support of extended generics.

>
> Actually, for the proposed extension, you look at the *first* character to see
> if it is a ';'. It's a different place (already existing) in the system where you
> check to see whether the name is of the form Foo or "LFoo;", and strip
> the decorations in the latter case. You *could* get away with Class["QFoo;"]
> but I don't recommend it, because it's a little harder to decode for both
> human readers and parsers.
>
> I do not understand why?

We read a prefix first so it sets a context, and then we read the rest. We human readers are used to this. It's like backslash superquoting. If you don't know what you are reading until you read the end, then you have to read it twice.

For a computer, the flag is at foo[0] rather than foo[foo.length-1], which is a little less complex. And putting the flag at the beginning allows the computer to stream over the text, choosing a parser up front, and then like the human reader it can "read the rest" with confidence. Streaming over small strings like this has no performance advantage, but streaming code is easier to understand and reason about, hence less buggy.

There's also my prejudice, to be frank. I've always been annoyed by the obtuseness of the check sym[0]=='L' && sym[sym.length-1]==';'. That sort of oddity breeds bugs, compared to a left-to-right parse.

>
>
> If I understand what you are saying, that's not MVT at all, since it
> would force a revolution in tools. So we won't do that.
It's overwhelmingly
> likely that legacy uses of CONSTANT_Class will coexist with new
> CP forms for multiple releases, even if this gives up the advantages
> of normal forms.
>
> Yes, it's post-MVT, and given there will be other changes in the constant pool, tools will need to be updated, so we can also mandate that CONSTANT_Class use only the descriptor format at that time.

Those mandates are harder to pull off than they look beforehand. (Who knew modules would be so excruciatingly hard to "mandate"?) I think we'll have to do a gentle introduction with peaceful coexistence for at least a couple of years.

>
> [In the crystal ball]
>
> Beyond MVT, the CONSTANT_Class[";QFoo;"] wants to become either
> a Pack200 style thing like this:
>
> CONSTANT_Type[format="Q.;", args={ClassFile["Foo"]}]
>
> or (preferentially) a Dan-tree-like thing like this:
>
> CONSTANT_ClassType[mode=Q, class=ClassFile["Foo"]]
>
> I disagree. I think we should limit ourselves to using a tree of constants only when we need type substitution.
> Constant_Class("QFoo;") gets you enough information, and if the format is normalized to always be a descriptor, you can easily disambiguate using the first character.

The repetition of the string "Foo" in two constants is a failure of normalization. This is not just theoretical: if we have one ClassFile["Foo"] in the whole CP for getting at "Foo.class", then we have a clear (non-buggy) algorithm for making queries about the contents of "Foo.class". This is something the VM team has asked for, if I'm not mistaken. (Karen?)

So, more-lighter trees are not just for the interpreter, not just for type variables, but have a number of properties which make them (on balance) better than fewer-heavier trees, or just strings.

John

From daniel.smith at oracle.com Fri Jun 16 06:00:14 2017
From: daniel.smith at oracle.com (Dan Smith)
Date: Fri, 16 Jun 2017 00:00:14 -0600
Subject: Value class spec updates
Message-ID:

Please see the following JVM spec documents:

[1] A minor update to the previously-shared document; changes detailed below.
[2] An adjustment to the Value Classes spec introducing CONSTANT_ClassType to represent direct value class types.
[3] An adjustment to the Value Classes spec to allow value classes to be declared explicitly, rather than derived.

Changes to [1]:
- Allowed vbox to trigger initialization of the VCC.
- Reordered runtime exceptions of vunbox (NPE, not ICCE, if the input is null).
- Added some minor changes to account for the broader domain of CONSTANT_Class strings in 4.7.3, 4.7.5, 4.10.1.3, and 'new'.

On my TODO list:
- Investigate exactly which steps should be taken when loading a $Value class triggers loading of a VCC
- Clean up verification rules to enforce the 4.9.1 static constraints (and maybe move some format checking/static constraints to resolution time?)

[2] and [3] are not necessarily part of MVT, but are natural next steps that we have discussed recently. [2] sets us up for introducing as many type structures as we want and moving away from using CONSTANT_Class to represent types (while continuing to provide legacy support). [3] seems the best path towards dropping the "$Value" convention and, ultimately, having a Q/L pair of types pointing to a single class name. Not sure whether [3] is a viable stopping point for tools and APIs, though, without taking another step to add support for boxed L types.
-Dan

[1] http://cr.openjdk.java.net/~dlsmith/values.html
[2] http://cr.openjdk.java.net/~dlsmith/values-classtype.html
[3] http://cr.openjdk.java.net/~dlsmith/values-declaration.html

From john.r.rose at oracle.com Tue Jun 20 01:28:28 2017
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 19 Jun 2017 18:28:28 -0700
Subject: notes from Valhalla meeting 5/24/17
In-Reply-To: <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com>
References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com>
Message-ID:

On Jun 9, 2017, at 1:51 PM, John Rose wrote:
>
> We have talked about "condy" (a constant-pool friend for indy).
> Enclosed is the javadoc portion of a draft spec.
> The JVM spec. is not ready and will be sent separately.
> This javadoc reflects code checked into the condy branch of Amber.

I have updated the javadoc portion of the spec for constant-dynamic. There are small but strategic changes to MethodType and MethodHandle, as well as new classes to represent the "pull mode" of constant resolution. The key changes relevant to the JVM are in package-info.html, and (of course) in the forthcoming JVM spec. changes.

http://cr.openjdk.java.net/~jrose/jvm/specdiff-condy-2017-0619.zip

(I've given up sending attachments for now since the server scrubs them.)

- John

From john.r.rose at oracle.com Wed Jun 21 08:46:22 2017
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 21 Jun 2017 01:46:22 -0700
Subject: notes from Valhalla meeting 5/24/17
In-Reply-To:
References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com>
Message-ID:

On Jun 19, 2017, at 6:28 PM, John Rose wrote:
>
> I have updated the javadoc portion of the spec for constant-dynamic.
> There are small but strategic changes to MethodType and MethodHandle,
> as well as new classes to represent the "pull mode" of constant resolution.
> The key changes relevant to the JVM are in package-info.html,
> and (of course) in the forthcoming JVM spec. changes.
>
> http://cr.openjdk.java.net/~jrose/jvm/specdiff-condy-2017-0619.zip

The forthcoming has come forth. Here is a semi-formatted and color-annotated diff of the JVMS outlining proposed support for dynamic constants, as well as some enhancements to the processing of bootstrap methods:

http://cr.openjdk.java.net/~jrose/jvm/condy-jvms-2017-0620.html

- John

From karen.kinnear at oracle.com Wed Jun 21 14:19:32 2017
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 21 Jun 2017 10:19:32 -0400
Subject: minutes Valhalla EG June 07, 2017
In-Reply-To:
References:
Message-ID:

Dan - I can't find it either. We'll have to ask John.

thanks,
Karen

> On Jun 21, 2017, at 9:55 AM, Daniel Heidinga wrote:
>
> > AI: John - send out javadoc to EG
> > derived class := Class.derivedClassFactory(Class mainClass, T userData, String name)
>
> In the spirit of the usual "5 min before the meeting" ritual action item panic, I'm trying to review the javadoc for this and can't seem to find it. Can it be sent again?
>
> Thanks,
> --Dan
>
> ----- Original message -----
> From: Karen Kinnear
> Sent by: "valhalla-spec-experts"
> To: valhalla-spec-experts at openjdk.java.net
> Cc:
> Subject: minutes Valhalla EG June 07, 2017
> Date: Wed, Jun 7, 2017 4:28 PM
>
> Valhalla EG Minutes June 07, 2017
>
> attendees: Bjorn, Dan H, Dan S, John, Vlad, Frederic, Lois, Brian, Maurizio, Karen
>
> AI ALL:
> Dan Smith sent an initial draft of a JVMS with experimental support for MVT for review.
> Feedback in email requested - sooner rather than later please.
> AI ALL:
> Review embedded proposal for issue 1 - John javadoc to avoid exposing internal derived value type name
> Review embedded proposal for EA for handling CONSTANT_Class
>
> Timing note:
> Value type exploration is following three timeframes:
> Minimal Value Types Early Access (EA) - goal: ASAP so we can get feedback from initial users
> Minimal Value Types (MVT) - goal: w/JDK10 for much broader feedback
> Valhalla Value Types - "real" vs. shady values - much richer feature set
> Some of the issues we are exploring - such as type vs. class will need to evolve, so we need
> to reach decisions on our initial EA stake in the ground ASAP.
> For that - review of and conclusions to JVMS and other open issues is needed.
>
> Issue 1: Exposure of mirror and mirror name for the value class
> Bjorn: (please correct any inaccuracies)
> IBM implementation does NOT expose the value type mirror name
> ValueType.valueClass is the only way to get the value type mirror
> getClassName returns the same answer
> 2 java objects, same underlying data
> no internal derived value type name is exposed
>
> John: proposal for breaking the link to the secondary mirror
> Model is that there is one primary mirror and multiple secondary mirrors
> Brian: one nominal class and multiple derived classes analogous to a DirectMethodHandle and derived
> MethodHandles
> Later reflection could add APIs at the java level to get the secondary mirrors
> - has an initial proposal in which you pass in
> head class, user data (e.g. value type descriptor), user-chosen name
> name is not resolvable, doesn't work for findClass, but visible when reflecting
>
> Dan H: do we need to ensure user name/user data consistent? That has been an issue in related APIs?
> John: no
> Karen: assume we can not use this name to look up a class (forName)? just for reflection to print?
> John: not for lookup
>
> Maurizio: this could be useful today (i.e.
for EA) for a value class
> Issue: Reflection behavior for EA
> Karen: we already agreed reflection will not work - will throw an exception
> Maurizio: it could be actually easier to use John's factory than to throw an exception
>
> Timing:
> AI: John - send out javadoc to EG
> derived class := Class.derivedClassFactory(Class mainClass, T userData, String name)
> All: evaluate proposal both for doability
> also evaluate for timing: EA vs. MVT?
>
>
> Issue 2: Constant Pool representation for derived value type (JVMS term: value class)
>
> Goals:
> 1. cache point for usage - need separate storage for DVT and VCC
> 2. prefer not to do string parsing over and over to get the mode
> 3. verifier ensure type safety without additional eager class loading
> 4. ensure single resolution of underlying value-capable-class
> (longer-term want single resolution of underlying source classfile)
> 5. allow implementations to support older classfiles
> 6. tool support - make sure this works for a mix of constant pool changes
> e.g. tools that do not know about new versions still instrument new classfiles
> - need to make sure these still work as much as possible
> - so for these folks we need to not change the meaning of CONSTANT_Class
> 7. future - make sure the model works for future derivation from more than one type
> - e.g. Foo
> 7a. request that for a Parameterized Type: this_class (name and CONSTANT_Class today)
> allows lazy resolution of the list
> (ed. note: need to discuss details of "lazy" here - loading the class file perhaps,
> but instantiating a type from it will need the parameterizations, so far we have
> conceptually recorded the loaded class file under the "head" type, with default/erased
> parameterizations)
> 8. upside opportunity: Constable and pattern matching - helpful if all class objects
> were represented the same way when generating bytecode
> e.g. int.class vs. Integer.class require different handling today
> 9.
migration: a class should be able to migrate to being a value type
> approach: will require boxing to access, but if you pass for example a boxed value type
> the current client should continue to work
> 10. migration: value type to reference? Open question
>
> 11. ed. note: we did not mention that for MVT we actually have multiple source classfiles and
> at least one potential prototype for parameterized types also generates separate classfiles.
> While we strongly do not want to build in the concept of multiple separate classfiles, it
> would be valuable if the constant pool representation was able to support that.
> This might help extend to nested classes as well.
>
> John:
>
> two views of CONSTANT_Class:
> 1. is it a type? then need nested CONSTANT_ClassFile: Class[ClassFile]
> 2. is it a loaded class file? then need surrounding decoration Type[Class]
> bad third choice: 3. use separately resolved peers Class["name.1"], Class["name.2"] where name is mangled but
> refers to same loaded class file
>
> Today: CONSTANT_Class represents both the type and the loaded class file.
>
> Dan S:
> Prefers option 1: type with a reference to a raw classfile
> (ed. note - Dan S - I didn't get any details on why you prefer this for longer-term, it would help
> to understand)
> (ed. note - Dan S - can you add some notes on how to represent goal #11?)
>
> Bjorn:
> Are there any cases in which a classfile does not represent a reference type?
> Valhalla - goal is that the classfile represents both value type and the boxed value type
>
> Proposal from John/Karen:
> CONSTANT_Class_info: used for both the "head" LType and the classfile - conflate for backward compatibility
> CONSTANT_Value_info: mode information and references the underlying CONSTANT_Class_info
> Parameterized type: has auxiliary info and references at least one underlying (head) CONSTANT_Class_info
> e.g.
List would reference List class, and Foo would be linked to in auxiliary info
>
> AI: All
> Explore this potential model:
> Would this make sense from a JVMS perspective?
> Would this work for JVM implementations?
> Would this work for bytecode generation etc?
> - please check if this is feasible short-term, i.e. for EA
> - also explore if we could "flop" to #1 later
>
> Feedback:
> John: short term: Try the proposal above with CONSTANT_Value_info referring to an underlying CONSTANT_Class_info
> Spec direction is likely to be CONSTANT_Class references a Classfile
> Lois: ok with verifier
> Frederic: ok with bytecodes
> Bjorn: prefer #3 ;Q - for MVT - it has minimal impact
> ok with #2, but the change wouldn't slow us down much
> Frederic: today - handle by using separate opcodes and all have CONSTANT_Class
> (ed. note - and we haven't implemented the verifier which would like to sanity check
> that the CONSTANT_Class matches the expected type in the opcode)
>
>
> ---
> 3. Ways to phase out current classfile capabilities such as CONSTANT_Class? Or LDC CONSTANT_Class?
> Brian: jigsaw added the (not so popular) concept of runtime warnings
> Maurizio - jsr/ret deprecated (John: ACC_SUPER) - not used much - javac was primary client at the time
> StackMapTable added - lots of complaints
>
> What if long-term we were to derive arrays using multi-level constant pool entries
> What if we were to support Q[ as well as L arrays?
> - immutable, non-nullable, identityless
>
> John: What if we were to evolve CONSTANT_MethodType - currently a flat string - to use a tree-structured approach?
>
> Dan H: limited numbers of places we parse descriptors
> Karen: don't slow down our resolution method and field lookup
> Dan H: limited to resolution time
> Dan S: might be faster to look up using tree comparison vs. extra long symbols
> ed. note: worth exploring - just be really sure there is no potential for ambiguity,
> i.e. at any resolution step: do you have a match for X or Y
>
> ---
> 4.
Box and value implementation relationships
> goal: reduce costs when boxing/unboxing
> -e.g. same layout alignment for fields
> - what if we could just change 1 bit in the carrier
>
> ---
> 5. Opcode proposal:
> drop vgetfield, overload getfield instead
> Bjorn: concern - not want performance impact on existing opcodes
> John: propose discard the extra defined opcode and leave room for quickening
>
> overload: anewarray, multianewarray
>
> Chat room notes on details:
> John:
> uses of C_Class as a class-file: this_class, PType head-type (template) uses of C_Class as a type: ldc,
> ?Field/Methodref?
> Dan S:
> Arbitrary types allowed: anewarray, multianewarray, structural descriptor, ldc, bootstrap argument,
> verification_type_info, maybe field/method refs
> Reference types allowed: checkcast, instanceofReference
> class/species allowed: super_class, interfaces, new, annotation, catch_type, Exceptions
> Plain class required: this_class, InnerClasses, EnclosingMethod, maybe field/method refs
> Maurizio:
> was thinking recently that InnerClasses is probably another 'classfile' use
> John:
> Agree
> ldc is not arbitrary type, b/c ldc int.class not possible; ldc is L-type only
>
>
>

From forax at univ-mlv.fr Wed Jun 21 15:00:12 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 21 Jun 2017 17:00:12 +0200 (CEST)
Subject: Meeting today ?
Message-ID: <2109507833.1734696.1498057212041.JavaMail.zimbra@u-pem.fr>

Hi all,
can somebody send me the correct URL for the meeting today?

cheers,
Rémi

From karen.kinnear at oracle.com Wed Jun 21 15:01:40 2017
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 21 Jun 2017 11:01:40 -0400
Subject: Meeting today ?
In-Reply-To: <2109507833.1734696.1498057212041.JavaMail.zimbra@u-pem.fr>
References: <2109507833.1734696.1498057212041.JavaMail.zimbra@u-pem.fr>
Message-ID: <87C00CEC-8588-40B3-AB45-498E93B40B95@oracle.com>

https://oracle.zoom.us/j/251372518

> On Jun 21, 2017, at 11:00 AM, Remi Forax wrote:
>
> Hi all,
> can somebody send me the correct URL for the meeting today?
>
> cheers,
> Rémi

From forax at univ-mlv.fr Wed Jun 21 15:46:06 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 21 Jun 2017 17:46:06 +0200 (CEST)
Subject: notes from Valhalla meeting 5/24/17
In-Reply-To:
References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com>
Message-ID: <1347399309.1758320.1498059966359.JavaMail.zimbra@u-pem.fr>

Nice work John,
but I do not like this proposal as is. I will explain why and how to fix it:

- condy is linked to a static final field, but unlike invokedynamic, where there is a link from an invokedynamic instruction to a CONSTANT_InvokeDynamic_info, there is no link from the static final field to the CONSTANT_ConstantDynamic_info. Why not reuse the ConstantValue attribute [1] to reference the CONSTANT_ConstantDynamic_info instead (the constantvalue_index can be extended to allow a CONSTANT_ConstantDynamic)?

- condy is a 'dy' like indy, so it should do late, late binding, i.e. be initialized (run the bootstrap method) only the first time someone accesses the static field, exactly as with indy the BSM is called the first time you try to execute the instruction.

In terms of semantics, my proposal does not introduce an item in the constant pool which, unlike any other item, is resolved only by virtue of being in the constant pool. If condy is linked to the ConstantValue of a field, the condy item is resolved when necessary, as usual.

With my ASM hat on, I see how to implement it easily without having to surface the constant pool itself (at least until the items are pointed to by the j.l.i.BootstrapCallInfo).
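The "late binding" behavior argued for above (run the bootstrap logic only on first access, then reuse the result) can be sketched in plain Java. This is only an illustration of the once-only, on-demand semantics: `LazyConstant` is a hypothetical name, a `Supplier` stands in for a bootstrap method, and nothing here models how a JVM would actually resolve a CONSTANT_ConstantDynamic or report linkage errors.

```java
import java.util.function.Supplier;

/** Hypothetical model of a constant whose value is computed on first access only. */
final class LazyConstant<T> {
    private final Supplier<T> bootstrap; // stand-in for a bootstrap method (assumption)
    private volatile boolean resolved;   // written after 'value', so readers see a complete value
    private T value;

    LazyConstant(Supplier<T> bootstrap) {
        this.bootstrap = bootstrap;
    }

    /** Runs the bootstrap supplier at most once, then returns the cached result. */
    T get() {
        if (!resolved) {
            synchronized (this) {
                if (!resolved) {
                    value = bootstrap.get();
                    resolved = true;
                }
            }
        }
        return value;
    }

    boolean isResolved() {
        return resolved;
    }

    public static void main(String[] args) {
        LazyConstant<String> c = new LazyConstant<>(() -> {
            System.out.println("bootstrap runs");
            return "hello";
        });
        System.out.println(c.isResolved()); // nothing has been computed yet
        System.out.println(c.get());        // first access triggers the bootstrap
        System.out.println(c.get());        // later accesses reuse the cached value
    }
}
```

The point of contrast in the thread is when resolution happens: merely declaring the constant does nothing in this model, and only the first `get()` triggers the computation.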
cheers,
Rémi

[1] https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.7.2

----- Original message -----
> From: "John Rose"
> To: valhalla-spec-experts at openjdk.java.net
> Sent: Wednesday, June 21, 2017 10:46:22
> Subject: Re: notes from Valhalla meeting 5/24/17

> On Jun 19, 2017, at 6:28 PM, John Rose wrote:
>>
>> I have updated the javadoc portion of the spec for constant-dynamic.
>> There are small but strategic changes to MethodType and MethodHandle,
>> as well as new classes to represent the "pull mode" of constant resolution.
>> The key changes relevant to the JVM are in package-info.html,
>> and (of course) in the forthcoming JVM spec. changes.
>>
>> http://cr.openjdk.java.net/~jrose/jvm/specdiff-condy-2017-0619.zip
>>
>
> The forthcoming has come forth. Here is a semi-formatted and
> color-annotated diff of the JVMS outlining proposed support for
> dynamic constants, as well as some enhancements to the processing
> of bootstrap methods:
>
> http://cr.openjdk.java.net/~jrose/jvm/condy-jvms-2017-0620.html
>
> - John

From john.r.rose at oracle.com Wed Jun 21 15:51:11 2017
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 21 Jun 2017 08:51:11 -0700
Subject: minutes Valhalla EG June 07, 2017
In-Reply-To:
References:
Message-ID: <4E7249A7-DC7B-497C-B79E-D1471A934847@oracle.com>

On Jun 21, 2017, at 6:55 AM, Daniel Heidinga wrote:
>
> > AI: John - send out javadoc to EG
> > derived class := Class.derivedClassFactory(Class mainClass, T userData, String name)
>
> In the spirit of the usual "5 min before the meeting" ritual action item panic, I'm trying to review the javadoc for this and can't seem to find it. Can it be sent again?
>

Apologies; here it is, 10 min before the meeting.
diff --git a/src/java.base/share/classes/java/lang/Class.java b/src/java.base/share/classes/java/lang/Class.java
--- a/src/java.base/share/classes/java/lang/Class.java
+++ b/src/java.base/share/classes/java/lang/Class.java
@@ -681,6 +681,65 @@
     public native boolean isPrimitive();

     /**
+     * Determines if the specified {@code Class} object is the
+     * primary representative of an underlying class file.
+     * Array and primitive classes are never primaries.
+     * Other reference type constants of the form {@code X.class}
+     * are always primaries.
+     * Value type mirrors are never primaries; their corresponding
+     * box reference types are primaries.
+     *
+     * @return true if and only if this class is the primary
+     * representative of its underlying class file
+     */
+    @HotSpotIntrinsicCandidate
+    public native boolean isPrimary();
+
+    /**
+     * Obtains the primary class corresponding to the specified
+     * {@code Class} object, if this class is a secondary class
+     * derived from a primary class
+     * If this class object is a primary class, it returns the
+     * same class object.
+     * TBD:An array type returns the primary class
+     * of its component type. A primitive type returns the
+     * corresponding wrapper type. A value type returns the
+     * primary class of its box type. A specialized generic
+     * returns the primary class of its template type.
+     * OR, an non-total version:Primitive and array classes
+     * do not have associated primary classes; they return
+     * {@code null} for this query.
+     *
+     * @return the primary representative of the underlying class file
+     */
+    @HotSpotIntrinsicCandidate
+    public native Class getPrimaryClass();
+
+    /**
+     * Creates a new non-primary class for the given primary.
+     * This is an internal factory for non-primary classes.
+     * The user is expected to perform relevant interning,
+     * and manage the type of the user-data component.
+     * @param primary the primary class for the new secondary
+     * @param name arbitrary name string, to be the name of the new secondary
+     * @param userData arbitrary reference to associate with the new secondary
+     * @return a fresh secondary class
+     * @throws IllegalArgumentException if the first argument is not a primary class
+     */
+    /*non-public*/
+    @HotSpotIntrinsicCandidate
+    static native Class makeSecondaryClass(Class primary, String name, Object userData);
+
+    /**
+     * Extract the user-data provided when the given secondary class
+     * was created by the {@code makeSecondaryClass} factory.
+     * Returns {@code null} if it was not created by that factory.
+     */
+    /*non-public*/
+    @HotSpotIntrinsicCandidate
+    native Object getSecondaryUserData();
+
+    /**
      * Returns true if this {@code Class} object represents an annotation
      * type. Note that if this method returns true, {@link #isInterface()}
      * would also return true, as all annotation types are also interfaces.

From forax at univ-mlv.fr Wed Jun 21 15:56:47 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 21 Jun 2017 17:56:47 +0200 (CEST)
Subject: minutes Valhalla EG June 07, 2017
In-Reply-To: <4E7249A7-DC7B-497C-B79E-D1471A934847@oracle.com>
References: <4E7249A7-DC7B-497C-B79E-D1471A934847@oracle.com>
Message-ID: <610036238.1762413.1498060607631.JavaMail.zimbra@u-pem.fr>

John,
Do we agree that this API can also replace the constant pool patching done by unsafe.defineAnonymousClass, i.e. that the Object can be any live Object?
Rémi

> From: "John Rose"
> To: "Daniel Heidinga"
> Cc: valhalla-spec-experts at openjdk.java.net
> Sent: Wednesday, June 21, 2017 17:51:11
> Subject: Re: minutes Valhalla EG June 07, 2017

> On Jun 21, 2017, at 6:55 AM, Daniel Heidinga < Daniel_Heidinga at ca.ibm.com > wrote:
>> > AI: John - send out javadoc to EG
>> > derived class := Class.derivedClassFactory(Class mainClass, T userData, String name)
>> In the spirit of the usual "5 min before the meeting" ritual action item panic,
>> I'm trying to review the javadoc for this and can't seem to find it. Can it be
>> sent again?
> Apologies; here it is, 10 min before the meeting.
> diff --git a/src/java.base/share/classes/java/lang/Class.java b/src/java.base/share/classes/java/lang/Class.java
> --- a/src/java.base/share/classes/java/lang/Class.java
> +++ b/src/java.base/share/classes/java/lang/Class.java
> @@ -681,6 +681,65 @@
>      public native boolean isPrimitive();
>      /**
> +     * Determines if the specified {@code Class} object is the
> +     * primary representative of an underlying class file.
> +     * Array and primitive classes are never primaries.
> +     * Other reference type constants of the form {@code X.class}
> +     * are always primaries.
> +     * Value type mirrors are never primaries; their corresponding
> +     * box reference types are primaries.
> +     *
> +     * @return true if and only if this class is the primary
> +     * representative of its underlying class file
> +     */
> +    @HotSpotIntrinsicCandidate
> +    public native boolean isPrimary();
> +
> +    /**
> +     * Obtains the primary class corresponding to the specified
> +     * {@code Class} object, if this class is a secondary class
> +     * derived from a primary class
> +     * If this class object is a primary class, it returns the
> +     * same class object.
> +     * TBD:An array type returns the primary class
> +     * of its component type. A primitive type returns the
> +     * corresponding wrapper type. A value type returns the
> +     * primary class of its box type.
A specialized generic > + * returns the primary class of its template type. > + * OR, an non-total version:Primitive and array classes > + * do not have associated primary classes; they return > + * {@code null} for this query. > + * > + * @return the primary representative of the underlying class file > + */ > + @HotSpotIntrinsicCandidate > + public native Class getPrimaryClass(); > + > + /** > + * Creates a new non-primary class for the given primary. > + * This is an internal factory for non-primary classes. > + * The user is expected to perform relevant interning, > + * and manage the type of the user-data component. > + * @param primary the primary class for the new secondary > + * @param name arbitrary name string, to be the name of the new secondary > + * @param userData arbitrary reference to associate with the new secondary > + * @return a fresh secondary class > + * @throws IllegalArgumentException if the first argument is not a primary > class > + */ > + /*non-public*/ > + @HotSpotIntrinsicCandidate > + static native Class makeSecondaryClass(Class primary, String name, > Object userData); > + > + /** > + * Extract the user-data provided when the given secondary class > + * was created by the {@code makeSecondaryClass} factory. > + * Returns {@code null} if it was not created by that factory. > + */ > + /*non-public*/ > + @HotSpotIntrinsicCandidate > + native Object getSecondaryUserData(); > + > + /** > * Returns true if this {@code Class} object represents an annotation > * type. Note that if this method returns true, {@link #isInterface()} > * would also return true, as all annotation types are also interfaces. 
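(ed. note: a minimal sketch, not from the thread, of the contract the javadoc above describes. `Mirror` is a hypothetical stand-in for `java.lang.Class`; the proposed natives `isPrimary`, `getPrimaryClass`, `makeSecondaryClass` and `getSecondaryUserData` are only a proposal here, so this toy models their documented behavior for ordinary classes and ignores arrays, primitives and value types, whose behavior the javadoc still marks TBD.)

```java
import java.util.Objects;

// Hypothetical stand-in for java.lang.Class; it models only the proposed contract.
final class Mirror {
    private final String name;
    private final Mirror primary;   // null when this mirror is itself a primary
    private final Object userData;  // set only for factory-made secondaries

    private Mirror(String name, Mirror primary, Object userData) {
        this.name = Objects.requireNonNull(name);
        this.primary = primary;
        this.userData = userData;
    }

    // Models a class that is the primary representative of a class file.
    static Mirror primaryOf(String name) {
        return new Mirror(name, null, null);
    }

    boolean isPrimary() {
        return primary == null;
    }

    // A primary returns itself; a secondary returns the primary it was derived from.
    Mirror getPrimaryClass() {
        return isPrimary() ? this : primary;
    }

    // Returns a fresh secondary on every call; interning is left to the caller.
    static Mirror makeSecondaryClass(Mirror primary, String name, Object userData) {
        if (!primary.isPrimary()) {
            throw new IllegalArgumentException("not a primary class: " + primary.name);
        }
        return new Mirror(name, primary, userData);
    }

    // Null for any mirror that was not created by makeSecondaryClass.
    Object getSecondaryUserData() {
        return userData;
    }

    String getName() {
        return name;
    }
}

final class MirrorDemo {
    public static void main(String[] args) {
        Mirror foo = Mirror.primaryOf("Foo");
        Mirror derived = Mirror.makeSecondaryClass(foo, "Foo$Value", "some user data");
        System.out.println(derived.getName() + " is primary? " + derived.isPrimary());
        System.out.println("same primary? " + (derived.getPrimaryClass() == foo));
        System.out.println("user data: " + derived.getSecondaryUserData());
    }
}
```

Because the factory never interns, two calls with the same name yield two distinct mirrors, which matches the javadoc's statement that the user is expected to perform relevant interning.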
From john.r.rose at oracle.com Wed Jun 21 16:01:07 2017 From: john.r.rose at oracle.com (John Rose) Date: Wed, 21 Jun 2017 09:01:07 -0700 Subject: minutes Valhalla EG June 07, 2017 In-Reply-To: <610036238.1762413.1498060607631.JavaMail.zimbra@u-pem.fr> References: <4E7249A7-DC7B-497C-B79E-D1471A934847@oracle.com> <610036238.1762413.1498060607631.JavaMail.zimbra@u-pem.fr> Message-ID: On Jun 21, 2017, at 8:56 AM, Remi Forax wrote: > > John, > Do we agree that this API can also replace the constant pool patching done by unsafe.defineAnonymousClass, > i.e. that the Object can be any live Object? This seems close to possible, although I would imagine that a nest-injected class would usually be a primary class. That means getUserData would return non-null for a primary, which goes a little beyond what was envisioned in that patch. The userData value would be passed to the nest-injector function. -- John From john.r.rose at oracle.com Wed Jun 21 16:03:49 2017 From: john.r.rose at oracle.com (John Rose) Date: Wed, 21 Jun 2017 09:03:49 -0700 Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: <1347399309.1758320.1498059966359.JavaMail.zimbra@u-pem.fr> References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> <1347399309.1758320.1498059966359.JavaMail.zimbra@u-pem.fr> Message-ID: On Jun 21, 2017, at 8:46 AM, Remi Forax wrote: > > but i do not like with this proposal as is, i will explain why and how to fix it: > - condy is linked to a static final field but unlike invokedynamic which is a link from an invokedynamic instruction to a CONSTANT_InvokeDynamic_info, > there is no link from the static final field to the CONSTANT_ConstantDynamic_info. > Why not reuse the ConstantValue attribute [1] to reference the CONSTANT_ConstantDynamic_info instead (the constantvalue_index can be extended to allow a CONSTANT_ConstantDynamic). > > - condy if a 'dy' like indy, so it should do late late binding, i.e.
being initialized (run the bootstrap method) only the first time someone access to the static field exactly like with indy the bsm is called the first time you try to access the instruction. > > > In term of semantics, my proposal does not introduce an item in the constant pool which is resolved only by the virtue of being in the constant pool unlike any other items. If condy is linked to the ConstantValue of a field, the condy item is resolved when necessary as usual. With my ASM hat, i see how to implement it easily without having to surface the constant pool itself (at least until the items are pointed by the j.l.i.BootstrapCallInfo). Indeed, repurposing ConstantValue in the way you describe is an add-on to this proposal. I almost threw it in, but didn't want to muddy the basic proposal. In the basic proposal, condy is *not* linked to static finals. It only repurposes the concept of field names and field types (as if from Fieldref but not using Fieldref) but does not actually link to fields. -- John From paul.sandoz at oracle.com Wed Jun 21 18:13:41 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 21 Jun 2017 11:13:41 -0700 Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> Message-ID: <5972DE49-B047-4F47-AC82-A2BD2CF9DB0E@oracle.com> > On 21 Jun 2017, at 01:46, John Rose wrote: > > On Jun 19, 2017, at 6:28 PM, John Rose wrote: >> >> I have updated the javadoc portion of the spec for constant-dynamic. >> There are small but strategic changes to MethodType and MethodHandle, >> as well as new classes to represent the "pull mode" of constant resolution. >> The key changes relevant to the JVM are in package-info.html, >> and (of course) in the forthcoming JVM spec. changes. >> >> http://cr.openjdk.java.net/~jrose/jvm/specdiff-condy-2017-0619.zip >> > > The forthcoming has come forth.
Here is a semi-formatted and > color-annotated diff of the JVMS outlining proposed support for > dynamic constants, as well as some enhancements to the processing > of bootstrap methods: > > http://cr.openjdk.java.net/~jrose/jvm/condy-jvms-2017-0620.html > Nice, a more thorough set of changes. I like the trigger on arity. There is no dependency on the signature of the BSM (although as an implementation detail we can optimize). Trivially (perhaps as note?) a BSM is not limited to being solely referenced by a CONSTANT_ConstantDynamic_info or a CONSTANT_InvokeDynamic_info. Just like previously a BSM was not limited to being solely referenced by one kind of CONSTANT_InvokeDynamic_info. It's now much easier to share BSMs if that is one's desire. Paul. From paul.sandoz at oracle.com Wed Jun 21 18:24:52 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 21 Jun 2017 11:24:52 -0700 Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> <1347399309.1758320.1498059966359.JavaMail.zimbra@u-pem.fr> Message-ID: <91543D9D-8FA2-4366-9E18-AE52FB0197FE@oracle.com> > On 21 Jun 2017, at 09:03, John Rose wrote: > > On Jun 21, 2017, at 8:46 AM, Remi Forax > wrote: >> >> but i do not like with this proposal as is, i will explain why and how to fix it: >> - condy is linked to a static final field but unlike invokedynamic which is a link from an invokedynamic instruction to a CONSTANT_InvokeDynamic_info, >> there is no link from the static final field to the CONSTANT_ConstantDynamic_info. >> Why not reuse the ConstantValue attribute [1] to reference the CONSTANT_ConstantDynamic_info instead (the constantvalue_index can be extended to allow a CONSTANT_ConstantDynamic). >> >> - condy if a 'dy' like indy, so it should do late late binding, i.e.
being initialized (run the bootstrap method) only the first time someone access to the static field exactly like with indy the bsm is called the first time you try to access the instruction. >> >> >> In term of semantics, my proposal does not introduce an item in the constant pool which is resolved only by the virtue of being in the constant pool unlike any other items. If condy is linked to the ConstantValue of a field, the condy item is resolved when necessary as usual. With my ASM hat, i see how to implement it easily without having to surface the constant pool itself (at least until the items are pointed by the j.l.i.BootstrapCallInfo). > > Indeed, repurposing ConstantValue in the way you describe is an add-on to this proposal. Can we get away with changing all static final fields to be lazily initialized without some explicit opt-in? It would be nice but it might induce subtle changes in behaviour and expectations (especially for where exceptions may occur). Paul. > I almost threw it in, but didn't want to muddy the basic proposal. > In the basic proposal, condy is *not* linked to static finals. > It only repurposes the concept of field names and field types > (as if from Fieldref but not using Fieldref) but does not actually link to fields. > > --
John From forax at univ-mlv.fr Wed Jun 21 19:20:00 2017 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Wed, 21 Jun 2017 21:20:00 +0200 (CEST) Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: <91543D9D-8FA2-4366-9E18-AE52FB0197FE@oracle.com> References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> <1347399309.1758320.1498059966359.JavaMail.zimbra@u-pem.fr> <91543D9D-8FA2-4366-9E18-AE52FB0197FE@oracle.com> Message-ID: <1982125897.1804398.1498072800104.JavaMail.zimbra@u-pem.fr> > From: "Paul Sandoz" > To: "John Rose" > Cc: "Rémi Forax" , valhalla-spec-experts at openjdk.java.net > Sent: Wednesday, 21 June 2017 20:24:52 > Subject: Re: notes from Valhalla meeting 5/24/17 >> On 21 Jun 2017, at 09:03, John Rose < john.r.rose at oracle.com > wrote: >> On Jun 21, 2017, at 8:46 AM, Remi Forax < forax at univ-mlv.fr > wrote: >>> but i do not like with this proposal as is, i will explain why and how to fix >>> it: >>> - condy is linked to a static final field but unlike invokedynamic which is a >>> link from an invokedynamic instruction to a CONSTANT_InvokeDynamic_info, >>> there is no link from the static final field to the >>> CONSTANT_ConstantDynamic_info. >>> Why not reuse the ConstantValue attribute [1] to reference the >>> CONSTANT_ConstantDynamic_info instead (the constantvalue_index can be extended >>> to allow a CONSTANT_ConstantDynamic). >>> - condy if a 'dy' like indy, so it should do late late binding, i.e. being >>> initialized (run the bootstrap method) only the first time someone access to >>> the static field exactly like with indy the bsm is called the first time you >>> try to access the instruction. >>> In term of semantics, my proposal does not introduce an item in the constant >>> pool which is resolved only by the virtue of being in the constant pool unlike >>> any other items. If condy is linked to the ConstantValue of a field, the condy >>> item is resolved when necessary as usual.
With my ASM hat, i see how to >>> implement it easily without having to surface the constant pool itself (at >>> least until the items are pointed by the j.l.i.BootstrapCallInfo). >> Indeed, repurposing ConstantValue in the way you describe is an add-on to this >> proposal. > Can we get away with changing all static final fields to be lazily initialized > without some explicit opt-in? I think it's too late for Java (the language). As far as i remember, this is by default in Dart. > It would be nice but it might induce subtle changes in behaviour and > expectations (especially for where exceptions may occur). Also, the order of the side effects will be different and how you see non initialized field (when you have recursion) but introducing a new keyword like lazy or stable is possible. > Paul. Rémi >> I almost threw it in, but didn't want to muddy the basic proposal. >> In the basic proposal, condy is *not* linked to static finals. >> It only repurposes the concept of field names and field types >> (as if from Fieldref but not using Fieldref) but does not actually link to >> fields. >> -- John From forax at univ-mlv.fr Wed Jun 21 19:37:15 2017 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Wed, 21 Jun 2017 21:37:15 +0200 (CEST) Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> <1347399309.1758320.1498059966359.JavaMail.zimbra@u-pem.fr> Message-ID: <639463424.1806125.1498073835979.JavaMail.zimbra@u-pem.fr> > From: "John Rose" > To: "Rémi Forax" > Cc: "Valhalla Expert Group Observers" > , > valhalla-spec-experts at openjdk.java.net > Sent: Wednesday, 21 June 2017 18:03:49 > Subject: Re: notes from Valhalla meeting 5/24/17 [...] > It only repurposes the concept of field names and field types > (as if from Fieldref but not using Fieldref) but does not actually link to > fields. > --
John Ok, i do not understand exactly what you mean here, i will do a complete reading of the spec tomorrow so i hope things will be more clear for me. Rémi From paul.sandoz at oracle.com Wed Jun 21 20:45:06 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 21 Jun 2017 13:45:06 -0700 Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> Message-ID: > On 19 Jun 2017, at 18:28, John Rose wrote: > > On Jun 9, 2017, at 1:51 PM, John Rose wrote: >> >> We have talked about "condy" (a constant-pool friend for indy). >> Enclosed is the javadoc portion of a draft spec. >> The JVM spec. is not ready and will be sent separately. >> This javadoc reflects code checked into the condy branch of Amber. > > I have updated the javadoc portion of the spec for constant-dynamic. > There are small but strategic changes to MethodType and MethodHandle, > as well as new classes to represent the "pull mode" of constant resolution. > The key changes relevant to the JVM are in package-info.html, > and (of course) in the forthcoming JVM spec. changes. > > http://cr.openjdk.java.net/~jrose/jvm/specdiff-condy-2017-0619.zip > package-summary - "Bytecode may contain dynamic call sites equipped with equipped with bootstrap methods" s/equipped with equipped with/equipped with/ "This allows the bootstrap logic the ability to order the resolution of constants and catch linkage exceptions." And possibly still link a constant rather than re-throwing a linkage exception? "The second-to-last example assumes that all extra arguments are of type CONSTANT_String" s/CONSTANT_String/String BootstrapCallInfo - "This information include the method" s/include/includes Paul. > (I've given up sending attachments for now since the server > scrubs them.) > > --
John From paul.sandoz at oracle.com Wed Jun 21 20:59:47 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 21 Jun 2017 13:59:47 -0700 Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: <639463424.1806125.1498073835979.JavaMail.zimbra@u-pem.fr> References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> <1347399309.1758320.1498059966359.JavaMail.zimbra@u-pem.fr> <639463424.1806125.1498073835979.JavaMail.zimbra@u-pem.fr> Message-ID: <8003EA4B-894E-482D-A977-24B67BBA2AC5@oracle.com> > On 21 Jun 2017, at 12:37, forax at univ-mlv.fr wrote: > > > > From: "John Rose" > To: "Rémi Forax" > Cc: "Valhalla Expert Group Observers" , valhalla-spec-experts at openjdk.java.net > Sent: Wednesday, 21 June 2017 18:03:49 > Subject: Re: notes from Valhalla meeting 5/24/17 > > [...] > It only repurposes the concept of field names and field types > (as if from Fieldref but not using Fieldref) but does not actually link to fields. > > -- John > > Ok, i do not understand exactly what you mean here, > i will do a complete reading of the spec tomorrow so i hope things will be more clear for me. > Relevant bit: CONSTANT_ConstantDynamic_info { u1 tag; u2 bootstrap_method_attr_index; u2 name_and_type_index; } The value of the name_and_type_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_NameAndType_info structure (§4.4.6) representing a field name and field descriptor (§4.3.2). i.e. name and descriptor indexed by CONSTANT_NameAndType_info must conform to a field name and field descriptor respectively, but there is no such actual field. Hth, Paul.
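(ed. note: a minimal sketch, not from the thread, of reading the three items of the structure Paul quotes and of checking that a descriptor has the shape of a field descriptor. All class and method names are hypothetical. The tag value is deliberately not validated because the excerpt above does not state it, and the descriptor check is a simplified reading of JVMS §4.3.2: it ignores the character rules for class names, the array dimension limit, and any Q-descriptors discussed for MVT.)

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical holder for the items of CONSTANT_ConstantDynamic_info.
final class ConstantDynamicEntry {
    final int tag;
    final int bootstrapMethodAttrIndex;
    final int nameAndTypeIndex;

    ConstantDynamicEntry(int tag, int bootstrapMethodAttrIndex, int nameAndTypeIndex) {
        this.tag = tag;
        this.bootstrapMethodAttrIndex = bootstrapMethodAttrIndex;
        this.nameAndTypeIndex = nameAndTypeIndex;
    }

    // Class-file data is big-endian, which is what DataInputStream reads.
    static ConstantDynamicEntry read(DataInputStream in) throws IOException {
        int tag = in.readUnsignedByte();    // u1 tag
        int bsm = in.readUnsignedShort();   // u2 bootstrap_method_attr_index
        int nat = in.readUnsignedShort();   // u2 name_and_type_index
        return new ConstantDynamicEntry(tag, bsm, nat);
    }

    // Convenience wrapper for in-memory bytes.
    static ConstantDynamicEntry parse(byte[] raw) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Simplified shape check: optional '[' prefixes, then a base type or L<name>;
    static boolean isFieldDescriptor(String s) {
        int n = s.length();
        int i = 0;
        while (i < n && s.charAt(i) == '[') {
            i++;                            // skip array dimensions
        }
        if (i >= n) {
            return false;                   // empty, or brackets only
        }
        char c = s.charAt(i);
        if ("BCDFIJSZ".indexOf(c) >= 0) {
            return i == n - 1;              // a base type must be the last character
        }
        if (c == 'L') {
            int semi = s.indexOf(';', i);
            return semi == n - 1 && semi > i + 1;   // non-empty name, ';' ends the string
        }
        return false;                       // e.g. a method descriptor or 'V'
    }
}

final class ConstantDynamicDemo {
    public static void main(String[] args) {
        // Sample bytes only; the first byte is a placeholder tag.
        byte[] raw = {0x11, 0x00, 0x03, 0x00, 0x09};
        ConstantDynamicEntry e = ConstantDynamicEntry.parse(raw);
        System.out.println(e.bootstrapMethodAttrIndex + " " + e.nameAndTypeIndex);
        System.out.println(ConstantDynamicEntry.isFieldDescriptor("Ljava/lang/String;"));
        System.out.println(ConstantDynamicEntry.isFieldDescriptor("(I)V"));
    }
}
```

The descriptor check takes only the string, which mirrors the point made in the thread that a CONSTANT_NameAndType_info can be validated without asking who refers to it.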
From john.r.rose at oracle.com Wed Jun 21 23:40:18 2017 From: john.r.rose at oracle.com (John Rose) Date: Wed, 21 Jun 2017 16:40:18 -0700 Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> Message-ID: Thanks; all fixed. On Jun 21, 2017, at 1:45 PM, Paul Sandoz wrote: > > > "This allows the bootstrap logic the ability to order the resolution of constants and catch linkage exceptions." > > And possibly still link a constant rather than re-throwing a linkage exception? > No, there's no way to force the JVM to change a linkage decision once made. From john.r.rose at oracle.com Wed Jun 21 23:41:29 2017 From: john.r.rose at oracle.com (John Rose) Date: Wed, 21 Jun 2017 16:41:29 -0700 Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: <8003EA4B-894E-482D-A977-24B67BBA2AC5@oracle.com> References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> <1347399309.1758320.1498059966359.JavaMail.zimbra@u-pem.fr> <639463424.1806125.1498073835979.JavaMail.zimbra@u-pem.fr> <8003EA4B-894E-482D-A977-24B67BBA2AC5@oracle.com> Message-ID: <2496C7A2-64D2-4B54-90E7-EABEDFA5F837@oracle.com> On Jun 21, 2017, at 1:59 PM, Paul Sandoz wrote: > > Relevant bit: > > CONSTANT_ConstantDynamic_info { > u1 tag; > u2 bootstrap_method_attr_index; > u2 name_and_type_index; > } > > The value of the name_and_type_index item must be a valid index into the constant_pool table. The constant_pool entry at > that index must be a CONSTANT_NameAndType_info structure (§4.4.6) representing a field name and field descriptor (§4.3.2). > > i.e. name and descriptor indexed by CONSTANT_NameAndType_info must conform to a field name and field descriptor respectively, but there is no such actual field. We could have slightly different rules for constant names, but that would complicate the structural constraints on C_NAT, with no particular benefit.
As it is, you can validate a C_NAT without asking who is using it. From paul.sandoz at oracle.com Wed Jun 21 23:44:44 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 21 Jun 2017 16:44:44 -0700 Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> Message-ID: > On 21 Jun 2017, at 16:40, John Rose wrote: > > Thanks; all fixed. > > On Jun 21, 2017, at 1:45 PM, Paul Sandoz wrote: >> >> >> "This allows the bootstrap logic the ability to order the resolution of constants and catch linkage exceptions." >> >> And possibly still link a constant rather than re-throwing a linkage exception? >> > > No, there's no way to force the JVM to change a linkage decision once made. > A BSM called to resolve a constant can choose when pulling its static args to catch and swallow linkage exceptions, keep on trucking, and return a value. Should this be allowed? Paul. From john.r.rose at oracle.com Wed Jun 21 23:48:04 2017 From: john.r.rose at oracle.com (John Rose) Date: Wed, 21 Jun 2017 16:48:04 -0700 Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: <608D7323-20B8-4871-896D-F22C5954A26B@oracle.com> References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> <1347399309.1758320.1498059966359.JavaMail.zimbra@u-pem.fr> <91543D9D-8FA2-4366-9E18-AE52FB0197FE@oracle.com> <608D7323-20B8-4871-896D-F22C5954A26B@oracle.com> Message-ID: On Jun 21, 2017, at 12:03 PM, Frederic Parain wrote: > >> >> Can we get away with changing all static final fields to be lazily initialized without some explicit opt-in? It would be nice but it might induce subtle changes in behaviour and expectations (especially for where exceptions may occur). > > Whatever solution is chosen, it will induce changes in behavior and new places where exceptions > can be thrown.
Lazy initialized static fields allow us to support value types in static fields without > having to modify the class loading/class initialization logic. Today if a class A has a static field > of type class B, then class A can be initialized without having to load or initialize class B. > If the rule is changed for value types, meaning that if class A has a static field of type value > type V, then V has to be loaded and initialized before A can be initialized. And of course, > loading and initialization of V can fail, leading to a new kind of initialization failure for A. > > The class loading/class initialization is way more complex than field access (at least in the VM), > so we have to balance pros and cons of each solution. What I'm thinking here is that the JVM tracks the first time any given CONSTANT_Fieldref is resolved, whether or not the underlying class is initialized. This means the JVM already has a "hook" per field reference to take a slow path the first time the field is touched. As far as opt-in is concerned, I think yes the language has to supply an opt-in. Per-field lazy initialization is potentially very useful, but a cost is that users have to look at their code and possibly refactor in order to make use of it. (This is a theme with Java's initialization model: We ran out of automagic ways to transparently speed up start-up years ago; users have to step up but first we have to provide them with better tools and paradigms.) At the JVM level (regardless of the language) I think the right answer is to run a class's <clinit> before any static reference (including a lazy one). Then, if the field itself is lazy, run its initializer. The effect of this is that blocks of statics which cannot be decoupled from each other stay normal, and get initialized before any lazy fields. Then the lazy fields are initialized one at a time.
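(ed. note: a minimal sketch, not from the thread. The closest approximation available in today's Java to a per-field lazy static is the initialization-on-demand holder idiom, where the lazy field lives in its own nested class and that class is initialized on first access. The names `Config`, `ExpensiveHolder` and `LOG` are made up for illustration, and nothing here uses the JVM support being proposed; it only shows the ordering described above, with the enclosing class's static initializer running before the lazy field's.)

```java
import java.util.ArrayList;
import java.util.List;

final class Config {
    // Records the order in which the static initializers run.
    static final List<String> LOG = new ArrayList<>();

    // Assigned in a static block so that it is not a compile-time constant;
    // reading it therefore triggers initialization of Config.
    static final String EAGER;
    static {
        LOG.add("Config.<clinit>");
        EAGER = "eager";
    }

    // Stands in for a "lazy" static field: the holder class is initialized
    // only when VALUE is first read, not when Config is initialized.
    private static final class ExpensiveHolder {
        static final String VALUE;
        static {
            LOG.add("ExpensiveHolder.<clinit>");
            VALUE = "expensive";
        }
    }

    static String expensive() {
        return ExpensiveHolder.VALUE;
    }
}

final class LazyStaticDemo {
    public static void main(String[] args) {
        // Touching the eager field initializes Config but not the holder.
        System.out.println(Config.EAGER + " " + Config.LOG);
        // The first call initializes the holder; later calls reuse the value.
        System.out.println(Config.expensive() + " " + Config.LOG);
    }
}
```

Calling `Config.expensive()` before anything else still initializes `Config` first, because invoking a static method of a class triggers that class's initialization before the holder is touched.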
The above mentioned hook would work "as if" each lazy field were isolated in its own nested class, inside the nominal field holder class. That's not hard to implement, at all. And it's (probably) convenient to users, which is why we're considering it. The AOT people found that the practice of initializing *everything* if *anything* is touched made it hard to pick and choose among initializer activities, with the result that it was hard to selectively "shake out" only the initializer activities relevant to a particular application. We can do better. -- John From john.r.rose at oracle.com Wed Jun 21 23:49:28 2017 From: john.r.rose at oracle.com (John Rose) Date: Wed, 21 Jun 2017 16:49:28 -0700 Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> Message-ID: <08D4AC8E-5F1E-4F83-8DCA-B255D85C6D73@oracle.com> On Jun 21, 2017, at 4:44 PM, Paul Sandoz wrote: > > A BSM called to resolve a constant can choose when pulling its static args to catch and swallow linkage exceptions, keep on trucking, and return a value. Should this be allowed? Absolutely. The BSM might have an API-specific fallback. I'm also thinking that we could add a query for the original symbolic reference, which the BSM could look at and perhaps resolve "by hand". This is similar in spirit to the original "invokedynamic" proposals as a form of "messageNotUnderstood". --
John From john.r.rose at oracle.com Wed Jun 21 23:53:19 2017 From: john.r.rose at oracle.com (John Rose) Date: Wed, 21 Jun 2017 16:53:19 -0700 Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> Message-ID: <0175EE3F-3C56-4A8A-869C-73A26AF4CB68@oracle.com> On Jun 21, 2017, at 4:44 PM, Paul Sandoz wrote: > > A BSM called to resolve a constant can choose when pulling its static args to catch and swallow linkage exceptions, keep on trucking, and return a value. Should this be allowed? Suppose you have a pattern-matching switch with 100 cases, each with 10 method references in it. With a BSCI, you can resolve the 100 cases individually as control flows through them (or all up front, either way) and if any of the 1000 methods fails to link, the error can be held off until the relevant case expression is actually called upon. This is more Java-like than insisting that all errors are pushed up to the top of a complex structure that might contain errors. Another use case: Use a long series of CP expressions, (accessed via a BSCI.asList accessor) as fodder for an interpreter ("token codes") that executes a DSL. Again, you don't want a linkage error at position 100 to prevent execution of token 1, because you might never get to 100. --
John From paul.sandoz at oracle.com Wed Jun 21 23:59:59 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 21 Jun 2017 16:59:59 -0700 Subject: notes from Valhalla meeting 5/24/17 In-Reply-To: <08D4AC8E-5F1E-4F83-8DCA-B255D85C6D73@oracle.com> References: <5FFBDEBF-ECAB-41AC-A9D9-ADC712446522@oracle.com> <925C62C6-26FB-4795-BF84-F8F62B2933C2@oracle.com> <08D4AC8E-5F1E-4F83-8DCA-B255D85C6D73@oracle.com> Message-ID: > On 21 Jun 2017, at 16:49, John Rose wrote: > > On Jun 21, 2017, at 4:44 PM, Paul Sandoz > wrote: >> >> A BSM called to resolve a constant can choose when pulling its static args to catch and swallow linkage exceptions, keep on trucking, and return a value. Should this be allowed? > > Absolutely. The BSM might have an API-specific fallback. > Thanks, i thought so. The returned value could capture the ConstantGroup and do more resolution lazily on demand. At some point if the value is tickled in the right way a linkage error might result. Paul. > I'm also thinking that we could add a query for the original symbolic reference, > which the BSM could look at and perhaps resolve "by hand". > This is similar in spirit to the original "invokedynamic" proposals > as a form of "messageNotUnderstood". > > --
John
From karen.kinnear at oracle.com Fri Jun 23 20:33:05 2017
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Fri, 23 Jun 2017 16:33:05 -0400
Subject: Valhalla EG minutes 6/21/17
Message-ID:

attendees: Remi, Bjorn, Dan H, Dan S, John, Maurizio, Frederic, Lois, Karen

AIs:
All: review Dan Smith's proposals
MVT JVMS: Specification for Value Classes: http://cr.openjdk.java.net/~dlsmith/values.html - initial proposal
*** let's pin this down ASAP so we - Remi for ASM, IBM and Oracle can deliver early binaries for early adopters to try
incremental proposals for post early access (or maybe post - MVT TBD)
Direct Value Class: Specification for Value Classes with Explicit Declarations: http://cr.openjdk.java.net/~dlsmith/values-declaration.html
Specification for Value Classes with CONSTANT_ClassType: http://cr.openjdk.java.net/~dlsmith/values-classtype.html#values-classtype-4.1
All: review John Rose's proposals:
ConstantDynamic JVMS changes: http://cr.openjdk.java.net/~jrose/jvm/condy-jvms-2017-0620.html note: this is orthogonal to MVT
java/lang/Class.java makeSecondaryClass: http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2017-June/000286.html post EA
Hotspot and IBM: what could be available for early access for early adopters to experiment with?

revisit early access timing - if we were to set expectations that
- model would be to deliver binaries and periodic binary updates to match the source builds. This is not a one shot delivery.
- limit functionality (platforms, reflection behavior unspecified, no JVMTI, ...)
- performance improvements are not yet there, expect to come in incrementally
- maybe verifier isn't ready?
- stability issues

Potential early adopters: Ian Graves? Doug Lea? Others?
So - would you be willing to start experimenting with minimal value types even with the restrictions above?
We would find it helpful to get your feedback on
- the basic conceptual model
- usage model
- use cases - so we can optimize what you care about
- required features we missed - so that when we ship this experimentally it is much closer to what you need

Model of usage:
1) Value-capable-class: created in java with annotation, javac generates a regular classfile with no new constant pool entries or bytecodes
2) MethodHandles and ValueType APIs - this is the default model of usage
3) generated byte codes - you can generate your own byte codes to work on value types - at this point you can't generate your own value type class this way (until we get to Direct Value Class support)

Constant Pool changes for Early Access:
1) proposal for a value class is:
part I: CONSTANT_Class_info: UTF8: ";Q;" // i.e. this would be a descriptor using a UTF8 string to speed up implementation
part II: hotspot requests that we use a different name for the value type than for the value capable class
implementation request - today we need unique strings to identify unique runtime types and this is baked in multiple places
So propose: CONSTANT_Class_info: UTF8:"QFoo$
Where is the name exposed? in the constant pool when you generate your own bytecodes - in descriptors and class names

Dan S: Longer term: will declare a value class directly. The box first is a temporary approach in which we derive the value class. Propose we not spend a lot of time here blurring the difference and needing to hide the derived value class.
Bjorn: single representation for both value capable class and derived value class
I think you said "treat ;Q" as the name for the derived value class rather than as an escape character (feel free to correct my notes)
Dan S. Longer term: we do want one declaration and two views based on the value class.
John: prefers Bjorn's approach
Bjorn: could hack using the different name as two views of the same thing e.g. vbox/vunbox would need to swap names (ed.
note - please let us know if this is doable without heroic efforts)
John: for hotspot - part of condy refactoring did part of the loaded class cache lookup changes that could be used here.
Dan S: class loading in the proposed JVMS: if you see $Value
1) first derive the VCC name and see if already resolved
2) if not - load the VCC, check properties and derive
(ed. note - if see VCC - lazily derive derived value class on touch)
----
John sent out a proposed API for a secondary mirror (see email link above)
note: not for EA
Dan H: if ask for the same name for the secondary mirror what happens?
John: only libraries can use the proposed API and library is responsible for interning the name - not the VM.
Remi: need "nest" automatically for secondary mirror
John: yes eventually
Karen: not EA - need to check if time during MVT
Remi: dynamic language implementor will want the same name - e.g. to get to shared static methods
today - because you can't re-open a class folks generate ancillary classes to add static methods later
note: for printing purposes it would be helpful to have a different way to represent the name
John: model on primitive type vs. wrapper type

VWithfield - propose for MVT - allow package private access - since there are no methods on the derived value class and the value capable class can't have any methods with vbytecodes since generated by javac - plan to make private when we add factory methods to value classes with a compiler (and we have nest support)

Discussion of work needed to get to early access from various parties. See question above for early adopters on potential restrictions to get this to you sooner.
Teams need to re-assess timing assuming we want to make this available before we cross all the t's and dot the i's (you just have to recognize this goes against the grain for any virtual machine engineer), but we do appreciate that our first adopters would like to get this this summer while they have time to experiment and have shown lots of willingness to work with us)

Good news is: With the current JVMS (let's get that reviewed and stamped), Remi is looking at modifying ASM so folks will find it easier to generate byte codes. Many thanks!
ASM needs:
1) new opcodes and overloaded opcodes
2) descriptor support
note: this is independent of condy
Maurizio - we would like to use that ASM internally whenever it is ready - that would make the MethodHandle API able to take advantage of existing optimizations for references that we haven't done yet
So - goal is to have a binary snapshot available ASAP. Maurizio suggested we look at the JVMLS workshop time separately - need to discuss that next meeting.
----
Exposure of java/lang/__Value?
Hotspot uses this today internally for MethodHandle LambdaForms - generating e.g. vreturn for __Value (which is derived value class top type). This is internal implementation magic - we pretend this is a marker class.
Note: instanceof and checkcast do NOT work with value types in MVT.
John: Longer-term: exploring QObject equivalent and a UObject which is at least a reference or value type
When we support interfaces and generics for value types we will need a user story users can trust

Concern: ASM verification
John proposed: use invokeBasic model - wormhole from untyped to typed which is ignored by the verifier

thanks,
Karen

From forax at univ-mlv.fr Sat Jun 24 13:27:19 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 24 Jun 2017 15:27:19 +0200 (CEST)
Subject: Integration of ASM 7 into the MVT
Message-ID: <1816425461.3647817.1498310839354.JavaMail.zimbra@u-pem.fr>

Hi all,
re-reading the spec,
  http://cr.openjdk.java.net/~dlsmith/values.html
I noticed that the spec seems to be forked from the JVMS 8 and not the JVMS 9.

I'm planning to create an alpha release of ASM 7 by forking ASM 6, which already implements the JVMS 9 (from the Java Platform Module Spec). I hope this is not an issue.

Thinking a little more about that, there is an issue with integrating ASM 7 into the valhalla repository: currently the code inside the JDK relies on ASM 5.1 and not ASM 6. The difference between ASM 6 and ASM 5 is that in ASM 5 the attribute "Module" is seen as an unknown attribute, so ASM asks the user code to provide a description of the attribute; in ASM 6, ASM recognizes the attribute "Module" and redirects all the info to a specific ModuleVisitor. So before ASM 7 can be used with the new v-opcodes inside the JDK, I think the JDK sources have to be updated to use ASM 6 first.

regards,
Rémi

From john.r.rose at oracle.com Sat Jun 24 23:52:04 2017
From: john.r.rose at oracle.com (John Rose)
Date: Sat, 24 Jun 2017 16:52:04 -0700
Subject: constant-dynamic specification, updated
Message-ID:

I have updated the javadoc API that is relevant to the proposed JVM features for dynamic constants and more powerful bootstrap methods.
  http://cr.openjdk.java.net/~jrose/jvm/condy-jvms-2017-0620.html

Here is a rough draft of the corresponding JVMS changes:
  http://cr.openjdk.java.net/~jrose/jvm/specdiff-condy-2017-0624.zip

Please enjoy and comment.

- John

From john.r.rose at oracle.com Mon Jun 26 02:17:47 2017
From: john.r.rose at oracle.com (John Rose)
Date: Sun, 25 Jun 2017 19:17:47 -0700
Subject: class, type, instance, object, value
In-Reply-To:
References:
Message-ID: <1F516BE7-30D2-47AF-9B83-3F8AA0A00FEE@oracle.com>

So, I'm writing more and more documentation that discusses objects and primitives while bringing values into the mix.

What seems right to me is that we allow the terms "class", "type", and "instance" to symmetrically cover both legacy object types and new value types. We should continue to use the word "value" but be careful about distinguishing its overloadings, especially its role as an absolute noun vs. its role as an adjective. We should tolerate asymmetries that arise from the reference vs. value distinction, and from box types which arise from value classes.

Summary:
  Classes = Object Classes <+> Value Classes
  Instances = Object Instances <+> Value Instances
  Object Instances = instances of Object Classes <+> boxes of Value Instances
  Reference Values = Object Instances <+> null
  Values (noun) = Reference Values <+> Value Instances <+> Primitives
  (where <+> denotes disjoint union)

Details:
- A "class" is at root metadata describing a type or implementation.
  (It has API surface and/or implementation: super types, methods, fields, etc.)
- An "instance" is derived from a class and/or conforms to that class's API.
- An instance of an "object class" (or "object type") is an "object instance" (or just "object").
- An instance of a "value class" (or "value type") is a "value instance" (or just "value" if context allows).
- When clarity is at risk, we can call a value class or value instance a "non-object class" or "non-object value".
- Because object instances are referred to by reference, a variable bound to one is a "reference".
  (References can be to object instances, to boxes of value instances, or to the unique reference null.)
- A reference can also take "null" ("the null reference") as a value.
- References, primitives, and value instances are all "values" since they are passed by value.
- An instance of a "value class" is a "value instance" or (when clarity is not at risk) just a "value".
- Because value instances are referred to "by value", a variable bound to one is just a "value".
  (When clarity is at risk, such a variable can be called a "pure value" or "non-reference".)

Ambiguity:
- The term "value" used as a noun can refer to the contents of a variable: reference, primitive, or value instance.
- The term "value" used as an adjective distinguishes a class, type, or instance from the "object" version.
- The term "value" can abbreviate "value instance"; context must make this usage unambiguous.
- Thus, the term "value" must always be used in a context which resolves its ambiguity.
  (We could coin a new term to avoid ambiguity, but the meaning of "value" is perfect, so let's keep it.)
- Sometimes when we say "value type" we really mean "non-object type", and expect primitives to be included.
- As part of fit-and-finish of Value Types we will give the primitives a comfortable seat at the table.
  (Perhaps we can cleverly ret-con primitive types as value types, and their wrappers as boxes thereof.)

Boxing, buffering, identity:
- A value can be "boxed" into an object. Such an object can be "unboxed" back into its value.
- Boxed values are true objects, with object type.
- The class of a boxed value is the value class. (Thus each value class derives at least two types.)
- A value (of any sort) does not have identity; only the object instance under a non-null reference does.
- A "boxed value" is an object and has identity, since it is an object instance.
- If an implementation uses pointer indirection to access a value, we say it is stored in a "buffer".
  (This avoids confusion, since boxes are objects but buffers are not necessarily objects.)
- Buffers are invisible to the user, except perhaps via performance effects, or trusted APIs like Unsafe.
- Buffers can be on the stack, the heap, or anywhere else in memory.
- Boxes can secretly serve as buffers.

False friends:
- A "java.lang.Class" is usually a reference to class metadata, but not necessarily unique. There's wiggle room here for class-for-the-box vs. class-for-the-value, and int.class. We don't allow java.lang.Class to constrain other uses of the term "class".
- When clarity is at risk, we can say "class mirror" rather than just "Class".
- Similar point for "CONSTANT_Class" in the constant pool schema. Relatively few folks are conscious of this term anyway.
- "Object-oriented" programming usually refers to some combination of classes with reference-based polymorphism. Value types are object oriented, even though their reference-based polymorphism is limited to interfaces. Also they are squarely based on classes.

Glossary:
value type: a type which may be used without an accompanying reference (i.e., no intrinsic reference identity or aliasing)
value class: a code entity which defines a value type
value instance: a possible value (at runtime) of a variable of value type, derived from a value class
value field: (ambig.) field whose type is a value type (in any kind of class) OR a field in a value class (of any type)
value parameter: (ambig.) a parameter whose type is a value type OR a parameter, with emphasis on by-value transmission
value: (ambig.)
a reference, value instance, or primitive OR context-dependent ellipsis for value type/class/instance
object type: a type whose instances are accessed by reference (i.e., with reference identity and aliasing)
object class: a code entity which defines an object type
object instance: a possible value (at runtime) of a non-null variable of reference type
object field/parameter: (ambig., see above)
object: (ambig.) a reference to an object instance OR context-dependent ellipsis for value type/class/instance
box type: an object type derived from a value class
box instance: a possible value (at runtime) of a non-null variable of box type
box: (ambig., see above)
instance/class/type: (ambig., see above)

From john.r.rose at oracle.com Mon Jun 26 05:07:57 2017
From: john.r.rose at oracle.com (John Rose)
Date: Sun, 25 Jun 2017 22:07:57 -0700
Subject: class, type, instance, object, value
In-Reply-To: <1F516BE7-30D2-47AF-9B83-3F8AA0A00FEE@oracle.com>
References: <1F516BE7-30D2-47AF-9B83-3F8AA0A00FEE@oracle.com>
Message-ID: <34FC5F36-6860-41B1-BD6D-E40E9B165AA7@oracle.com>

On Jun 25, 2017, at 7:17 PM, John Rose wrote:
>
> object: (ambig.) a reference to an object instance OR context-dependent ellipsis for value type/class/instance

s/ellipsis for value/ellipsis for object/

From john.r.rose at oracle.com Mon Jun 26 18:41:49 2017
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 26 Jun 2017 11:41:49 -0700
Subject: class, type, instance, object, value
In-Reply-To:
References: <1F516BE7-30D2-47AF-9B83-3F8AA0A00FEE@oracle.com>
Message-ID:

On Jun 26, 2017, at 5:16 AM, Bjorn B Vardal wrote:
>
>> > value field: (ambig.) field whose type is a value type (in any kind of class) OR a field in a value class (of any type)
>
> If you want to resolve this ambiguity, I've been referring to the former as a "value typed field" and the latter as a "value field".
>

Yes, that can work where the context is strong enough to keep the reader alert.
But a single letter 'd' is a slender hook to hang your meaning on, and its *absence* is even more delicate. Unless the context makes it very clear, I'd want to say "value class field", or "field in a value" (ellipsis for "value class") instead of "value field" as you suggest.

(Remember, classes define fields, and then they show up in a type's API surface. So you can usually clarify "field of some type" to "field in some class" or "field of that type's class". And on the other hand "field with that type" or even "field of that type". The word "of" is treacherous here, and "in" and "with" are more reliable.)

In some cases you have to spit out even more words to be safe: "field within value type", "field that is a member of a value type", "field declared as a value type", "field whose type is a value type", etc., etc. Some of those circumlocutions benefit from ellipsis (which is "value" standing for "value type" with the "type" clear from context).

Note that Java is already full of small ambiguities like this: "Interface field", "inner class field", "wrapper type field".

Perhaps we could lean harder on the class-vs-type distinction: "Field of an interface type" is a "field typed as an interface (type)" whereas "Field of an interface class" is a "field declared in a class which defines an interface". (But we don't say "interface class" at this point, and it would be hard to get folks to accept it, even though it would clarify existing ambiguities about "class" meaning "class or interface or enum" in many places.)

John

From paul.sandoz at oracle.com Mon Jun 26 18:52:15 2017
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Mon, 26 Jun 2017 11:52:15 -0700
Subject: Valhalla EG minutes 6/21/17
In-Reply-To:
References:
Message-ID: <7CD0F59D-4376-4340-98A4-ED186630E114@oracle.com>

> On 23 Jun 2017, at 13:33, Karen Kinnear wrote:
> VWithfield - propose for MVT - allow package private access - since there are no methods on the derived value class
> and the value capable class can't have any methods with vbytecodes since generated by javac
> - plan to make private when we add factory methods to value classes with a compiler (and we have nest support)
>

I am unsure if it's necessary for MVT purposes to dial back the accessibility then dial it up again later on. ValueType.findWither can be used in conjunction with MethodHandles.privateLookupIn. It's a little odd but works. What am I missing?

Paul.

From dl at cs.oswego.edu Sun Jun 25 22:08:37 2017
From: dl at cs.oswego.edu (Doug Lea)
Date: Sun, 25 Jun 2017 18:08:37 -0400
Subject: Valhalla EG minutes 6/21/17
In-Reply-To:
References:
Message-ID: <4bce389e-3342-bcfa-0174-140ef4313134@cs.oswego.edu>

I missed this meeting (out at PLDI/ISMM/ECOOP) but ...

On 06/23/2017 04:33 PM, Karen Kinnear wrote:
>
> Hotspot and IBM:
> what could be available for early access for early adopters to
> experiment with?
> revisit early access timing - if we were to set expectations that
> - model would be to deliver binaries and periodic binary updates to
> match the source builds. This is not a one shot delivery.
> - limit functionality (platforms, reflection behavior unspecified,
> no JVMTI, ...)
> - performance improvements are not yet there, expect to come in
> incrementally
> - maybe verifier isn't ready?
> - stability issues
>
> *Potential early adopters: Ian Graves? Doug Lea? Others?*
> * So - would you be willing to start experimenting with minimal value
> types even with the restrictions above?*
>

Yes.
This should be enough to try out some useful experiments. At least if "not there yet" doesn't mean "dreadful" performance. For example, trying out techniques for parallel sorting of values seems possible, and is likely to be informative about other possible concurrent/parallel library improvements.

-Doug
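
(ed. sketch, not part of Doug's message) As a rough idea of the baseline such a sorting experiment might start from, the snippet below sorts an array of small immutable objects with Arrays.parallelSort from standard Java. It is only an assumption about the shape of the experiment: PointStandIn is a hypothetical ordinary final class standing in for a value type, and nothing here uses the MVT APIs or bytecodes.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical stand-in for a value type: an ordinary immutable reference class.
final class PointStandIn {
    final int x;
    final int y;

    PointStandIn(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class ParallelSortSketch {
    // Order by x, then by y, using only field contents (no reliance on identity).
    static final Comparator<PointStandIn> BY_X_THEN_Y =
        Comparator.comparingInt((PointStandIn p) -> p.x).thenComparingInt(p -> p.y);

    // Returns a sorted copy; the input array is left untouched.
    static PointStandIn[] sorted(PointStandIn[] input) {
        PointStandIn[] copy = input.clone();
        Arrays.parallelSort(copy, BY_X_THEN_Y);
        return copy;
    }
}
```

An MVT version of the same experiment would replace the reference array with whatever value-typed array representation the early access binary exposes, and compare timings against this baseline.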