From karen.kinnear at oracle.com Wed Sep 12 22:22:09 2018
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 12 Sep 2018 18:22:09 -0400
Subject: Valhalla EG meeting notes Sep 12 2018
Message-ID: <37F57FB5-B197-450E-868C-A3C564CA1B5E@oracle.com>

Attendees: Remi, Brian, Tobi, Dan H, Dan S, Frederic, Karen

AI: Karen - future topic: lazy static field initialization

Corrections welcome. Value Type Nullability offsite next week.

Agenda:
Brian anti-agenda: let's focus on the near term, e.g. not cover primitives or value-based-class migration
goal: what can we provide in LW10?
worst case for VT: support arrays, support erased generics with the equivalent of boxing - this would be useful
what is stopping us?
Remi: do we need value types for pattern matching?
Brian: value types would make pattern matching much easier - simulate method returns with multiple values, currently boxed on the heap
if we could provide value types, even if not fully optimized, that would be ok
other Amber features - would like to try as value types
Karen: if erased generics performance was the same as for Objects, would that be good enough long term?
Brian: Not good enough long term.
Brian: We need:
1. nullable values
2. user story
the user abstraction for primitives was boxing - what do we tell the users about the model?
to channel Kevin - a two-part type system was bad enough, do we have to have a three-part type system?
consider box: value class -> nullable value class
believe this gives us maneuvering room for VBC migration later
Karen: Would it make sense to consider a primitive proposal that converts primitives to null-free value types (rather than the current nullable Object subtype)?
Brian: Worth considering. Need to resolve issues with auto-boxing: today Object o = 3 auto-boxes to j.l.Integer.
Remi: Glad to have the language-level discussion; we have been avoiding those and we need this.
Brian: there are couplings between the language and vm, e.g. MethodHandle.asType conversions - match the language behavior, knows about boxing int->Integer
Does Point also need a box?
Frederic: what does the box provide today?
Brian: For primitives, the box supports:
1. store as an Object subtype
2. nullability (works with erased generics)
3. use of a* bytecodes, e.g. if_acmp_eq/ne
4. sync/wait - note: this is "accidental identity", may be ok to give up
boxes currently provide stable identities
Frederic: In LW1 a value type provides everything here except identity
Brian: including null enforcement
Frederic: Clarify: in LW1 value types are nullable. Period. Containers can declare that they can contain null or not.
value arrays are all non-nullable
value types in LW1 are 1. subtypes of Object, 2. support nullability and 3. support a* bytecodes
Brian: brought up a VBC example
Frederic: We want the language to support the distinction between nullable and null-free types.
Remi: Only need this for type arguments, not for all types - only for erased generics
Brian: also need this for arrays
Frederic: yes for arrays; it would be nice to have for all reference types, sufficient to support for value types now
Brian: Sufficient, yes. Is this the minimal solution, i.e. least invasive? There is a huge follow-on cost.
consider the box model
Remi: a nullable value type is a box of the null-free one; provides vm semantics
Brian: can teach 'asType' the transform
Remi: user-level syntax - TBD.
Brian: agreed.
Dan H: How much can we shrink the scope of the problem?
e.g. VBC migration - can we give up? forever?
e.g. erased generics - can we constrain the problem?
Remi: remove VBC from the equation.
Dan H: Brian said earlier it was ok to not solve VBC, but he continues to bring it up.
Erased generics issues: 1. null and 2. backing arrays
Brian: could decide values do not interoperate with generics at all, but that would not give users the full benefit of value types
Believe that users understand the concept of a box as a related class, and it could help with VBC migration
Remi: Primitives are in the same class of problems, eventually. If we do NOT see primitive types as value types, reified generics will be complex.
Karen: primitives with a conversion to value types to use with reified generics. Not always value types - I would expect to get the performance we have today.
Frederic: LW1 uses NO boxes today. The JIT could not optimize boxes; can we consider a model without boxes?
Brian: MVT, L&Q signatures were messy with boxes. With LWorld all those problems go away.
Frederic: Question of the number of types in the vm.
Remi: Two types at the language level and 1 in the VM.
Karen: Goal of this exercise is:
1) user model requirements to support erased generics - requires null and null-free references to value types
2) JIT optimizations in general for value types (not for erased generics, but in general code and in future in reified generics) - depend on null-free guarantees.
Our goal: minimal user model disruption for maximal JIT optimization.
The way I read your email, Remi - I thought you had user model disruption, but no information passed to the JIT, so no optimization benefit.
Remi: If inlining, the JIT has enough information.
Frederic: Actually, with field and method signatures it makes a huge difference in potential JIT optimizations.
Remi: Erased generics should have ok performance
Frederic: We are talking about performance without generics - we want full optimization there.
Remi: what happens in LW1 if you send null to a value type generic parameter?
Frederic: LW1 supports nullable value types. We want to guarantee/enforce the null-free vs. nullable distinction in the vm.
Remi: there are two kinds of entry points in the JIT'd code
Frederic: 2: i2c, which does null checks and if null calls the interpreter; c2c, which disallows nulls.
editor's note: did you get the implication - if we see a null, you are stuck in the interpreter today, because we all rely on dynamic checks.
Remi: For the vm, if you have a real nullable VT the JIT can optimize/deopt/reopt
Frederic: this is brittle, lots of work, and uncertain
Remi: The VM does not want nullable value types
Frederic: The VM wants to be able to distinguish null-free vs. nullable value types, so that for null-free we can optimize like qtypes and for nullable we can get back to Object performance.
Brian: Original MVT: got optimizations, but the cost was too high in terms of boxing for Object. What do we need to add back?
Remi: Not convinced we need to do something. Some programs pass null; some collections never see null.
Frederic: HashMap, List, Vectors, ArrayList - APIs where null has semantic meaning
tried to generate code to handle both nullable and null-free types with a single source, without continual null checks - couldn't do it
Remi: only have to enforce at public APIs, not every line
Karen: Since we are speaking about Generic Specialization, I need to ask Brian for help with requirements. In order to clarify what we need to do for Erased Generics, we do not need all the details of a Specialized Generics solution. However, we do need to have a model of direction that is viable.
We need to understand the transition backward compatibility requirements from Erased Generics to Specialized Generics so that we don't make this harder for ourselves.
Brian:
1. Migration VBC->VT, e.g. LocalDateTime
2. EG -> SG migration
In a perfect world, we would provide source and binary compatibility.
e.g. ArrayList: EG->SG
1. author: opt-in: change code
2. client: what do they have to do?
3. subclasses: what do they have to do?
It is imperative that there be no flag day. We will see a mix and match of the timetable in practice.
We would like the migrations to not pull the carpet from under clients and subclasses.
e.g. ArrayList should still have the same meaning, still allow nulls.
Karen: Can we take two different examples please?
1. Source code that currently has no built-in nullability assumptions
2. Source code that currently has APIs in which null has semantic meaning
Brian: During Model 1 exploration, explored JDK generic class author changes:
e.g. conditional code - tried with #ifref for reference vs. value - got out of control quickly, e.g. with multiple type variables
For nullability: the specializer could constant-fold if (T == null) to false if we knew T was non-null.
This allows writing generic code in a null-friendly style.
e.g. ArrayList: if T is a String, include the source logic: if (T != null)
when specializing ArrayList for a T that is never null: constant-fold to false in the class rewriter/instantiator
For examples with semantic nulls, e.g. Map.get()/put():
"Peeling": peel into erased and generic layers
default - all members in all layers. Add source annotations - which only make sense based on the type system, e.g. for references
Put Map.get() "in jail": only for an erased HashMap, not for HashMap over int; add other methods
Karen: What happens if an existing erased client tries to use a jailed method?
Brian: Generate a matching signature in the erased client that throws an NSME or other exception, so all species have all methods (ed. note - I think this means all species have all methods from the erased client; some species may have additional optional species-specific methods).
Brian: Today TreeSet: javac only lets you instantiate a TreeSet over a Comparable, otherwise it generates an exception.
Frederic: Map.put() - returns null if the entry did not already exist - how to handle?
Remi: Alternative - could return the default value of the type, if you have reified generics
Brian: Known problems with migrating existing libraries
Karen: just to double-check - if you have an existing library that does not have null assumptions, do we expect the clients to continue to work if we move from EG->SG and have a newly created Class?
Brian: This would be a good assumption.
Frederic: will the EG -> SG transition be explicit?
Brian: All transitions will be explicit
e.g. ArrayList<> -> ArrayList
e.g. instantiation - goal is not to change
e.g. client - goal - change code only if you want a new behavior
goal: Existing instantiation should continue to work with an existing client - both unchanged.
EG->SG: have to opt in
VBC->VC: have to opt in
Brian: Back to boxes. Need to handle in source
Remi: I don't think we need this, we can use class-dynamic
Brian: let's not deep-dive right now
Karen: My concern: EG->SG, with a new instantiator, existing clients may not work. They may get a compile-time exception. Existing binaries may get a runtime exception. I think this will be worse if we support EG over null-free types.
Brian: Do not want LW10 to support Generic<Point>, only nullable Point. We all agree!
That is why I asked Srikanth to reject value types for generic parameters today.
Remi: Need to provide EG with nullable VT.
Frederic: In LW1, there are lots of hacks that let you get null
Remi: To summarize: we want an easy plan to get null-free/nullable VT in the language/vm;
it is expensive to add general null-free/nullable types to the language
Frederic: want the same name
Brian: We will discuss the details of "straightening the wire", as John says - working all the details end-to-end next week: user model/language/class file/runtime
Remi: Need a backup plan: just for value types, which is less invasive.
My earlier email was describing the least-benefit plan: if we did not need nullable value types in the language and not in the class file, always erase in the vm
Brian: Actually we have a range of choices on how to do this.
Remi: MVT worked if you generated your own bytecode
Remi: Is a nullable VT the same as the box?
Brian: Box is a user model concept; users know its properties.
Karen: Help me with how boxes might work for primitives. One of my concerns with boxes is the assumed hard-coded duality.
Today we have int -> boxed to Integer.
If we want int -> box to a value type:
is this a null-free value type? do we then box again to get to a nullable value type?
or do we initially box to a nullable value type?
Would unbox then unbox to a null-free value type like other value types, or unbox to an int?
Remi: VM point of view and language point of view. Big mistake auto-boxing primitives to Object. If you know the target type is a specific type of box, then boxing is easy.
What if an int for the VM was a Value Type?
Karen: I do not expect current performance if you were to replace all primitives today with Value Types.
Let's not start our design with a different model for the language and vm, and vm magic.
Remi: already magic for primitives.
Frederic: not magic; there are well-defined behaviors for primitives.
Remi: Is the only place we need primitives as Object subtypes for specialized generics?
Brian: generics is the big pain point. Want to avoid needing 8 (or 9 if you count void) manual specializations.
Remi: Plan is to specialize the class for any value, not for a given type
(ed. note - did I hear you correctly? I am expecting possible specialization by type - e.g. ValInt vs. ValLong)
Dan S: Boxing question for Brian: Does the conversion allow identity conversion? i.e. is it an instance of a different class when you go from value -> value.box?
Brian: User's perception of the type system: distinct types, value and box; can freely convert, except null in one direction; explicit conversions, or conditions in the language
Frederic: Problem: null-free VT v1; nullable VT v2; v1 = v2; does getClass() return the same answer or not?
For the vm today, these point to the same thing. There is no copying on conversion. If you have two distinct types, you must copy.
Remi: you could change getClass behavior
Dan H: Issue with arrays.
Karen: With arrays, if you have an array with null-free/flattened elements -> convert to nullable elements, you have to copy.
Frederic: actually if each type is a different element type, then you have to create all new elements, not just copy.
Dan H: getClass may leak
Brian: int[] vs.
Integer[]: today you must copy.
No precedent for arrays in terms of expecting no copy with an explicit box; getComponentType returns a different type
Frederic: field analogy
Dan S: high-level suggestion: more natural to think of nullable/null-free using a widening/narrowing analogy rather than box/unbox
Remi: widening/narrowing requires subtyping
Dan S: it is an identity with checks; it does not assume a copy
Brian: Adjoint pair. Next week: new model - go through the cases in John's original email of "works like an int/long" - see if the widen/narrow model works
Remi: Really in the VM we have two different types: VT and nullable VT. Can we do this? What is the cost for the vm check?
Dan H: How do we see this in the vm? Descriptors? A bit? Other?
Remi: everywhere.
Dan H: The killer is the vtable - what are the inheritance/overriding rules?
Frederic: if we had two separate class files and bytecodes - sure, we could do this.
If you want one source and 2 types in the vm, there are many complexities, e.g. JVMTI - if you add a breakpoint, where do you want the breakpoint?
Karen: Please read JDK-8204937 - Type Operators in the JVM
We have three approaches for vm representation:
1. 2 different types
2. 2 different species
3. 2 different references to one type
Brian: Need to think about getClass (class file) vs. getCrass

Many thanks for a lively discussion!
Karen

From daniel.smith at oracle.com Wed Sep 12 23:46:31 2018
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 12 Sep 2018 17:46:31 -0600
Subject: JVM alternatives for supporting nullable value types
Message-ID: <9ACEDF04-2DD7-475A-925A-FB593EA62A31@oracle.com>

For LW10, one of our goals is to support interactions between value types and erased generics by having some form of a nullable value type.

The needs of the language factor heavily into the JVM design. We're not ready to commit to language-level details, but it's likely that the language will support nullable and non-nullable variations of the types declared by value classes; and these variations will probably be supported in most places that types can appear.

More generally, the language may support up to three different flavors of nullability on some or all types:
- null-free: a type that does not include null (could be spelled Foo!)
- null-permitting: a type that allows but ignores nulls (could be spelled Foo~)
- null-checked: a type that allows and checks for nulls (could be spelled Foo?)

(Please note that this is placeholder syntax. There are lots of ways to map this to real syntax. Unadorned names will map to one of these; it's possible that migrating a class to be a value class will change the interpretation of its unadorned name.)

Null-permitting and null-checked types are both "nullable"; the difference is in how strongly the compiler enforces null checks. ("Null-permitting" is the existing behavior for types like 'String'; "null-checked" is the style that requires proof that nulls are absent before dereferencing.)

The other important concept from the language is conversions:
- A widening conversion (or something similar) supports treating a value of a null-free type as null-permitting or null-checked
- A "null-free conversion" is required to go in the opposite direction, and includes a runtime null check
- A "nullability conversion", like an unchecked conversion, might allow other forms of conversions between types involving different nullabilities, including in their type arguments or array component type.
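(ed. note: the sketch below is not from the original message; it illustrates the placeholder syntax and the conversions just described, assuming a hypothetical value class Point.)

  Point! p = new Point(1, 2);   // null-free: p can never hold null
  Point~ q = p;                 // widening conversion to null-permitting: no runtime check
  Point? r = p;                 // widening conversion to null-checked: dereference requires a null-proof
  Point! s = (Point!) q;        // null-free conversion: includes a runtime null check (NPE on null)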
Turning to the JVM with those language-level concepts in mind, I've put together the following summary of four main designs we've considered. The goal here is not to reach a conclusion about which path is best, but to make sure we're accurately considering all of the implications in each case.

Nullable value types, null-free storage
---------------------------------------

In this approach, we use regular L types to represent value types, and these types are nullable. Fields and arrays, via some sort of modifier, may choose to be nullable or null-free.

JVM implications

- Need a mechanism (new opcode?) to indicate that an array allocation is null-free
- The default value of a field/array depends on whether the "null-free" modifier is used
- Fields and arrays that are marked null-free can, of course, be flattened
- Stack variables and method parameters/returns may always be null
- A putfield, putstatic, or aastore may fail with an NPE (or maybe ASE)
- The JIT can optimistically assume no nulls and scalarize, but must check and de-opt when a null is encountered
- The "null-free" modifier is only allowed with value class types, and must be validated early (e.g., to decide on field layout)

Compilation strategy

Val? maps to LVal;
Val~ maps to LVal;
Val! maps to LVal;

The nullability of the type in a field declaration or array creation expression determines whether the "null-free" modifier is used or not.

Nullability conversions are no-ops; null-free conversions are either compiled to explicit null checks or are implicit in an invoke*/getfield/putfield.

Language implications

- Null-free value types typically get flattened storage and scalarized invocations
- Array store runtime checks may include a null check
- Methods may not be overloaded on different nullabilities of the same type
- Null-free parameters/returns may be polluted with nulls due to inconsistent compilation or non-Java interop - detected with an NPE on storage or dereference
- A conversion from Val~[] to Val![] could be supported, but the result would not perform the expected runtime checks

Migration implications

- Refactoring a class to be a value class is a binary compatible change (except where this involves incompatible changes like removing a public constructor); before recompilation (which may reinterpret some unadorned names), treatment of nulls does not change
- Changing the nullability of a type is a binary compatible change; library clients who expect nullable storage may see surprising NPEs or ASEs

Always null-free value types
----------------------------

In this approach, we use regular L types to represent value types, and these types are null-free. Non-value L types continue to be nullable. A use-site attribute tracks which class names represent value classes; validation lazily ensures consistency with the declaration.
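(ed. note: a hedged sketch, not from the original message, of how source might compile under this approach, assuming a hypothetical value class Val listed in the value classes attribute.)

  // source (placeholder syntax)    // class file
  Val! v = makeVal();               // descriptor LVal; null-free, flattenable
  Val~ n = v;                       // descriptor Ljava/lang/Object; widening is a no-op
  Val! w = (Val!) n;                // checkcast Val - fails on null (CCE or NPE)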
JVM implications

- Fields, arrays, and method parameters and returns with value class types can be flattened/scalarized
- The 'null' verification type is not a subtype of any value class types
- Casts to value class types must fail on 'null' (CCE or NPE)
- At method preparation, field/method resolution, and class loading, a check similar to class loader constraints ensures that classes agree on value classes in the descriptor
- Various other vectors for getting data into the JVM should prevent nulls, or have contracts that allow crashing, etc., if data is corrupted
- Classes in the value classes attribute are allowed to be loaded early (e.g., to decide on field layout)
- If the value classes attribute does not mention a value class, it's possible for variables/fields of that type to be null, but an error will occur when an attempt is made to load the class or resolve against a class that disagrees

Compilation strategy

Val? maps to Ljava/lang/Object;
Val~ maps to Ljava/lang/Object;
Val! maps to LVal;

Every referenced value class is listed in the value classes attribute.

Nullability conversions are no-ops; null-free conversions are compiled to checkcasts (even for member access). Casts that target Val?/Val~ compile to a checkcast guarded by a null check, where null always succeeds.

Language implications

- Null-free value types typically get flattened storage and scalarized invocations
- Array store runtime checks may include a null check
- Val~[] and Val?[] do not perform array store checks at all - any Object may end up polluting these arrays (creating arrays of these types might be treated as an error, like T[])
- Val~ and Val? are overloading-hostile: their use in signatures conflicts with Object and all other null-permitting/null-checked value types
- Null-permitting/null-checked value type parameters and returns may be polluted with other types due to inconsistent compilation or non-Java interop - detected with a CCE on null-free conversion
- A conversion from Val~[] to Val![] cannot be allowed

Migration implications

- Refactoring a class to be a value class is a binary incompatible change due to inconsistent value class attributes
- Changing from a null-permitting/null-checked to null-free type (or vice versa) is a binary incompatible change unless there's some form of support for type migrations

Null-free types with new descriptors
------------------------------------

In this approach, we use regular L types to represent nullable value types, and introduce other types (spelled, say, with a "K") to represent null-free value types. K types are subtypes of L types, and casts can be used to convert from L to K.

JVM implications

- Descriptor syntax needs to support 'K'
- To support K casts, we need ClassRefs that indicate K-ness, a new opcode, or some other mechanism
- Fields, arrays, and method parameters and returns with K types can be flattened/scalarized
- The 'null' verification type is not a subtype of K types
- Casts to K types must fail on 'null'
- Various other vectors for getting data into the JVM should prevent nulls, or have contracts that allow crashing, etc., if data is corrupted
- Classes named by K types are allowed to be loaded early (e.g., to decide on field layout)

Compilation strategy

Val? maps to LVal;
Val~ maps to LVal;
Val! maps to KVal;

Nullability conversions are no-ops; null-free conversions are either compiled to explicit casts or are implicit in an invoke*/getfield/putfield.
Language implications

- Null-free value types typically get flattened storage and scalarized invocations
- Array store runtime checks may include a null check
- Methods may be overloaded with a null-free type vs. a null-permitting/null-checked type (but null-permitting vs. null-checked is not allowed)
- Pollution of null-free variables or arrays is impossible
- A conversion from Val~[] to Val![] cannot be allowed

Migration implications

- Refactoring a class to be a value class is a binary compatible change (except where this involves incompatible changes like removing a public constructor); before recompilation (which may reinterpret some unadorned names), treatment of nulls does not change
- Changing from a null-permitting/null-checked to null-free type (or vice versa) is a binary incompatible change unless there's some form of support for type migrations

Nullability notations on types
------------------------------

In this approach, we use regular L types to represent value types, and these types are nullable by default. To indicate that a particular field, array, or parameter/return is null-free, some form of side notation is used. (Deliberately using the word "notation" rather than "annotation" or "modifier" here to avoid committing to an encoding.)

This is similar to "nullable value types, null-free storage", except that the null-free notation can be used on method parameters/returns.

This is similar to "always null-free value types", except that instead of tracking value classes in each class file, we track null-free value types per use site.

This is similar to "null-free types with new descriptors", except that the notations are not part of descriptors and don't require any explicit conversions - they are not part of the verification type system.

JVM implications

- Need a mechanism to encode notations, both for descriptors and for array creations
- The default value of a field/array depends on whether the "null-free" notation is used
- Fields, arrays, and method parameters and returns that are marked null-free can be flattened/scalarized
- Stack variables may generally be null, unless a static analysis proves otherwise
- A putfield, putstatic, aastore, or method invocation may fail with an NPE (or maybe ASE)
- Method overriding allows nullability mismatches; calls must be able to dynamically adapt (e.g., through multiple v-table entries and VM-generated bridges)
- Types marked null-free are allowed to be loaded early (e.g., to decide on field layout)

Compilation strategy

Where '*' represents a side notation that a type is null-free:

Val? maps to LVal;
Val~ maps to LVal;
Val! maps to LVal;*

Nullability conversions are no-ops; null-free conversions are either compiled to explicit null checks or are implicit in an invoke*/getfield/putfield.
Language implications

- Null-free value types typically get flattened storage and scalarized invocations
- Array store runtime checks may include a null check
- Methods may not be overloaded on different nullabilities of the same type
- Pollution of null-free variables, arrays, or parameters/returns is impossible
- A conversion from Val~[] to Val![] could be supported, but the result would not perform the expected runtime checks

Migration implications

- Refactoring a class to be a value class is a binary compatible change (except where this involves incompatible changes like removing a public constructor); before recompilation, treatment of nulls does not change
- Changing the nullability of a type is a binary compatible change; library clients who expect a nullable API may see surprising NPEs or ASEs

From forax at univ-mlv.fr Thu Sep 13 01:26:40 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 13 Sep 2018 03:26:40 +0200 (CEST)
Subject: JVM alternatives for supporting nullable value types
In-Reply-To: <9ACEDF04-2DD7-475A-925A-FB593EA62A31@oracle.com>
References: <9ACEDF04-2DD7-475A-925A-FB593EA62A31@oracle.com>
Message-ID: <1263793689.1257227.1536802000322.JavaMail.zimbra@u-pem.fr>

Thanks Daniel!

There is another variation of the last design: instead of allowing the type notation on method parameters, you have a boolean in the ValueTypes attribute that indicates whether the value type is nullable or not. So you are only allowed to decide class-wide whether a value type is nullable or not; in terms of syntax, it's the equivalent of allowing !, ? and ~ only when importing the type (I know that you do not have to import a type; it's to explain how it works).

Nullability notations on types (using class-wide side notations)
-----------------------------------------------------------------

In this approach, we use regular L types to represent value types, and these types are nullable by default. To indicate that a particular field, array, or parameter/return is null-free, some form of side notation is used in the ValueTypes attribute.

JVM implications

- Use the ValueTypes attribute to encode nullable-ness notations
- The default value of a field/array depends on whether the "null-free" notation is used
- Fields, arrays, and method parameters and returns that are null-free can be flattened/scalarized
- Stack variables may generally be null, unless a static analysis proves otherwise
- A putfield, putstatic, aastore, or method invocation may fail with an NPE (or maybe ASE)
- Method overriding allows nullability mismatches; calls must be able to dynamically adapt (e.g., through multiple v-table entries and VM-generated bridges)
- Types marked null-free are allowed to be loaded early (e.g., to decide on field layout)

Compilation strategy

Where '*' represents a side notation that a type is null-free:

Val? maps to LVal;
Val~ maps to LVal;
Val! maps to LVal;*

Nullability conversions are no-ops; null-free conversions are either compiled to explicit null checks or are implicit in an invoke*/getfield/putfield.
Language implications

- Null-free value types typically get flattened storage and scalarized invocations
- Array store runtime checks may include a null check
- Methods may not be overloaded on different nullabilities of the same type
- Pollution of null-free variables, arrays, or parameters/returns is impossible
- A conversion from Val~[] to Val![] could be supported, but the result would not perform the expected runtime checks

Migration implications

- Refactoring a class to be a value class is a binary compatible change (except where this involves incompatible changes like removing a public constructor); before recompilation, treatment of nulls does not change
- Changing the nullability of a type is a binary compatible change; library clients who expect a nullable API may see surprising NPEs or ASEs

Rémi

----- Mail original -----
> De: "daniel smith"
> À: "valhalla-spec-experts"
> Envoyé: Jeudi 13 Septembre 2018 01:46:31
> Objet: JVM alternatives for supporting nullable value types

> [Dan Smith's message quoted in full - see above]
From forax at univ-mlv.fr Thu Sep 13 09:09:51 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 13 Sep 2018 11:09:51 +0200 (CEST)
Subject: Valhalla EG meeting notes Sep 12 2018
In-Reply-To:
References: <37F57FB5-B197-450E-868C-A3C564CA1B5E@oracle.com>
Message-ID: <1366714046.56743.1536829791650.JavaMail.zimbra@u-pem.fr>

Hi Tobias,

[Switching back to valhalla-spec-experts]

----- Mail original -----
> De: "Tobias Hartmann"
> À: "valhalla-dev"
> Envoyé: Jeudi 13 Septembre 2018 09:33:19
> Objet: Re: Valhalla EG meeting notes Sep 12 2018

> [Switching from valhalla-spec-experts to valhalla-dev]
>
> Just wanted to add my 2 cents to the JIT part of this discussion:
>
> On 13.09.2018 00:22, Karen Kinnear wrote:
>> Frederic: LW1 uses NO boxes today. JIT could not optimize boxes, can we consider
>> a model without boxes?
>
> +1
>
>> Brian: MVT, L&Q signatures were messy with boxes. With LWorld all those problems
>> go away.
>> Frederic: Question of the number of types in the vm.
>> Remi: Two types in the language level and 1 in VM.
>> Karen: Goal of this exercise is:
>> 1) user model requirements to support erased generics - requires null and
>> null-free references to value types
>> 2) JIT optimizations in general for value types (not for erased generics, but in
>> general code and in future in reified generics) - depend on null-free
>> guarantees.
>> Our goal: minimal user model disruption for maximal JIT optimization.
>
> As explained below, for the JIT it would be optimal to be able to statically
> (i.e. at compile time) distinguish between nullable and null-free value types.
> We could then emit highly optimized code for null-free value types and fall
> back to java.lang.Object performance (or even better) for nullable value types.
>
>> The way I read your email Remi - I thought you had user model disruption, but no
>> information passed to the JIT, so no optimization benefit.
>> Remi: If inlining, the JIT has enough information.

Let me clarify, because during that part of the conf call Frederic and I were not on the same planet. Here I was talking about a strategy to implement nullable value types at the language level; this is neither a discussion about implementing non-nullable value types in the language nor about implementing non-nullable value types in the VM.

This strategy is named "Always null-free value types" in Dan Smith's latest email; again, it's "null-free value types" from the language POV, not the VM POV.
> Yes, but even with aggressive inlining, we still need null checks/filtering at
> the "boundaries" to be able to optimize (scalarize) nullable value types:
> - Method entry with value type arguments
> - Calls of methods returning a value type
> - Array stores/loads
> - Field stores/loads
> - Loading a value type with a nullable (non-flattenable) field
> - Checkcast to a value type
> - When inlining method handle intrinsics through linkTo and casting Object
>   arguments to value type
> - OSR entry with a live value type
> - Every place where we can see constant NULL for a value type in the bytecodes
> - Some intrinsics

Yes, for non-nullable value types; for nullable value types the idea is to erase them to Object (or their first super-interface).

>> Frederic: Actually with field and method signatures it makes a huge difference
>> in potential JIT optimizations.
>
> Yes, it would make the above null filtering unnecessary for null-free value types.
>
>> Remi: Erased generics should have ok performance
>> Frederic: We are talking about performance without generics - we want full
>> optimization there.

Here you can see the mis-communication issue in plain sight: Frederic is thinking about the semantics of non-nullable value types in the VM.

>> Remi: what happens in LW1 if you send null to a value type generic parameter?
>> Frederic: LW1 supports nullable value types. We want to guarantee/enforce the
>> null-free vs. nullable distinction in the vm.
>
>> Remi: there are two kinds of entry points in the JIT'd code
>> Frederic: 2: i2c which does null checks and if null calls the interpreter, c2c -
>> disallows nulls.
>> editor's note: did you get the implication - if we see a null, you are stuck in
>> the interpreter today because we all rely on dynamic checks.
>
> Yes, that's an important point.
>
>> Remi: For the vm, if you have a real nullable VT the JIT can optimize/deopt/reopt
>> Frederic: this is brittle, lots of work, and uncertain

Yes, I fully agree with Frederic; I can live with non-first-class support of nullable value types in the VM.

> That's what we currently have with LW1. Although the language exposure is
> limited by javac, null value types are *fully* supported in the VM/JIT but with
> a huge performance impact. We deoptimize when encountering NULL but do not
> attempt to re-compile without scalarization. We could do that, but that would
> mean that whenever you (accidentally) introduce NULL into your well-written,
> highly optimized value type code, performance will drop significantly and stay
> at that level.

The fact that LW1 does null checks for value types is an independent matter in my opinion; we need these null checks for migration, not necessarily for supporting nullable value types.

> Of course, there are ways to optimize this even without null-free value types.
> For example by profiling and speculation on nullness. Or by having two compiled
> versions of the same method, one that supports nullable value types by passing
> them as pointers (no deoptimization) and one that scalarizes null-free value
> types to get peak performance.
>
> But as Frederic mentioned, these approaches are limited, complex and the gain is
> uncertain. It's maybe similar to escape analysis: We might be able to improve
> performance under certain conditions but it's very limited. I think the whole
> point of value types is performance, so we should try to get that right.

Technically it's better than "plain" escape analysis, because there is no identity, so you can re-box only at the edges.
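(ed. note: a hedged sketch, not from the original message, of the erasure strategy Remi describes, with a hypothetical value class Val and hypothetical map/process helpers: a nullable value type is compiled as Object, and a supplementary cast brings it back to the null-free type.)

  // source (placeholder syntax)
  Val~ cached = map.get(key);    // erased: descriptor Ljava/lang/Object;
  if (cached != null) {
      Val! v = (Val!) cached;    // supplementary checkcast to LVal; null never reaches it
      process(v);                // process(LVal;) can be fully scalarized by the JIT
  }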
Anyway, there is a bigger issue. As John said, it means you have to support really weird semantics for method calls, because each parameter whose type is a value type can be null or not. If you want to specialize on that, you get a combinatorial explosion of all the possibilities (2^n, with n the number of value types in the parameter list). You can try to be lazy about it, but if you think in terms of vtables, a method has to support its own specialization plus all the specializations of the overridden methods.

> To summarize: If we just need to support null value types and are fine with null
> screwing up performance and having weird side effects, we are basically done today. If null
> value types need to perform well (i.e. similar to j.l.Object), optimally we would need to be able to
> statically distinguish between nullable and null-free value types.

As I said above, I'm fine with the current semantics of non-nullable value types in the VM, because if a null appears it's because of separate compilation/migration.

I like your last sentence, because it's the whole point of the strategy to erase nullable value types in Java to Object in the classfile: a nullable value type will perform as well as java.lang.Object, so instead of trying to introduce a way to denote nullable value types in the classfile, let's erase nullable value types to Object.
Obviously, I'm lying here, because if you erase something to Object you need a supplementary cast, and when you erase something you can have method signature clashes, so it's a trade-off. But the advantage of this proposal is that the VM doesn't have to bother to understand nullable value types, and having a simple JVM spec is a HUGE win.

>> Remi: VM does not want nullable value types
>> Frederic: VM wants to be able to distinguish null-free vs. nullable value types,
>> so for null-free we can optimize like qtypes and for nullable, we can get back to
>> Object performance.
>
> Yes, exactly.

Both sentences are true :)
Those are different strategies.

> Best regards,
> Tobias

regards,
Rémi

From tobias.hartmann at oracle.com Fri Sep 14 07:02:24 2018
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 14 Sep 2018 09:02:24 +0200
Subject: Valhalla EG meeting notes Sep 12 2018
In-Reply-To: <1366714046.56743.1536829791650.JavaMail.zimbra@u-pem.fr>
References: <37F57FB5-B197-450E-868C-A3C564CA1B5E@oracle.com> <1366714046.56743.1536829791650.JavaMail.zimbra@u-pem.fr>
Message-ID: <2dfc309d-fb89-4bfb-cd6c-8bffafcfb2a9@oracle.com>

Hi Remi,

thanks a lot for the clarifications! Makes sense to me now.

Best regards,
Tobias

On 13.09.2018 11:09, Remi Forax wrote:
> [Remi's message of 13 September quoted in full - see above]
>
>>
>>> Remi: VM does not want nullable value types
>>> Frederic: VM wants to be able to distinguish null-free vs. nullable value types,
>>> so for null-free we can optimize like qtypes and for nullable, we can get back to
>>> Object performance.
>>
>> Yes, exactly.
>
> Both sentences are true :)
> Those are different strategies.
>
>>
>> Best regards,
>> Tobias
>
> regards,
> Rémi

From forax at univ-mlv.fr Tue Sep 18 10:32:16 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 18 Sep 2018 12:32:16 +0200 (CEST)
Subject: Reified generics - shadow class edition
Message-ID: <1329608472.207285.1537266736467.JavaMail.zimbra@u-pem.fr>

Reified generics - shadow class edition.

I believe that trying to make method descriptors variant is a bad idea; it
comes from the model 1...3 experimentation, but it's an artifact of such
implementations, not a concept.
Here I describe a way to keep generics erased even if they are reified.

If the descriptor is erased, we need a way to get the reified type argument at
runtime, so you can use 'checkcast' to verify that the arguments that are
parameterized have the right type.
For example,

class Holder<E> {
    E element;

    // can be specialized to throw a NoSuchMethodError if E is void
    E get() {
        return element;
    }

    void set(E element) {
        this.element = element;
    }
}

will be translated into

class Holder {
    Object element;

    Object get() {
        return element;
    }

    void set(Object element) {
        // checkcast: verify that element is an instance of the reified
        // type argument of E here
        this.element = element;
    }
}

Now to bridge the gap, we also need:
- a way to explain to the VM at runtime that the field 'element' is specialized
  (if it's a value type)
- a way to explain to the VM at runtime that the methods get and set have
  different implementations

For that I propose a new mechanism in the VM, called master class/shadow class,
which is a way to define a specialized class, the shadow class, from a template
class, the master class. In my example, Holder is the master class and
Holder<Complex>, with Complex a value type, is a shadow class 'derived' at
runtime from the master class.

This mechanism is more general than just supporting type specialization in the
VM because
- we do not want to inject the Java generics semantics in the VM, or the Scala
  semantics, or the Kotlin semantics, etc.
- we can support more use cases, so other languages can, for example, associate
  a constant (an int) to a class like in C++.

So the idea is to introduce two things that work together:
1) implement in the VM a mechanism that allows adding constant objects as
   supplementary values (class data) when defining a class
2) use a bootstrap method (to "go meta" as John said) to allow specializing
   the fields and methods of such a class

Those two features may be cleanly separated in the future, but I'm not sure how
to do that, so for now let's say they are two parts of the same feature, the
master class/shadow class feature.

For (1), we need a class file attribute that describes the class of each class
datum; we don't need to name them, it can be positional (for Java generics we
may introduce another attribute, or re-use an existing one, to find the names
of the class data if they are type parameters).
For (2), we need to specify a bootstrap method that will be called to describe
how the specialization should be done.

Considering (1) and (2) as a unique feature means you can have the same class
attribute defining the class data and the bootstrap method.
The MasterClass attribute:

MasterClass_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 number_of_class_data;
    {   u2 descriptor;
        u2 default_value;
    } class_data[number_of_class_data];
    u2 bootstrap_method_attr_index;
    u2 name_and_type_index;
}

The class data descriptor is a field descriptor that describes the class of the
class datum; it should be a class among int, long, float, double, String,
MethodType, MethodHandle, i.e. the types of constants that can appear in the
constant pool.
The default value is a constant pool item that defines the value that will be
used if the shadow class is created with no class data.
The bootstrap method is called to derive a shadow class from a master class if
the shadow class has not been created yet. The bootstrap method takes a Lookup
configured on the master class, a name, a Class (the type of the name_and_type)
and an array of Object containing the class data as parameters (plus optional
bootstrap arguments), and returns a reference to a java.lang.invoke.Classy.
The type of the name_and_type has to be a subtype of java.lang.invoke.Classy.

The interface java.lang.invoke.Classy describes how to specialize a shadow
class from a master class.

interface Classy {
    Class superclass();
    Class[] interfaces();
    String fieldDescriptor(String field, String descriptor);
    MethodHandle method(String name, String descriptor);
}

superclass() returns the superclass of the shadow class; it has to be a
specialization of the master class's superclass (a subtype of it) or the master
class's superclass itself.
interfaces() returns the interfaces of the shadow class; each interface has to
be a subtype of the corresponding master class interface, or that interface
itself.
fieldDescriptor() is called for each field of the master class, with the field
name and the field descriptor of the master class; it returns the field
descriptor of the corresponding field of the shadow class, which must be a
subtype of the master class field. If null is returned, it means the field
doesn't exist, and a NoSuchFieldError will be thrown upon access.
method() is called for each method of the master class, with the method name
and its method descriptor; it returns a method handle corresponding to the
specialization of the master class method in the shadow class. The method
handle type has to be exactly the same as the descriptor sent as parameter. If
null is returned, it means the method doesn't exist, and a NoSuchMethodError
will be thrown upon access.

The idea here is that a shadow class is a covariant variant of the master
class: a field can be replaced by a subtype, and a method can be replaced by a
specialized variant with the same parameter types. This allows any shadow class
member to be accessed using any opcode that takes the master class as owner:
getfield, putfield, all invoke* opcodes. For getfield, a value type can be
buffered by the VM to Object/an interface. For putfield, the VM has to perform
an extra check at runtime (like there is an extra check for arraystore, because
arrays are covariant).

The interface Classy can be used by the VM at any point in time, so calls to
method() can be lazy or not (the other pieces of information are needed to
determine the layout of the class, so they cannot be retrieved lazily).

At runtime, for the VM, an instance of a shadow class is a subtype of the
master class.

The fact that the shadow class is a subtype of the master class allows
wildcards in Java to be desugared as the master class.
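[To see how the pieces fit, here is a hypothetical implementation of Classy for
the Holder example. Classy does not exist in java.lang.invoke; the 'Q'
descriptor spelling is borrowed from the earlier MVT/QType experiments purely
for illustration; and HolderClassy, its constructor arguments, and the use of
Complex.class are all invented.]

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

interface Classy {   // as proposed above; not a real JDK interface
    Class superclass();
    Class[] interfaces();
    String fieldDescriptor(String field, String descriptor);
    MethodHandle method(String name, String descriptor);
}

final class HolderClassy implements Classy {
    private final MethodHandles.Lookup lookup;  // passed by the bootstrap method
    private final Class elementType;            // the class datum, e.g. Complex.class

    HolderClassy(MethodHandles.Lookup lookup, Class elementType) {
        this.lookup = lookup;
        this.elementType = elementType;
    }

    public Class superclass()   { return Object.class; }  // unchanged
    public Class[] interfaces() { return new Class[0]; }  // unchanged

    // Narrow the erased Object field to the reified type argument so the
    // VM can flatten it in the shadow class layout.
    public String fieldDescriptor(String field, String descriptor) {
        if (field.equals("element"))
            return "Q" + elementType.getName().replace('.', '/') + ";";
        return descriptor;
    }

    // Keep the erased bodies: look each method up in the master class.
    // (Real code would have to adapt the handle, since findVirtual
    // prepends the receiver to the handle's type, while the proposal
    // requires the type to match the descriptor exactly.)
    public MethodHandle method(String name, String descriptor) {
        try {
            MethodType mt = MethodType.fromMethodDescriptorString(
                    descriptor, lookup.lookupClass().getClassLoader());
            return lookup.findVirtual(lookup.lookupClass(), name, mt);
        } catch (ReflectiveOperationException e) {
            return null;   // per the proposal: NoSuchMethodError on access
        }
    }
}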
A shadow class has no special encoding in the bytecode; it only has a
representation in the runtime data structures of the VM.

In order to be backward compatible, java.lang.Class is extended to also
represent shadow classes; java.lang.Class is extended with the following
methods:
- Class withClassData(Object... data), which returns the shadow class of a
  master class.
- Object[] getClassData(), which returns the class data of a shadow class, or
  null.
- boolean isMasterClass(), which returns whether the current class is a master
  class.
- Class getMasterclass(), which returns the master class of a shadow class, or
  the current class otherwise (a classical class is its own master class).

Reusing java.lang.Class to represent shadow classes at runtime is important
because it allows reflection and java.lang.invoke to work seamlessly with
shadow classes: from a user's point of view, a classical class and a shadow
class are both a java.lang.Class.

There is a compatibility issue with Object.getClass(), isInstance, instanceof
and checkcast: they cannot return/use the shadow class, because code like
o.getClass() == ArrayList.class or o instanceof ArrayList will not work if the
comparison uses the shadow class. This means that getClass(), instanceof and
checkcast need to check the master class of the shadow class instead of using
the shadow class directly.
Note that this problem is not inherent to the shadow class; it's an artifact of
the fact that the type argument is reified.

This means that we have to introduce at least a supplementary method for
getClass(); a static method in Class, Class.getTheTrueRealClass(Object o), is
enough. It also means that if we want to allow reified cast/instanceof in
Java/.class notation, this will have to be implemented using
invokedynamic/condy (again, to avoid bolting the Java generics semantics into
the VM). We may also choose to not support reified cast/instanceof in Java,
given that being able to specialize fields/methods is more important in terms
of performance, and that we will not support reified generics of objects
anyway.

The fact that a shadow class has no representation in the classfile means that
we are losing information, because if ArrayList is anyfied, in

ArrayList<Complex> list = ...
list.get(3)

list.get() is encoded in the bytecode as a call to the master class ArrayList
and not a call to the shadow class ArrayList<Complex>. So a call on an anyfied
generic is still erased, but given that this information is available at
runtime (the inline cache stores the shadow class), a JIT can easily inline the
call.

With the classfile only containing classical descriptors, in terms of opcodes
we only need to add support for a few operations:
- new on an anyfied class
- new on an anyfied array
- invocation of an anyfied method.
For all these operations, the idea is to send the class data (method data) by
storing them on the stack, and to have a bytecode that describes them as class
data/method data.
We also need a way to get the method data from inside the called method.

I propose to introduce two new opcodes, dataload and datastore:
- dataload is constructed with a concatenation of field descriptors as
  parameter (or a method descriptor with no parens and no return type); it
  takes all the corresponding values on the stack and stores them in a side
  channel.
- datastore also takes a concatenation of field descriptors as parameter and
  extracts the data from the side channel to the stack.
dataload is used as a prefix of anew and anewarray, to pass the class data that
will be used to build the shadow class (if not already created).
dataload is used as a prefix of all invoke* bytecodes, to pass the method data.

We also need a special reflection method in Thread, getMethodData(), that
returns the method data associated with the current method as an array, or null
if no method data was passed when the method was called.

Note that when invokedynamic is prefixed by a dataload, the bootstrap method
has no access to the data; only the target of the callsite will see the method
data.

To summarize, I propose to implement reified generics in the VM by introducing
the notion of shadow class: a class only available at runtime that has
associated class data and a user-defined way to specialize fields and methods
at runtime. The main advantage of the solution is that not only will old
classes be able to use anyfied generics, but old code will also be optimized by
JITs as if it were new code.

regards,
Rémi

From forax at univ-mlv.fr Tue Sep 18 10:37:45 2018
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 18 Sep 2018 12:37:45 +0200 (CEST)
Subject: Reified generics - shadow class edition
In-Reply-To: <1329608472.207285.1537266736467.JavaMail.zimbra@u-pem.fr>
References: <1329608472.207285.1537266736467.JavaMail.zimbra@u-pem.fr>
Message-ID: <1205675994.207756.1537267065926.JavaMail.zimbra@u-pem.fr>

Errata: the method Classy.fieldType() (previously fieldDescriptor()) should
return a Class, not a field descriptor.

interface Classy {
    Class superclass();
    Class[] interfaces();
    Class fieldType(String field, String descriptor);
    MethodHandle method(String name, String descriptor);
}

Rémi
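[For illustration, a hedged usage sketch of the reflection surface proposed in
the mail above. None of the methods shown on java.lang.Class exist today, so
this cannot compile; Holder and Complex are the proposal's running examples.]

static void demo() throws ReflectiveOperationException {
    Class master = Holder.class;
    Class shadow = master.withClassData(Complex.class);  // derive Holder<Complex>

    assert !shadow.isMasterClass();
    assert shadow.getMasterclass() == master;
    assert shadow.getClassData()[0] == Complex.class;

    // getClass()/instanceof keep answering in terms of the master class:
    Object h = shadow.getDeclaredConstructor().newInstance();
    assert h.getClass() == Holder.class;             // not the shadow class
    assert Class.getTheTrueRealClass(h) == shadow;   // proposed static method
}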
From karen.kinnear at oracle.com Tue Sep 25 13:02:13 2018
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Tue, 25 Sep 2018 09:02:13 -0400
Subject: No Valhalla EG meeting Wednesday Sept 26
Message-ID: <2D1BBECF-DD00-42F6-8414-4155F03EDC6B@oracle.com>

We will NOT be meeting Wednesday Sept 26; our next meeting will be October 10.

Many thanks to those who joined us for the offsite in Burlington, both
physically and via Zoom. I will send a summary of the offsite as soon as I can.

thanks,
Karen