From bjornvar at ca.ibm.com Wed Jun 1 18:52:08 2016
From: bjornvar at ca.ibm.com (Bjorn B Vardal)
Date: Wed, 1 Jun 2016 18:52:08 +0000
Subject: species static prototype
In-Reply-To: <5748B465.3060308@oracle.com>
References: <5748B465.3060308@oracle.com>
Message-ID: <20160601185214.A5BB813603C@b03ledav002.gho.boulder.ibm.com>

An HTML attachment was scrubbed...

From brian.goetz at oracle.com Wed Jun 1 18:56:28 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 1 Jun 2016 14:56:28 -0400
Subject: species static prototype
In-Reply-To: <20160601185214.A5BB813603C@b03ledav002.gho.boulder.ibm.com>
References: <5748B465.3060308@oracle.com> <20160601185214.A5BB813603C@b03ledav002.gho.boulder.ibm.com>
Message-ID: <6874e5dd-2966-33ce-caf5-4bb8afd76234@oracle.com>

On 6/1/2016 2:52 PM, Bjorn B Vardal wrote:
> Will the users be able to write their own ?
>
>     class Foo {
>         __species {
>             ...
>         }
>     }
>

I would assume so; even if we don't support a __species { } block, the user can still contribute to the species initialization with field initializers:

    __species int x = 3;

So I see no reason not to adopt symmetry with static here.

> Your access bridge solution using species methods looks fine, but are
> we not solving that with nest mates?

We now have two credible solutions. Before we had species-static, nestmates were basically a forced move; now it's an optional move.

> I'm also wondering whether the following are typos, or if I
> misunderstood them:
>
>   * TestResolution.m_I() was not meant to be decorated with '__species'
>   * TestForwardRef2.s1_S and TestForwardRef2.s2_SS don't have the
>     correct modifiers, or should not be error cases.
>   * TestTypeVar.m_I() was not meant to be decorated with '__species'
>

I'll let Maurizio answer these.

From maurizio.cimadamore at oracle.com Wed Jun 1 20:19:22 2016
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 1 Jun 2016 21:19:22 +0100
Subject: species static prototype
In-Reply-To: <20160601185214.A5BB813603C@b03ledav002.gho.boulder.ibm.com>
References: <5748B465.3060308@oracle.com> <20160601185214.A5BB813603C@b03ledav002.gho.boulder.ibm.com>
Message-ID: <574F434A.4020303@oracle.com>

On 01/06/16 19:52, Bjorn B Vardal wrote:
> Will the users be able to write their own ?
>
>     class Foo {
>         __species {
>             ...
>         }
>     }
>

Hi Bjorn,
Yep - that is supported.

> Your access bridge solution using species methods looks fine, but are
> we not solving that with nest mates?
> I'm also wondering whether the following are typos, or if I
> misunderstood them:
>
>   * TestResolution.m_I() was not meant to be decorated with '__species'
>

Right - that's a typo, the 'species' modifier was meant to be omitted (i.e. it's an instance method).

>
>   * TestForwardRef2.s1_S and TestForwardRef2.s2_SS don't have the
>     correct modifiers, or should not be error cases.
>

Yeah - missing static and species there - in general members with _S are meant to be static, those with _SS are meant to be 'species'.

>
>   * TestTypeVar.m_I() was not meant to be decorated with '__species'
>

Yep - same as above.

Sorry for the typos!

Maurizio

> --
> Bjørn Vårdal
> IBM Runtimes
>
> ----- Original message -----
> From: Maurizio Cimadamore
> Sent by: "valhalla-spec-experts"
> To: valhalla-spec-experts at openjdk.java.net
> Cc:
> Subject: species static prototype
> Date: Fri, May 27, 2016 4:56 PM
>
> Hi,
> over the last few days I've been busy putting together a prototype
> [1, 2] of javac/runtime support for species static. I guess it
> could be considered a prototype implementation of the approach
> that Bjorn has described as "Repurpose existing statics" [4] in
> his nice writeup. Here's what I have learned during the experience.
> > Parser > ==== > > The prototype uses a no fuss approach where '__species' is the > modifier to denote species static stuff (of course a better syntax > will have to be picked at some point, but that's not the goal of > the current exercise). This means you can write: > > class Foo { > String i; //instance field > static String s; //static field > __species String ss; //species static field > } > > This is obviously good enough for the time being. > > A complication with parsing occurs when accessing species members; > in fact, species members can be accessed via their fully qualified > type (including all required type-arguments, if necessary). > > Foo.ss; > Foo.ss; > > The above are all valid species access expression. Now, adding > this kind of support in the parser is always tricky - as we have > to battle with ambiguities which might pop up. Luckily, this > pattern is similar enough to the one we use for method references > - i.e. : > > Foo::ss > > Which the compiler already had to special case; so I ended up > slightly generalizing what we did in JDK 8 method reference > parsing, and I got something working reasonably quick. But this > could be an area where coming up with a clean spec might be tricky > (as the impl uses abundant lookahead to disambiguate this one). > > Resolution > ====== > > The basic idea is to divide the world in three static levels, > whose properties are summarized in the table below: > enclosing type enclosing instance > instance yes yes > species yes no > static no no > > > So, in terms of who can access what, it follows that if we > consider 'instance' to be the highest static level and 'static' to > be the lowest, then it's ok for a member with static level S1 to > access another member of static level S2 provided that S1 >= S2. 
> Or, with a table: > from/to instance species static > instance yes yes yes > species no yes yes > static no no yes > > > > So, let's look at a concrete example: > > class TestResolution { > static void m_S() { > m_S(); //ok > m_SS(); //error > m_I(); //error > } > > __species void m_SS() { > m_S(); //ok > m_SS(); //ok > m_I(); //error > } > > __species void m_I() { > m_S(); //ok > m_SS(); //ok > m_I(); //ok > } > } > > A crucial property, of course, is that species static members can > reference to any type vars in the enclosing context: > > class TestTypeVar { > static void m_S() { > X x; //error > } > > __species void m_SS() { > X x; //ok > } > __species void m_I() { > X x; //ok > } > } > > Nesting > ===== > > Another concept that needs generalization is that of allowed > nesting; consider the following program: > > class TestNesting1 { > class MemberInner { > static String s_S; //error > String s_I; //ok > } > > static class StaticInner { > static String s_S; //ok > String s_I; //ok > } > } > > That is, the compiler will only allow you to declare static > members in toplevel classes or in static nested classes (which, > after all, act as toplevel classes). Now that we are adding a new > static level to the picture, how are the nesting rules affected? > > Looking at the table above, if we consider 'instance' to be the > highest static level and 'static' to be the lowest, then it's ok > for a member with static level S1 to declare a member of static > level S2 provided that S1 <= S2. Again, we can look at this in a > tabular fashion: > declaring/declared instance species static > instance yes no no > species yes yes no > static yes yes yes > > > This also seems like a nice generalization of the current rules. 
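
The access table and the nesting table above both reduce to a single comparison on one ordering of the three static levels. The following is a minimal Java sketch of that reading; the enum `StaticLevel` and the method names `canAccess` and `canDeclare` are hypothetical and are not taken from the prototype.

```java
/**
 * Hypothetical model of the three static levels.
 * Declaration order gives static < species < instance.
 */
enum StaticLevel {
    STATIC, SPECIES, INSTANCE;

    /** Access rule: a member at level S1 may access a member at level S2 iff S1 >= S2. */
    boolean canAccess(StaticLevel target) {
        return this.compareTo(target) >= 0;
    }

    /** Nesting rule: a class at level S1 may declare a member at level S2 iff S1 <= S2. */
    boolean canDeclare(StaticLevel member) {
        return this.compareTo(member) <= 0;
    }
}
```

Under this model the two rules are mirror images of each other, which is one way to see why the nesting rule keeps lookup into lexically enclosing classes legal.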
> The rationale behind these rules is basically to guarantee some
> invariants during member lookup; let's say that we are in a nested
> class with static level S1 - then, by the rule above, it follows
> that any member nested in this class will be able to access
> another member with static level S1 declared in this class or in
> any lexically enclosing class.
>
> A full example of nesting rules is given below:
>
> class TestNesting2 {
>     class MemberInner {
>         static String s_S; //error
>         __species String s_SS; //error
>         String s_I; //ok
>     }
>
>     __species class SpeciesInner {
>         static String s_S; //error
>         __species String s_SS; //ok
>         String s_I; //ok
>     }
>
>     static class StaticInner {
>         static String s_S; //ok
>         __species String s_SS; //ok
>         String s_I; //ok
>     }
> }
>
> Unchecked access
> ===========
>
> Because of an unfortunate interplay between species and erasure,
> code using species members is potentially unsound (the example
> below is a variation of an example first posted by Peter [3] on
> this very mailing list):
>
> public class Foo {
>     __species T cache;
> }
>
> Foo.cache = "Hello";
> Integer i = Foo.cache; //whoops
>
> To prevent cases like these, the compiler implements a check which
> looks at the qualifier of a species access; if such a qualifier
> (either explicit or implicit) cannot be proven to be reifiable,
> an unchecked warning is issued.
>
> Note that it is possible to restrict such warnings to cases
> where the signature of the accessed species static member changes
> under erasure. E.g. in the above example, accessing 'cache' is
> unchecked, because the type of 'cache' contains type-variables;
> but if another species static field was accessed whose type did
> not depend on type-variables, then the access should be considered
> sound.
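
The unsoundness described above can be reproduced in today's Java, where erasure leaves a single slot behind every parameterization. This is only an analogy under stated assumptions: the class `ErasedCache` and its methods are hypothetical, and a plain `static Object` field stands in for a species field reached through a non-reifiable qualifier.

```java
class ErasedCache {
    // One slot shared by every type argument: a stand-in (assumption) for a
    // species field that is accessed through an erased qualifier.
    private static Object cache;

    static <T> void put(T value) {
        cache = value;
    }

    @SuppressWarnings("unchecked")
    static <T> T get() {
        return (T) cache; // unchecked cast: nothing is verified at this point
    }

    /** Returns true if reading the String back as an Integer fails at the use site. */
    static boolean demoPollution() {
        ErasedCache.<String>put("Hello");
        try {
            Integer i = ErasedCache.<Integer>get(); // compiler-inserted cast fails here
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }
}
```

The failure surfaces at the assignment rather than inside `get()`, which is the kind of delayed error the unchecked warning is meant to flag.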
> > > Species initializers > =========== > > In our model we have three static levels - but we have > initialization artifacts for only two of those; we need to fix that: > instance > species > static > > > > That is, a new method is added to a class containing one > or more species variables with an initializer. This method is used > to hoist the initialization code for all the species variables. > > Forward references > ============ > > Rules for detecting forward references have to be extended > accordingly. A forward reference occurs whenever there's an > attempt to reference a variable from a position P, where the > variable declaration occurs in a position P' > P. Currently, the > rules for forward references allow an instance variable to > forward-reference a static variable - as shown below: > > class TestForwardRef { > String s = s_S; > static String s_S = "Hello!"; > } > > The rationale behind this is that, by the time we see the instance > initializer for 's' we would have already executed the code for > initializing 's_S' (as initialization will occur in different > methods, and respectively, see section above). > With the new static level, the forward reference rules have to be > redefined according to the table below: > > from/to instance species static > instance forward ref ok ok > species illegal forward ref ok > static illegal illegal forward ref > > > In other words, it's ok to forward reference a variable whose > static level is lower than that available where the reference > occurs. An example is given below: > > class TestForwardRef2 { > String s1_I = s_S; //ok > String s2_I = s_SS; //ok > > String s1_S = s_S; //error! > > String s1_SS = s_S; //ok > String s2_SS = s_SS; //error! > > static String s_S = "Hello!"; > __species String s_SS = "Hello Species!"; > } > > This is an extension of the above principle: since instance > variables are initialized in , they can reference variables > initialized in or . 
> If a variable is initialized in the species initializer, it can
> similarly safely reference a variable initialized in the static
> initializer. Another way to think of this is that a
> forward reference error only occurs if the static level of the
> referenced symbol is the same as the static level where the
> reference occurs. All other cases are either illegal (i.e. because
> it's an attempt to go from a lower static level to a higher one)
> or valid (because it can be guaranteed that the code initializing
> the referenced variable has already been executed).
>
> Code generation
> ==========
>
> Javac currently emits invokestatic/getstatic/putstatic for both
> legacy static and species static access. For a species access, javac
> uses the 'owner' field of the CONSTANT_MethodRef and CONSTANT_FieldRef
> constants to point to the sharp type of the access (through a
> constant pool type entry). Static access will always see an erased
> owner.
>
> Consider this example:
>
> class TestGen {
>     __species void m_SS() { }
>     static void m_S() { }
>
>     public static void main(String args) {
>         TestGen.m_SS();
>         TestGen.m_SS();
>         TestGen.m_S();
>         TestGen.m_S();
>     }
> }
>
> The generated code in the 'main' method is reported below:
>
>     0: invokestatic #11 // Method TestGen<_>.m_SS:()V
>     3: invokestatic #15 // Method TestGen.m_SS:()V
>     6: invokestatic #18 // Method TestGen<_>.m_S:()V
>     9: invokestatic #18 // Method TestGen<_>.m_S:()V
>
> As can be seen, species static access can cause a sharper type
> to end up in the 'owner' field of the member reference info; on
> the other hand, a static access always leads to an erased 'owner'.
>
> Another detail worth mentioning is how __species is represented in
> the bytecode. Given the current lack of flag bits I've opted to
> use the last remaining bit 0x8000 - this is in fact the last
> unused bit that can be shared across class, field and method
> descriptors.
Actually, this bit has already been used to encode > the ACC_MANDATED flag in the MethodParameters attribute (as of JDK > 8) - but since there's no other usage of that flag configuration > outside MethodParameters it would seem safe to recycle it. Of > course more compact approaches are also possible, but they would > lead to different flag configurations for species static fields, > methods and classes. > > Specialization > ========= > > Specializing species access is relatively straightforward: > > * both instance and species static members are copied in the > specialization > * static members are only copied in the erased specialization (and > skipped otherwise) > * ACC_SPECIES classes become regular classes when specialized > * ACC_SPECIES methods/fields become static methods/fields in the > specialization > * becomes the new in the specialization (and is > omitted if the specialization is the erased specialization) > > The last bullet requires some extra care when handling the > 'erased' specialization; consider the following example: > > class TestSpec { > static String s_S = "HelloStatic"; > __species String s_SS = "HelloSpecies"; > } > > This class will end up with the following two synthetic methods: > > static void (); > descriptor: ()V > flags: ACC_STATIC > Code: > stack=1, locals=0, args_size=0 > 0: ldc #8 // String HelloStatic > 2: putstatic #14 // Field > s_S:Ljava/lang/String; > 5: ldc #16 // String HelloSpecies > 7: putstatic #19 // Field > s_SS:Ljava/lang/String; > 10: return > > species void (); > descriptor: ()V > flags: ACC_SPECIES > Code: > stack=1, locals=1, args_size=1 > 0: ldc #16 // String HelloSpecies > 2: putstatic #19 // Field > s_SS:Ljava/lang/String; > 5: return > > As it can be seen, the method contains initialization > code for both static and species static fields! To understand why > this is so, let's consider how the specialized bits might be > derived from the template class following the rules above. 
Let's > consider a specialization like TestSpec: in this case, we > need to drop (it's a static method and TestSpec is > not an erased specialization), and we also need to rename > as in the new specialization. All is fine - the > specialization will contain the relevant code required to > initialize its species static fields. > > Let's now turn to the erased specialization TestSpec<_> - this > specialization receives both static and species static members. > Now, if we were to follow the same rules for initializers, we'd > end up with two different initializer methods - both and > . We could ask the specializer to merge them somehow, but > that would be tricky and expensive. Instead, we simply (i) drop > from the erased specialization and (ii) retain . > Of course this means that must also contain > initialization code for species static members. > > Bonus point: Generic methods > =================== > > As pointed out by Brian, if we have species static classes we can > translate static and species static specializable generic methods > quite effectively. Consider this example: > > class TestGenMethods { > static void m(X x) { ... } > > void test() { > m(42); > } > } > > without species static, this would translate to: > > class TestGenMethods { > static class TestGenMethods$m { > void m(X z) { ... } > } > > /* bridge */ void m(Object o) { new TestGenMethods$m().m(o); } > > void test() { > new TestGenMethod$m().m(42); // this is really done > inside the BSM > } > } > > Note how the bridge (called by legacy code) will need to spin a > new instance of the synthetic class and then call a method on it. > The bootstrap used to dispatch static generic specializable calls > also needs to do a very similar operation. But what if we turned > the translated generic method into a species static method? > > class TestGenMethods { > class TestGenMethods$m { > __species void m(X z) { ... 
} > } > > /* bridge */ void m(Object o) { TestGenMethods$m.m(o); } > > void test() { > TestGenMethod$m.m(42); // this is really done inside > the BSM > } > } > > With species static, we can now access the method w/o needing any > extra instance. This leads to simplification in both the bridging > strategy and the bootstrap implementation. We can apply a similar > simplification for dispatch of specializable species static calls > - the only difference is that the synthetic holder class has also > to be marked as species static (since it could access type-vars > from the enclosing context). > > Bonus point: Access bridges > ================= > > Access bridges are a constant pain in the current translation > strategy; such bridges are generated by the compiler to grant > access to otherwise inaccessible members. Example: > > class Outer { > private void m() { } > > class Inner { > void test() { > m(); > } > } > } > > This code will be translated as follows: > > class Outer { > > /* synthetic */ static access$m(Outer o) { o.m(); } > > private void m() { } > > class Inner { > /*synthetic*/ Outer this$0; > > void test() { > access$m(this$0); > } > } > } > > That is, access to private members is translated with an access to > an accessor bridge, which then performs access from the right > location. Note that the accessor bridge is static (because > otherwise it would be possible to maliciously override it to grant > access to otherwise inaccessible members); since it's static, > usual rules apply, so it cannot refer to type-variables, it cannot > be specialized, etc. This means that there are cases with > specialization where existing access bridge are not enough to > guarantee access - if the access happens to cross specialization > boundaries (i.e. accessing m() from an Outer.Inner). 
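
For readers who have not looked at javac output, the current (pre-species) access bridge described above can be written out by hand as ordinary Java. This is a sketch under stated assumptions: the field `secret`, the `int` return types and the hand-written `access$m` are illustrative only, and since JDK 11 nest-based access control means javac no longer needs to emit such a bridge for this case.

```java
class Outer {
    private int secret = 42; // hypothetical state, only there to make the call observable

    private int m() {
        return secret;
    }

    // Hand-written stand-in for the synthetic accessor. It is static, so it
    // cannot be overridden, and it receives the receiver as an explicit argument.
    static int access$m(Outer o) {
        return o.m();
    }

    class Inner {
        int test() {
            // What the translated body of Inner.test() amounts to:
            // call the bridge, passing the enclosing instance (this$0).
            return access$m(Outer.this);
        }
    }
}
```

Because the bridge is static, it has no view of the type variables of `Outer`, which is the limitation the message goes on to address.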
>
> Again, species static comes to the rescue:
>
> class Outer {
>
>     /* synthetic */ __species access$m(Outer o) { o.m(); }
>
>     private void m() { }
>
>     class Inner {
>         /*synthetic*/ Outer this$0;
>
>         void test() {
>             Outer.access$m(this$0);
>         }
>     }
> }
>
> Since the accessor bridge is now species static, it can
> now mention type variables (such as X); it also means that
> when the bridge is accessed (from Inner), the qualifier type
> (Outer) is guaranteed to remain sharp from the source code to
> the bytecode - which means that when this code gets
> specialized, all references to X will be dealt with accordingly
> (and the right accessor bridge will be accessed).
>
> Parting thoughts
> ==========
>
> On many levels, species statics seem to be the missing ingredient
> for implementing many of the tricks of our translation strategy,
> as well as for making it easier to express common idioms (e.g.
> type-dependent caches) in user code.
>
> Adding support for species static has proven to be harder than
> originally thought. This is mainly because the current world is
> split in two static levels: static and instance. When something is
> not static it's implicitly assumed to be instance, and vice versa.
> If we add a third static level to the picture, a lot of the
> existing code just doesn't work anymore, or has to be reviewed to
> check whether 'static' means 'legacy static' or 'species
> static' (or both).
>
> I started the implementation by treating static, species static
> and instance as completely separate static levels - with different
> internal flags, etc. - but I soon realized that, while clean, this
> approach was invalidating too much of the existing implementation.
> More specifically, all the code snippets checking for static would
> now have to be updated to check for static OR species static
> (overriding vs. hiding, access to 'this', access to 'super',
> generic bridges, ...).
On the other hand, the places where the > semantics of species static vs. static was different were quite > limited: > > * membership/type substitution: a species static behaves like an > instance member; the type variables of the owner are replaced into > the member signature. > * resolution: we need to implement the correct access rules as > shown in the tables above. > * code generation: an invokestatic involving a species static gets > a sharp qualifier type > > This quickly led to the realization that it was instead easier to > just treat 'species static' as a special case of 'static' - and > then to add finer grained logic whenever we really needed the > distinction. This led to a considerably easier patch, and I think > that a similar consideration will hold for the JLS. > > [1] - > http://hg.openjdk.java.net/valhalla/valhalla/langtools/rev/6949c3d06e8f > [2] - > http://hg.openjdk.java.net/valhalla/valhalla/jdk/rev/836efde938c1 > [3] - > http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2016-February/000096.html > [4] - > http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2016-May/000147.html > > Maurizio > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bjornvar at ca.ibm.com Wed Jun 1 20:59:22 2016 From: bjornvar at ca.ibm.com (Bjorn B Vardal) Date: Wed, 1 Jun 2016 20:59:22 +0000 Subject: Compatibility goals In-Reply-To: <07de2a60-6b4e-b597-1fca-2f3af30fc7f0@oracle.com> References: <07de2a60-6b4e-b597-1fca-2f3af30fc7f0@oracle.com> Message-ID: <20160601205928.3756D124044@b01ledav002.gho.pok.ibm.com> An HTML attachment was scrubbed... 

From brian.goetz at oracle.com Wed Jun 1 21:44:57 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 1 Jun 2016 17:44:57 -0400
Subject: Compatibility goals
In-Reply-To: <20160601205928.3756D124044@b01ledav002.gho.pok.ibm.com>
References: <07de2a60-6b4e-b597-1fca-2f3af30fc7f0@oracle.com> <20160601205928.3756D124044@b01ledav002.gho.pok.ibm.com>
Message-ID: <90608fa1-89e6-d907-9c97-83b3a070e394@oracle.com>

> > Alpha-renaming a type variable (to a non-shadowed name) should be
> > binary and source compatible.
> The name is only used internally in the generic class in the
> GenericClass attribute, and recompiling with different names will
> therefore not affect users of the generic class.

Right. (Like method parameter names, the name is not part of the API, and exists only to improve the readability of the implementation code.)

> > Reordering or removing type variables is not compatible. (These
> > first two together match the story for method argument lists; you can
> > rename method arguments, but not reorder or remove them.)
> Other classes will refer to the generic class using ParamTypes in
> their CPs. ParamType provides the type parameters in the order that
> the generic type specified at compilation time. Reordering and
> recompiling will therefore invalidate all ParamTypes referring to the
> modified generic type.

Right. Once you've published Foo, clients or subclasses may have Foo in their source files and ParamType[Foo, A, B] in their binaries, which they expect to retain their meaning. Dropping or reordering parameters would render these clients / subclasses broken.

> > Anyfying an existing erased type variable should be binary and
> > source compatible.
> All ParamTypes referring to a ref-generic type variable will be
> providing a reference type (erased) as the type parameter (or no
> parameters?). As references are a subset of any, anyfying the type
> variable does not invalidate existing ParamTypes.
> I have one question here: What happens if I refer to Foo (not any
> T) using ParamType[Foo, String]? Is it valid because String is a
> reference type, or invalid because Foo is not specializable?

There are two migration situations here:

 - Migrating a totally erased generic class to any-generic (Foo to Foo)
 - Migrating a partially anyfied class (Foo to Foo)

For the former, there will be no ParamType entries; all references to Foo will be LFoo; / Constant_Class[Foo]. For the latter, there will be ParamType entries that specify 'erased' in the appropriate position. In either case, these remain valid parameterizations after the migration.

To your question: I would say this is invalid, because Foo is not specializable / lacks a GenericClass attribute.

> > Adding a new type variable at the end of the argument list should be
> > binary compatible (though not source compatible.) Adding a new type
> > variable other than at the end is not compatible.
> The last point already said that we have to support missing type
> parameters, and this point is really just an extension of that. If a
> type parameter is not provided, the type variable is assumed to be erased.

Right. Also, this one interacts with the story for inner classes, and influences the decision about how to represent enclosing class type parameters in ParamType (do we have a chain of ParamType, as proposed by the M3 doc, or do we lift all type parameters to the innermost class?) The chain approach seems to reduce the impact of generifying an enclosing class (as per the next item.)

> > Generifying an enclosing scope (evolving |Outer.Inner| to
> > |Outer.Inner|) should be binary compatible.
> At first glance, this might look like anyfying an existing erased type
> or generifying a non-generic class. However, the complicating factor
> is that the added type variable will also be added to the scope of the
> enclosed class, and the question becomes whether we can handle this.
> An enclosed class must be compiled with its enclosing class, so the
> GenericClass attribute will be updated correctly. The type parameters
> to Inner and Outer are provided separately, and any missing type
> parameter will still be treated as erased.

Right. Also, with the chain-of-enclosing-descriptors approach, it is fairly easy for a ParamType[parent=Outer, Inner, U] to recover from new parameters being added to Outer, whereas if we simply lifted the Outer parameters onto Inner, we'd have a difficult time reconstructing the actual parameterization.

> > Changing type variable bounds is not binary compatible.
> Type variables are erased to their bound, i.e. not necessarily
> j.l.Object. Any descriptor that contained a type variable will
> therefore contain the bound after compilation. Changing the bound
> invalidates the descriptors in existing method refs, and is therefore
> binary incompatible. Also, this is not a new constraint, as it already
> applies to erased generics.

Right.

From bjornvar at ca.ibm.com Thu Jun 2 20:48:57 2016
From: bjornvar at ca.ibm.com (Bjorn B Vardal)
Date: Thu, 2 Jun 2016 20:48:57 +0000
Subject: Species-static members vs singletons
In-Reply-To: <4219406f-d842-97f4-7206-3a91ffe1e75c@oracle.com>
References: <4219406f-d842-97f4-7206-3a91ffe1e75c@oracle.com>
Message-ID: <20160602204902.88FD2AE051@b01ledav005.gho.pok.ibm.com>

An HTML attachment was scrubbed...

From brian.goetz at oracle.com Thu Jun 2 22:24:21 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 2 Jun 2016 18:24:21 -0400
Subject: Species-static members vs singletons
In-Reply-To: <20160602204903.738387805C@b03ledav004.gho.boulder.ibm.com>
References: <4219406f-d842-97f4-7206-3a91ffe1e75c@oracle.com> <20160602204903.738387805C@b03ledav004.gho.boulder.ibm.com>
Message-ID: <98756340-778b-46c6-20c0-e87383b5c109@oracle.com>

> Brian, I see that you mentioned that generic instance methods are
> messier.
>
>     The translation for generic instance methods is still somewhat
>     messier (will post separately), but still less messy than if we also
>     had to manage / cache a receiver.
>
> Is this issue part of that mess? Do you have a solution, or is this an
> open issue? I tried making m species statics with a receiver argument,
> but that makes the invocation non-virtual.

Here are some more notes on how we might translate generic methods using species-static methods and nested classes.

General strategy

  * A {static,species,instance} generic method m() in |Foo| is desugared into a species method |m()| in a {static,species,instance} nested class |Foo$m|.
  * The accessibility of the method |m()| is lifted onto the class |Foo$m|.
  * Foo also gets an erased bridge that redirects to the erased invocation of the generic method (binary compatibility only.)
  * A generic method is invoked with indy. The indy call site statically captures the type parameters for the invocation, some representation of the owning class Foo, and the method m(). The dynamic argument list captures the Foo-valued receiver (for instance methods) and the arguments to the generic method.

Goals

  * Dispatch should be fast :)
  * It would be nice if the name-mangling strategy (|Foo$m|) were private to |Foo|, so that it does not appear in the bytecode of Foo's clients or subclasses.
  * It would be ideal if we could express what we want with bytecode alone, not indy, but that does not seem possible in all cases at this time.

Static methods

Given source code:

    class Foo {
        ACC static void m(U u) { }
    }

We translate this as:

    @Generic class Foo {
        // for binary compatibility only
        @Bridge ACC static void m(Object u) { Foo.Foo$m.m(u); }

        @Generic ACC static class Foo$m {
            ACC species void m(U u) { }
        }
    }

An invocation |Foo.m(u)| (where U is known statically) can be translated in bytecode as

    invokespecies Foo.Foo$m.m(U)

However, since it was a goal not to let the mangled name |Foo$m| leak into client code, we can easily wrap this with an indy:

    invokedynamic GenericStaticMethod[Foo, "m", descriptor, U](u)
                  ^bootstrap          ^static args            ^dyn args

and let the bootstrap put together the class name |Foo$m| from its constituent parts (the bootstrap and the compiler share a conspiratorial connection, but the client bytecode doesn't participate). Since everything is static at the call site, we can link to a |ConstantCallSite| that always dispatches to an |MH[invokespecies Foo$m.m(U)]|.

Species methods

Species-static generic methods are translated almost the same way; given class

    class Foo {
        ACC species void m(T t, U u) { }
    }

we translate as

    @Generic class Foo {
        // for binary compatibility only
        @Bridge ACC species void m(Object t, Object u) { Foo.Foo$m.m(t, u); }

        @Generic ACC species class Foo$m {
            ACC species void m(T t, U u) { }
        }
    }

and an invocation |Foo.m(T,U)| is translated as

    invokespecies Foo.Foo$m.m(T, U)

Instance methods

Our translation strategy (desugaring to a helper class) introduces some challenges in instance method dispatch.

    class Foo {
        ACC void m(T t, U u) { }
    }

    class Bar extends Foo {
        @Override ACC void m(T t, U u) { }
    }

I am proposing we translate this as:

    @Generic class Foo {
        // for binary compatibility only
        @Bridge ACC void m(Object t, Object u) { this.Foo$m.m(t, u); }

        @Generic ACC species class Foo$m {
            ACC species void m(Foo outer, T t, U u) { }
        }
    }

    @Generic class Bar {
        // for binary compatibility only
        @Bridge ACC void m(Object t, Object u) { this.Bar$m.m(t, u); }

        @Generic ACC species class Bar$m {
            ACC species void m(Bar outer, T t, U u) { }
        }
    }

In this proposal, |Bar$m| does not extend |Foo$m|; this is to avoid leaking dependence on desugaring in subclass bytecode. The implementation methods in |Xxx$m| take an extra "outer" argument, which is the receiver for the instance generic method invocation. The use of species-static methods for the implementation methods means that we need not maintain instances of Foo$m, but instead can pass the actual |Foo| receiver directly to the implementation.

For an invocation:

    Foo f = ...
    f.m(t,u)

We translate with indy as:

    invokedynamic InstanceStaticMethod[ParamType[Foo,T], "m", descriptor, U](r,t,u)

The static receiver type (here |Foo|) is a static parameter to the bootstrap. Let's call this class SR (the actual target will be |Xxx$m| where |Xxx| may be SR or a subclass of SR.) The dynamic receiver (a |Foo|) is passed in the dynamic argument list as |r|.

The linking of the callsite is somewhat complex, but should optimize reasonably well. It proceeds as follows:

  * For each (static receiver class, method, specialization args), all static properties of the callsite,
there is a dispatch table, which is found statically at linkage time and stored as part of the callsite state;
* The dispatch table is a |ClassValue|, which maps the dynamic receiver type (a subtype of SR) to a |MethodHandle| for the |invokespecies| implementation method;
* Invocation performs |ClassValue.get(receiver.getClass()).invokeExact(receiver, args)|
* This dispatch can be optionally wrapped with a PIC against |receiver.getClass()|

The first thing the bootstrap must do (at linkage time) is compute SR. This is done by taking the owner class |Foo|, along with the method name and descriptor, and computing the name-mangled class |Foo.Foo$m|; call it SR'. We then take this class and compute |SR=Class.forSpecialization(SR', U)|. (Both of these computations are done with the classloader for |Foo|, and we should check that both SR' and SR share the classloader with |Foo|.) SR corresponds to the fully specialized class |Foo.Foo$m|.

Once we've computed SR, we have to find the dispatch table for SR, which is a multi-step lookup, first by |ClassLoader| (to allow for classloader unloading) and then by SR, which results in a |DispatchTable| mapping a dynamic receiver type |R| to a fully specialized desugared species-static MH.

|class DispatchTable extends ClassValue { ... }
class MetaDispatchTable extends ClassValue { ... }
private static final WeakHashMap mdt = ... |

We compute |mdt.computeIfAbsent(SR.getClassLoader(), ...).get(SR)| and store that as |DT| in the |CallSite|. The |ClassValue| implementation for |MetaDispatchTable| simply creates a new |ClassValue| entry. The callsite target is linked to the following logic:

|Class R = r.getClass();
DT.get(R).invokeExact(r, args) |

The |computeValue()| method of |DispatchTable| does the meat of the work.
For the receiver type R, we have to find the corresponding |Xxx$m| class, which might be declared in a superclass of R, specialize it to the desired method type parameters, and look up (findSpecies) the appropriate specialized MH. (Finding the corresponding |Xxx$m| class could be done by walking the hierarchy directly, or by doing a |MethodHandle.resolveOrFail| on the erased bridge.)

The cost of an invocation is one |ClassValue| lookup, plus the overhead of folding the arguments together appropriately and doing a MH invoke. The above logic seems representable as a single method handle expression using |fold| and |filter| combinators, but if not, it might also introduce some varargs spreading/collecting overhead. (It could be further optimized by wrapping the result of DT.get(R) with a PIC on R.)

This doesn't seem so bad. The first time a given target is resolved (a given combination of enclosing |Foo|, |m(...)|, type arguments U, and receiver class), a relatively expensive linkage step is performed, but the result is then cached in a table specific to the members involved, not the call site, so this should stabilize quickly.

Interface methods

Interface methods add an additional layer, but do not change the story fundamentally. If I have:

|interface I {
    ACC void m(T t, U u) { }
}

class Foo implements I {
    @Override ACC void m(T t, U u) { }
} |

then we need to generate an |I$m| artifact as we do with classes. When linking an invocation whose static receiver is a specialization of an interface rather than a class, we compute |I$m| as our SR, and proceed as before. (In this case, our linkage strategy should probably use |resolveOrFail| rather than manual hierarchy walking, so we should probably do this in both cases.) Additionally, we may need to check that the |Xxx| class corresponding to the resolved |Xxx$m| holder class actually implements |I|; again, this can be done at linkage time, not dispatch time.
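
As a rough illustration of the dispatch-table lookup described under "Instance methods", here is a minimal sketch in today's Java. It is not the prototype's code: |invokespecies|, |Xxx$m| holder classes and specialization do not exist in a shipping JDK, so a plain static method named |impl| (a hypothetical name) stands in for |Xxx$m.m(outer, t, u)|, and the holder is found by walking the superclass chain, which is only one of the two lookup options mentioned above. The names |DispatchSketch| and |invoke| are likewise made up for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class DispatchSketch {
    // Hypothetical stand-ins: each static impl(...) plays the role of Xxx$m.m(outer, t, u).
    static class Foo {
        static String impl(Foo outer, Object t, Object u) { return "Foo.m(" + t + "," + u + ")"; }
    }
    static class Bar extends Foo {
        static String impl(Bar outer, Object t, Object u) { return "Bar.m(" + t + "," + u + ")"; }
    }
    static class Baz extends Bar { } // declares nothing; should dispatch to Bar's implementation

    private static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();

    // The call site sees the static receiver type (Foo) plus the dynamic arguments.
    private static final MethodType CALL_SITE_TYPE =
            MethodType.methodType(String.class, Foo.class, Object.class, Object.class);

    // Maps a dynamic receiver class to the method handle of its implementation method.
    static final class DispatchTable extends ClassValue<MethodHandle> {
        @Override
        protected MethodHandle computeValue(Class<?> receiverType) {
            // Walk up from the receiver class until a class declaring impl(c, Object, Object) is found.
            for (Class<?> c = receiverType; c != null; c = c.getSuperclass()) {
                try {
                    MethodHandle mh =
                            LOOKUP.findStatic(c, "impl", CALL_SITE_TYPE.changeParameterType(0, c));
                    // Adapt (Xxx, Object, Object) to (Foo, Object, Object) so invokeExact matches.
                    return mh.asType(CALL_SITE_TYPE);
                } catch (NoSuchMethodException e) {
                    // Not declared for this class; try the superclass.
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
            throw new IllegalStateException("no implementation found for " + receiverType);
        }
    }

    // Equivalent of the linked call-site logic: DT.get(r.getClass()).invokeExact(r, args).
    static String invoke(DispatchTable dt, Foo r, Object t, Object u) {
        try {
            return (String) dt.get(r.getClass()).invokeExact(r, t, u);
        } catch (Throwable e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        DispatchTable dt = new DispatchTable();
        System.out.println(invoke(dt, new Foo(), "t", "u"));
        System.out.println(invoke(dt, new Baz(), 1, 2));
    }
}
```

The expensive part (the hierarchy walk and |findStatic|) runs once per receiver class inside |computeValue()|; later calls for the same class only pay for the |ClassValue| lookup and the |invokeExact|.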
Default methods

If we do our dispatch using |resolveOrFail| against the erased bridges, and the method is not implemented in the receiver's superclass hierarchy, then I believe that resolution will hand us the MH for the default? If so, we're good; we resolve to the default, just like any other implementation.

One integrity risk here is whether the |Xxx$m| hierarchy is properly aligned to the |Foo| hierarchy. I think we can validate that (at the time we lazily populate the dispatch table) simply by checking that the resolved erased target |Xxx.m()| and the corresponding specialized class |Xxx$m.m()| share a nest (and hence derive from the same source class.)

From maurizio.cimadamore at oracle.com Wed Jun 15 17:39:25 2016
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 15 Jun 2016 18:39:25 +0100
Subject: Valhalla reflection API - first stab
Message-ID: <576192CD.9050707@oracle.com>

Hi,
I've just pushed a new Valhalla-centric reflection API:

http://mail.openjdk.java.net/pipermail/valhalla-dev/2016-June/001968.html

I'm working on a more complete document which will provide the background for the design decisions we made. I hope to make it available within the next few days.

Cheers
Maurizio

From maurizio.cimadamore at oracle.com Wed Jun 22 16:05:30 2016
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 22 Jun 2016 17:05:30 +0100
Subject: Valhalla reflection API - first stab
In-Reply-To: <576192CD.9050707@oracle.com>
References: <576192CD.9050707@oracle.com>
Message-ID: <576AB74A.7060905@oracle.com>

Hi,
as promised, here's a detailed writeup of the exploration we did in the reflection space. I hope this will be useful as a context for further discussions.

Cheers
Maurizio

On 15/06/16 18:39, Maurizio Cimadamore wrote:
> Hi,
> I've just pushed a new Valhalla-centric reflection API:
>
> http://mail.openjdk.java.net/pipermail/valhalla-dev/2016-June/001968.html
>
> I'm working on a more complete document which will provide the
> background for the design decisions we made. I hope to make it
> available within the next few days.
>
> Cheers
> Maurizio
-------------- next part --------------
An HTML attachment was scrubbed...

From maurizio.cimadamore at oracle.com Wed Jun 22 16:26:41 2016
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 22 Jun 2016 17:26:41 +0100
Subject: Valhalla reflection API - first stab
In-Reply-To: <576AB74A.7060905@oracle.com>
References: <576192CD.9050707@oracle.com> <576AB74A.7060905@oracle.com>
Message-ID: <576ABC41.60405@oracle.com>

There seem to be problems with the attachment - uploaded here:

http://cr.openjdk.java.net/~mcimadamore/reflection-manifesto.html

Maurizio

On 22/06/16 17:05, Maurizio Cimadamore wrote:
> Hi,
> as promised, here's a detailed writeup of the exploration we did in
> the reflection space. I hope this will be useful as a context for
> further discussions.
>
> Cheers
> Maurizio
>
> On 15/06/16 18:39, Maurizio Cimadamore wrote:
>> Hi,
>> I've just pushed a new Valhalla-centric reflection API:
>>
>> http://mail.openjdk.java.net/pipermail/valhalla-dev/2016-June/001968.html
>>
>> I'm working on a more complete document which will provide the
>> background for the design decisions we made. I hope to make it
>> available within the next few days.
>>
>> Cheers
>> Maurizio

From brian.goetz at oracle.com Tue Jun 28 17:43:03 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 28 Jun 2016 13:43:03 -0400
Subject: In-person meeting
Message-ID: <795faa88-5ec4-6100-e117-e22437dfad14@oracle.com>

As we have done in previous years, I would like to hold an in-person EG meeting in Santa Clara the day after JVM Language Summit (Thursday).
Please contact me offline to reserve a seat.