From andrey.breslav at jetbrains.com Mon Feb 1 05:33:20 2016
From: andrey.breslav at jetbrains.com (Andrey Breslav)
Date: Mon, 01 Feb 2016 05:33:20 +0000
Subject: Model 3 classfile design document
In-Reply-To: <56A25E41.6040508@oracle.com>
References: <56A25E41.6040508@oracle.com>
Message-ID:

A question about these examples:

- R(Foo) = Class["Foo"] or ParameterizedType['L', "Foo", "_"]
- R(Foo) = Class["Foo"] or ParameterizedType['L', "Foo", "_"]
- R(Foo) = ParameterizedType['L', "Foo", ArrayType[1, "I"]]

Apparently, we want to preserve the information about int[], while we don't care about String. Why? Isn't int[] just a class, like String?

On Fri, Jan 22, 2016 at 7:53 PM Brian Goetz wrote:

> Please find a document here:
>
> http://cr.openjdk.java.net/~briangoetz/valhalla/eg-attachments/model3-01.html
>
> that describes our current thinking for evolving the classfile format to
> clearly and efficiently represent parametric polymorphism. The early
> concepts of this approach were outlined in my talk at JVMLS last year;
> this represents a refinement of those ideas, and a reasonable "stake in
> the ground" description of what seems the most sensible way to balance
> preserving parametric information in the classfile without imposing
> excessive runtime costs for loading specializations.
>
> We're working on an updated compiler prototype which people will be able
> to play with soon (along with a formal model.)
>
> Please ask questions!
>
> Some things this document does not address yet:
> - How we deal with types implicit in the bytecodes (aload vs iload)
>   and how they get specialized;
> - How we represent restricted methods in the classfile;
> - How we represent the wildcard type Foo

--
Andrey Breslav
Project Lead of Kotlin
JetBrains
http://kotlinlang.org/
The Drive to Develop

From andrey.breslav at jetbrains.com Mon Feb 1 05:44:59 2016
From: andrey.breslav at jetbrains.com (Andrey Breslav)
Date: Mon, 01 Feb 2016 05:44:59 +0000
Subject: Model 3 classfile design document
In-Reply-To:
References: <56A25E41.6040508@oracle.com>
Message-ID:

Another question on the "Generic Methods" section: does the proposed encoding mean a new class file per generic method? Or is it purely about "fake" class-like entries in the CP of the enclosing class?

On Mon, Feb 1, 2016 at 8:33 AM Andrey Breslav wrote:

> A question about these examples:
>
> - R(Foo) = Class["Foo"] or ParameterizedType['L', "Foo", "_"]
> - R(Foo) = Class["Foo"] or ParameterizedType['L', "Foo", "_"]
> - R(Foo) = ParameterizedType['L', "Foo", ArrayType[1, "I"]]
>
> Apparently, we want to preserve the information about int[], while we
> don't care about String. Why? Isn't int[] just a class, like String?

From andrey.breslav at jetbrains.com Mon Feb 1 06:29:34 2016
From: andrey.breslav at jetbrains.com (Andrey Breslav)
Date: Mon, 01 Feb 2016 06:29:34 +0000
Subject: Nestmates
In-Reply-To: <569FE666.9070103@oracle.com>
References: <569FE666.9070103@oracle.com>
Message-ID:

First, I think it might make sense to share the use cases we have for nestmates in Kotlin (those that will work out):

- same as Java: nested/inner classes
- multi-file classes (this is how we emulate free functions that are logically direct members of a package in Kotlin)

Now, there's one issue that does not seem entirely clear to me: does this proposal imply making nested classes truly private? It does not mention allowing ACC_PRIVATE on classes, so I'm not sure whether this was intended.

In any case it would make sense, I think. I haven't given it much thought yet, but we could probably legalize the ACC_PRIVATE flag on classes that have a NestChild entry, and check that they are only accessed from their nestmates, right?

(This would only work with Java's every-which-way treatment of access between the nested classes: in Kotlin, for example, nested classes cannot access private members of their enclosing class, but such extra restrictions don't seem to be a security concern, because all these classes are in the same compilation unit anyway.)

From andrey.breslav at jetbrains.com Mon Feb 1 06:35:58 2016
From: andrey.breslav at jetbrains.com (Andrey Breslav)
Date: Mon, 01 Feb 2016 06:35:58 +0000
Subject: Nestmates
In-Reply-To:
References: <569FE666.9070103@oracle.com>
Message-ID:

I think this thread shows that JVM languages tend to struggle with visibilities one way or another. Even pre-Valhalla Java does, with nested classes.

I understand that stretching the nestmate concept beyond what it's meant for is not likely to happen. Still, for the record, and so that we are aware of the possible benefits if we reconsider some aspects of it at some point, I'll briefly describe another use case that we have in Kotlin.

Kotlin has the notion of "internal" visibility (visible inside the module, which is a group of source files + declared dependencies), the granular, per-member form of which is apparently gone from Jigsaw. What could help us there would be, for example:

- having a class belong to multiple nests (still handshake-based, each nest identified by a canonical member);
- opening a particular member only to some of the nests (one would suffice for our use case, but that would mean creating synthetic classes just to designate a new nest).

As I gather from the previous messages in this thread, this looks too complex at the moment, so just saying :)

On Fri, Jan 22, 2016 at 2:11 AM Vlad Ureche wrote:

> Thank you for sharing these insights Brian!
>
> I think I understand the problem and the solution, but let me ask three
> questions to make sure I understood well:
>
> 1) The NestTop attribute must contain the child classes (except
> specializations and lambdas, which are added dynamically), right? Is this
> for security, so another class could not pose as a NestChild to access
> private data? What about allowing the NestTop attribute to say "anyone who
> wants to nest here is welcome to do so"?
> 2) Why did you choose to have symmetry and transitivity? I understand that
> having an equivalence relation allows partitioning, but it's not clear to
> me why partitioning is important in this case.
> 3) Why is the NestChild limited to a single top class?
>
> These questions stem from pondering whether we can use the nestmates
> mechanism to implement Scala's enclosing-entity-private access specifiers
> (e.g. a variable in class List can be private[scala.collection.List] or
> private[scala.collection] or private[scala])... Still, I don't think this
> can be done at the granularity required by Scala, so we'll continue to have
> name-mangled accessors where necessary :(
>
> Thanks,
> Vlad

From brian.goetz at oracle.com Mon Feb 1 07:52:23 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 1 Feb 2016 07:52:23 +0000
Subject: Nestmates
In-Reply-To:
References: <569FE666.9070103@oracle.com>
Message-ID: <11943D06-88BF-45BE-B5A1-1DE201331E29@oracle.com>

> I think this thread shows that JVM languages tend to struggle with visibilities one way or another. Even pre-Valhalla Java does, with nested classes.

Indeed, this has been an irritant for a long time. Not enough of an irritant to do anything about until now, but now that we are pushing much harder on the boundaries of "class", it becomes more urgent.

> First, I think it might make sense to share the use cases we have for nestmates in Kotlin (those that will work out):
> - same as Java: nested/inner classes
> - multi-file classes (this is how we emulate free functions that are logically direct members of a package in Kotlin)

Nestmates should handle these cases. Essentially, we are redefining the logical "class" boundary to span multiple physical classes.

> Now, there's one issue that does not seem entirely clear to me: does this proposal imply making nested classes truly private? It does not mention allowing ACC_PRIVATE on classes, so I'm not sure whether this was intended.
> In any case it would make sense, I think. I haven't given it much thought yet, but we could probably legalize the ACC_PRIVATE flag on classes that have a NestChild entry, and check that they are only accessed from their nestmates, right?

We haven't thought about this too deeply, but it does seem within the spirit of the proposal.

> (This would only work with Java's every-which-way treatment of access between the nested classes: in Kotlin, for example, nested classes cannot access private members of their enclosing class, but such extra restrictions don't seem to be a security concern, because all these classes are in the same compilation unit anyway.)

Right. For example, the rules in Java for protected are somewhat more restricted than for private, and the proposed VM rules are more permissive, but the language compiler can always enforce stricter rules.

From brian.goetz at oracle.com Mon Feb 1 21:18:04 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 1 Feb 2016 21:18:04 +0000
Subject: Model 3 classfile design document
In-Reply-To:
References: <56A25E41.6040508@oracle.com>
Message-ID: <0ECABE51-A32F-474D-BF59-A9B284E550AB@oracle.com>

In the current translation proposal, it is a real class. That said, the generic method translation is the weakest part of the current proposal, in that it is more similar to Model 1 than Model 3. In particular, the dispatch characteristics for instance generic methods are not currently very attractive, but we are actively looking for something better. In the meantime, we have something that works acceptably for prototyping.

On Feb 1, 2016, at 5:44 AM, Andrey Breslav wrote:

> Another question on the "Generic Methods" section: does the proposed encoding mean a new class file per generic method? Or is it purely about "fake" class-like entries in the CP of the enclosing class?

From brian.goetz at oracle.com Tue Feb 2 13:04:18 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 2 Feb 2016 13:04:18 +0000
Subject: Model 3 classfile design document
In-Reply-To:
References: <56A25E41.6040508@oracle.com>
Message-ID: <7DBEECB8-295D-4D0F-82E2-747F557869EA@oracle.com>

This is not a small question! (Actually, depending on the interpretation of ParamType[List, String], it's one of two questions; I'll answer them both.)

What does ParameterizedType[LFoo, String] mean? Could be one of three things.

1. Specialize Foo with T=String; this produces a fully reified Foo.
2. Recognize that String is a ref type, and produce an erased Foo.
3. Recognize that String is a ref type, and produce an erased Foo, but with metadata that allows the types to be recovered through reflection.

If your interpretation is #1 (which is what our interpretation is), then your question is: Why not "just" do reified generics? Alternately, your question might be: why not do #2 or #3, and retain the type information for longer, to expand the range of implementation choices. I'll answer this one first.

We'd like to minimize the intrusion of Java's generic type system on the JVM. Rules like "these types are erased, but these types are reified" are choices that should be left to the language compiler. Just because Java decides to erase doesn't mean Kotlin should be required to; you should have the choice. And this simplifies the VM implementation too: the language compiler asks for erasure or reification, and the VM responds accordingly. (I don't think this is your question, and I suspect you agree with all this.)

Another thing you could be asking is: why does the VM need to know about "erased" at all? And the reason here is fairly simple (if unfortunate); erasure is noncompositional enough that the compiler cannot simply erase early and ask the VM to propagate and substitute thereafter; doing so would lead to incompatible translations, and we take it as a requirement that we be compatible with existing uses of reference generics. It took us a long time to come to such a simple model for how to capture erasure!

Which brings us to the question that I think is your real question: given that we now *can* reify generics over reference types, why wouldn't we always do so? There are many reasons, including compatibility, expressibility, and footprint.

Compatibility. If we "just" reified List, then existing code would be neither source- nor binary-compatible. (When .NET switched to reified generics, you had to switch all of your libraries from the old libraries to the new reified libraries.)
That's a non-starter for us; existing uses of generic classes (both clients and subclasses) should be source and binary compatible after the classes are anyfied. (Additionally, plenty of code has assumptions about the result of reflective operations like .getClass() on generics, that could break if we reified all reference parameterizations.) That means that reference instantiations need to continue to be erased.

Some may have a hard time with this conclusion. If you dig at this unease, I think the most likely explanation is the assumption that "well, reified generics are just better!" But this isn't true: both erasure and reification have pros and cons. Erasure was not a "mistake" to be fixed by reification; it is a compromise, and I think a highly pragmatic one. (Some may ask: could we make reification an option, say at use site (e.g., "new List")? We could, but I suspect that having a mix of reified List and erased List coexisting in the same heap would be an endless source of bugs and corner cases.)

Expressibility. Our preference for erasure is not simply based on compatibility. Real-world generic code is full of "dirty tricks" that involve casting through raw; sometimes this is just sloppiness or lack of expertise with generics, but sometimes this is the only practical way to achieve the desired result without incurring massive copying costs. Truly reifying generics would mean that all this code would break and have to be rewritten.

Footprint. Erasure means that we can share a single class to represent all instantiations of a type; Map, Map, etc. Having separate types for each of these would involve more class loading, more class metadata, etc. Yes, there are techniques for minimizing this (.NET reifies a parameterization token but erases at code-gen time), but there is some cost.

My point is simply, reification is far, far from free, and erasure is not simply a mistake or a hack to be undone at the first opportunity.
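
A minimal sketch of two behaviors the argument above relies on, written against today's erased generics in a standard JDK. The class name ErasureDemo and its helper methods are illustrative assumptions, not anything from the Model 3 document or the prototype compiler. It shows that two reference parameterizations share one runtime class (the .getClass() assumption and the footprint point), and that "casting through raw" reinterprets a list without copying it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: demonstrates current erased behavior, not the Model 3 proposal.
final class ErasureDemo {

    // Under erasure, both parameterizations are backed by the same runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    // "Casting through raw": view a List<Object> as a List<String> with no copy.
    // The compiler accepts this only with an unchecked warning; no runtime check
    // is performed on the list itself because the element type is erased.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<String> viewAsStrings(List<Object> objects) {
        return (List<String>) (List) objects;
    }

    // Reference-identity helper, so the two differently typed views can be compared.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());

        List<Object> objects = new ArrayList<>();
        objects.add("hello");
        List<String> view = viewAsStrings(objects);

        System.out.println(sameObject(objects, view)); // same list, not a copy
        System.out.println(view.get(0));
    }
}
```

If reference parameterizations were fully reified, code shaped like viewAsStrings is the kind that would presumably have to fail or be rewritten, which is the expressibility cost described above.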

So, to answer your direct question: the Java compiler chooses to represent List as ParamType[List, erased], rather than ParamType[List, String], for the above reasons, but Kotlin could make the opposite choice (at some Java interop cost).

On Feb 1, 2016, at 5:33 AM, Andrey Breslav wrote:

> A question about these examples:
> R(Foo) = Class["Foo"] or ParameterizedType['L', "Foo", "_"]
> R(Foo) = Class["Foo"] or ParameterizedType['L', "Foo", "_"]
> R(Foo) = ParameterizedType['L', "Foo", ArrayType[1, "I"]]
> Apparently, we want to preserve the information about int[], while we don't care about String. Why? Isn't int[] just a class, like String?
>
> On Fri, Jan 22, 2016 at 7:53 PM Brian Goetz wrote:
> Please find a document here:
>
> http://cr.openjdk.java.net/~briangoetz/valhalla/eg-attachments/model3-01.html
>
> that describes our current thinking for evolving the classfile format to
> clearly and efficiently represent parametric polymorphism. The early
> concepts of this approach were outlined in my talk at JVMLS last year;
> this represents a refinement of those ideas, and a reasonable "stake in
> the ground" description of what seems the most sensible way to balance
> preserving parametric information in the classfile without imposing
> excessive runtime costs for loading specializations.
>
> We're working on an updated compiler prototype which people will be able
> to play with soon (along with a formal model.)
>
> Please ask questions!
>
> Some things this document does not address yet:
> - How we deal with types implicit in the bytecodes (aload vs iload)
>   and how they get specialized;
> - How we represent restricted methods in the classfile;
> - How we represent the wildcard type Foo

From john.r.rose at oracle.com Tue Feb 2 20:35:40 2016
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 2 Feb 2016 12:35:40 -0800
Subject: Nestmates
In-Reply-To: <11943D06-88BF-45BE-B5A1-1DE201331E29@oracle.com>
References: <569FE666.9070103@oracle.com> <11943D06-88BF-45BE-B5A1-1DE201331E29@oracle.com>
Message-ID:

On Jan 31, 2016, at 11:52 PM, Brian Goetz wrote:

>> First, I think it might make sense to share the use cases we have for nestmates in Kotlin (those that will work out):
>> - same as Java: nested/inner classes
>> - multi-file classes (this is how we emulate free functions that are logically direct members of a package in Kotlin)
>
> Nestmates should handle these cases. Essentially, we are redefining the logical "class" boundary to span multiple physical classes.

The term "nest" strongly suggests inner classes, which is the original, 20-year-old motivating use case for extending private barriers beyond single class-files. But the current proposal is more flexible, as Andrey has noticed. Specifically, it is decoupled from the syntax of Java inner/nested classes. This makes it easier to implement correctly, which is appropriate for a run-time access control mechanism. It is appropriate for the JVM to define simple trust areas (package, nest, module, class-loader) without coupling them too closely to language semantics. (ACC_PROTECTED is an exception. It would seem to be difficult to use for anything except Java and Java-like languages.)

If it helps to think about it, we can shift the metaphor from "nestmate" to "friend". In that case, the "top class" is (let us say) really the "privacy club" that a bunch of "privacy friends" belong to. (Or "relative/family", etc., etc.) We are talking about the JVM connecting the control of privacy with a well-defined partition or equivalence class, as represented (securely) by the proposed bidirectional attribute links. (The set of nest-tops is a section[1].) The advantage of "top class" is that it is clear how we intend for the JLS to map to the JVM. The slight disadvantage is that it might make the unwary fixate on that one use case. But, hey, a little Java-centricity doesn't hurt here.

[1]: https://en.wikipedia.org/wiki/Section_(category_theory)

For a multi-file trust unit you might separately generate the file which defines the "privacy club" (or represents the equivalence class) when you generate the multiple files that comprise the privacy group. You might even re-generate the "privacy club" file if you incrementally compile new files for the same trust unit. (This would work best if the control file contains nothing except the NestTop attribute and its associated constants.)
The JVM doesn't care about compilation policy as long as the details are settled before the first file of the trust unit is loaded.

And of course this mechanism can be useful even for languages which don't support class nesting (or use cases which don't need nesting). I'm thinking of Scala case-classes here, as well as Kotlin sealed classes (if you subtract the nesting constraint), and JVM enforcement of sealed interfaces.

>> Now, there's one issue that does not seem entirely clear to me: does this proposal imply making nested classes truly private? It does not mention allowing ACC_PRIVATE on classes, so I'm not sure whether this was intended.
>> In any case it would make sense, I think. I haven't given it much thought yet, but we could probably legalize the ACC_PRIVATE flag on classes that have a NestChild entry, and check that they are only accessed from their nestmates, right?
>
> We haven't thought about this too deeply, but it does seem within the spirit of the proposal.

My take: If we define a rigorous, reliable nest-mate relation in the JVM, we can also start to mark classes ACC_PRIVATE and enforce the narrower access. I think this is desirable, as a way to harden clusters of inner classes within large packages. Similar (though much less important) point for ACC_PROTECTED.

Synthetic classes (generated as "helpers" for a particular compilation unit) should also be marked private. For example, if the NestTop file is separated (for whatever reason) from the syntactically top-level class of a class nest, it can be marked both synthetic and private.
  class C { class D { } private class E { } }
==>
  classfile C { NestChild(C$NestTop$) }
  classfile D { NestChild(C$NestTop$) }
  classfile E { NestChild(C$NestTop$); access_flags(ACC_PRIVATE) }
  classfile C$NestTop$ { NestTop(C, D, E); access_flags(ACC_PRIVATE+ACC_SYNTHETIC) }

> >> (This would only work with Java's every-which-way treatment of access between the nested classes: in Kotlin, for example, nested classes can not access private members of their enclosing class, but such extra restrictions don't seem to be a security concern, because all these classes are in the same compilation unit anyway.)
>
> Right. For example, the rules in Java for protected are somewhat more restricted than for private, and the proposed VM rules are more permissive, but the language compiler can always enforce stricter rules.

The every-which-way aspect is looser than many language-specific rules, but it is not too loose for a useful JVM-level view of enforceable run-time privacy. It is loose enough to serve as a carrier for a range of language-specific conventions, and tight enough to enforce (at run-time, not compile-time) a useful set of security invariants.

(Personally, from a language design standpoint, I think enforcing privacy boundaries *within* a single compilation unit makes for useless effort and silly speed-bumps for coders. But that's an esthetic thing Java and Kotlin can agree to differ on.)

Should nestmate relations be allowed across package boundaries? And what about modules and class loaders? (Compare C++ friend relations across namespaces.)

I think the answer should be "no" at least at first. (We could relax to some form of "yes" later.) The reason is the JVM's model for access control is intended to be easy to understand and to implement correctly. (Put down that hand, ACC_PROTECTED!)
Part of the simplicity is that a package (and a module) is a unit of encapsulation that can only be broken by sharing secrets, which (when it happens) is pretty obvious, as it requires explicit code to be written. Allowing a semi-invisible nestmate attribute to create a "wormhole" of access between packages or modules would be a cute tool for creating shared-secret design patterns, but a more explicit one would be just as workable, and easier to observe in source code when it happens. (Language compilers can sugar it up however they want.)

For modules, the mechanism of qualified exports gives a large-scale tool to implement this pattern. Capabilities (e.g., lambdas or Lookup or MHs) give more precise tools. There is no need for nestmates to provide yet another tool, at the cost of disturbing the simple nesting relation of nest/private < package/default < module/public.

In a nutshell, the JVM class loader should enforce the structural constraint that NestTop and NestChild attributes should only refer to classes with the same package prefix as the containing class.

Another fine point: Should a single class file be allowed to have both a NestTop and a NestChild attribute? Answer: No. It is no burden on static compilers to "roll up" the whole list from a complicated hierarchy into one place. (What if there are 10,000 types in a nest? In that case we need to get on the ball and expand constant pool sizes. *But* the fancy expanded constant pool would *only* exist in the nest-top file, so a more ad hoc solution could be created just for the NestTop attribute, if we get there. That would be preferable to allowing a degree of freedom (nested nests) that would in practice almost never be used.)

-- John

P.S. Brian, thank you for creating this proposal and circulating it. I've agitated for normalizing nestmate relations in just about every Java release, and am glad to see we have reached the tipping point.
And, a separate top-class attribute is far better than where I thought we'd end up: walking the InnerClasses attribute.

From brian.goetz at oracle.com Wed Feb 10 16:51:21 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 10 Feb 2016 17:51:21 +0100
Subject: Model 3 classfile design document
In-Reply-To: <201602101450.u1AEogAK013503@d01av01.pok.ibm.com>
References: <56A25E41.6040508@oracle.com> <201602101450.u1AEogAK013503@d01av01.pok.ibm.com>
Message-ID:

So, there's two layers to this:
 - What can you express in the bytecode;
 - What will javac emit

We want the VM to be as dumb as possible with respect to parameterization and erasure. Therefore, we don't ask the VM to reason about things like "ref types are erased"; the language compiler asks for either reification or erasure, and the VM happily complies. So ParamType[List, String] is a reified List, and ParamType[List, erased] is an erased List.

Javac will likely never emit reified generics over references, but other languages could.

To your examples:

1. Javac will emit ParamType[Foo, erased]
2. Javac will emit ParamType[Foo, int] (reified)
3. Javac will emit ParamType[Foo, erased] (since int[] is a reference type)
4. Javac will emit ParamType[Foo, erased] (same)

Did I make a mistake in the doc that suggested otherwise for 3/4? Please correct me!

On Feb 10, 2016, at 3:50 PM, Bjorn B Vardal wrote:

> I have a question about reifying array types. This is what I understand is the proposed behaviour:
> Foo - Reference, so erased
> Foo - Primitive, so reified
> Foo - In the Model 3 Classfile Design document, this is reified.
> Foo - Unclear - erased as reference, or reified as array?
> The first two are quite clear, but I'm wondering about 3 and 4. What is the reason for reifying the int[] in the Model 3 document? Considering that both int[] and String are subclasses of Object, can we not erase array types? If we can't erase them, does that apply to reference arrays as well, e.g. String[]?
> --
> Bjørn Vårdal

From brian.goetz at oracle.com Wed Feb 10 20:52:01 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 10 Feb 2016 21:52:01 +0100
Subject: Model 3 classfile design document
In-Reply-To: <201602101905.u1AJ5Eq1012118@d01av01.pok.ibm.com>
References: <56A25E41.6040508@oracle.com> <201602101450.u1AEogAK013503@d01av01.pok.ibm.com> <201602101905.u1AJ5Eq1012118@d01av01.pok.ibm.com>
Message-ID: <11B524CF-1F86-42B0-BC91-78E1A3230E27@oracle.com>

I see, yes, I goofed in those lines. I guess I was eager to explicate ArrayType and didn't type-check the results... R(Foo) should be just ParamType[LFoo, _].

On Feb 10, 2016, at 8:05 PM, Bjorn B Vardal wrote:

> Thanks! That matches the behaviour I would expect.
>
> The piece in the document that led me to the different understanding of point 3/4 was point 4 in the examples on page 4:
> R(Foo) = ParameterizedType['L', "Foo", ArrayType[1, "I"]]
>
> According to point 3 and 4 in your answer, this should be:
> R(Foo) = ParameterizedType['L', "Foo", "_"]
>
> And just to confirm, not the following:
> R(Foo) = ParameterizedType['L', "Foo", ArrayType[1, "_"]]
>
> --
> Bjørn Vårdal

From brian.goetz at oracle.com Fri Feb 12 09:34:22 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 12 Feb 2016 10:34:22 +0100
Subject: Model 3 classfile design document
In-Reply-To: <201602112224.u1BMOcBp005571@d01av01.pok.ibm.com>
References: <56A25E41.6040508@oracle.com> <201602112224.u1BMOcBp005571@d01av01.pok.ibm.com>
Message-ID: <4BF366C7-9E76-4B86-B2BC-B0BC747596B8@oracle.com>

> Will the template class be accessible as a java/lang/Class (or equivalent) in its unspecialized form? If so, will any of the type variable information be made available through this class? We're wondering how much information needs to be stored in our class objects.

For compatibility reasons, Constant_CLASS[Foo] and Constant_ParamType[Foo, erased*] resolve to the same class; legacy classfiles will ask for erased Foo as a Class.

Reflection will have to be able to recover whatever information was present in the parameterization. For reified parameterizations, this should be easy enough; for erased parameterizations, reflection will answer "erased" when asked about "what is E in ArrayList". But, it seems safe to assume that, for a class with a GenericClass attribute for non-reified parameters, the answer will be "erased".

> The document uses the example where an Outer class' type variable is propagated into the Inner class' GenericClass attribute.

Yes. A non-static inner class of a generic class is generic, even if it has no type parameters of its own. (The enclosing type parameters will be referenced at least from the synthetic constructor, which takes an instance of the enclosing class.)
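The point that an inner class of a generic class depends on the enclosing type variables, even when it declares none itself, can already be observed with today's core reflection. The following is a minimal sketch using existing Java generics (not the proposed GenericClass attribute); the class `InnerGenericDemo` and the field name `value` are made up for the illustration.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.TypeVariable;

class Outer<T> {
    class Inner {
        T value; // Inner declares no type variables, yet uses Outer's T
    }
}

// Illustrative helper; not part of the Model 3 document.
class InnerGenericDemo {
    static String describe() {
        try {
            Class<?> inner = Outer.Inner.class;
            // Number of type parameters declared by Inner itself.
            int ownTypeParams = inner.getTypeParameters().length;
            // The field's generic type is a type variable declared by Outer.
            Field field = inner.getDeclaredField("value");
            TypeVariable<?> tv = (TypeVariable<?>) field.getGenericType();
            String declaredBy = ((Class<?>) tv.getGenericDeclaration()).getSimpleName();
            // The compiled constructor takes the enclosing instance as its first parameter.
            Constructor<?> ctor = inner.getDeclaredConstructors()[0];
            String firstParam = ctor.getParameterTypes()[0].getSimpleName();
            return ownTypeParams + " " + tv.getName() + " " + declaredBy + " " + firstParam;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "0 T Outer Outer"
    }
}
```

Inner reports zero type parameters of its own, while both its field type and its constructor signature lead back to Outer, which is the dependency the GenericClass attribute would have to record.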
> In this case:
>
> class Outer<T> {
>     class Inner {
>     }
> }
>
> where Inner doesn't declare any type variables, my understanding is that Inner will still have the GenericClass attribute because it may refer to T. Will Inner still appear as the first class frame, with tvarCount=0, enforcing the rule that the first element is always the class itself?

Yes, that seems the most sensible way to do it.

From brian.goetz at oracle.com Sat Feb 13 15:24:09 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 13 Feb 2016 10:24:09 -0500
Subject: Nestmates
In-Reply-To: <201602122201.u1CM1BYo011221@d01av05.pok.ibm.com>
References: <569FE666.9070103@oracle.com> <201602122201.u1CM1BYo011221@d01av05.pok.ibm.com>
Message-ID: <56BF4A99.5020404@oracle.com>

On 2/12/2016 5:04 PM, Bjorn B Vardal wrote:

> 1. The Top<->Child handshake only needs to happen when the Child is loaded (which will load Top as a dependency), and access request from Child1 to Child2 is reduced to Child1->nestTop == Child2->nestTop. This means that we can fail immediately if the handshake fails during class loading, i.e. it should not be postponed until a private access request fails. Do you agree?

I think we have some options here:
 - We could fail fast, rejecting the class.
 - We could simply load the class into a new nest containing only itself; access control (in both directions) that would depend on nestmate-ness would fail later.

I think the choice depends on whether we expect to see failures here solely because of attacks / broken compilers, or whether we can imagine reasonable situations where such a condition could happen through separate compilation.

> 2. The proposal assumes that nest mates are always derived from the same source file. This can be enforced by the Java compiler, but is it verifiable by the JVM? Both the source file attributes and class name can be set to whatever we want, which makes it undesirable for verification purposes.
> The question really has two sides:
>    1. Do nest mates have to be from the same source file?
>    2. If so, how do we verify it?

In Java, this will likely be true, but I can imagine how other languages would use this to assemble a nest from multiple separate files. So I don't think we need to claim they must come from the same file, nor enforce it; we only need to enforce the integrity of the NestXxx attributes.

> 3. Building on question 2, the solution appears to be that nest mates must be loaded by the same class loader. If not, someone can load their own class with the same name as a class from some nest, using a child class loader, which will pass the handshake, effectively giving the custom class complete access to that nest.

Yes. Same loader, same package, same module, same protection domain. These all seem reasonable constraints here.

> --
> Bjørn Vårdal
>
> ----- Original message -----
> From: Brian Goetz
> Sent by: "valhalla-spec-experts"
> To: valhalla-spec-experts at openjdk.java.net
> Cc:
> Subject: Nestmates
> Date: Wed, Jan 20, 2016 2:57 PM
>
> This topic is at the complete opposite end of the spectrum from topics we've been discussing so far. It's mostly an implementation story, and of particular interest to the compiler and VM implementers here.
>
>
> Background
> ----------
>
> Since Java 1.1, the rules for accessibility when inner classes are involved at the language level are not fully aligned with those at the VM level. In particular, private and protected access from and to inner classes is stricter in the VM than in the language, meaning that in these cases, the static compiler emits an access bridge (access$000) which effectively downgrades the accessed member's accessibility to package.
>
> Access bridges have some disadvantages. They're ugly, but that's not a really big deal. They're imprecise; they allow wider-than-necessary access to the member. Again, this is not a huge deal on its own.
But > the real problem is the complexity of the compiler implementation when > we add generic specialization to the story. > > Specialization adds a new category of cross-class accesses that are > allowed at the language level but not at the VM level, which would > dramatically increase the need for, and complexity of, accessibility > bridges. For example: > > class Foo { > private T t; > > void m(Foo foo) { > int i = foo.t; > } > } > > Now we execute: > > Foo fl = ... > Foo fi = ... > fl.m(fi) > > The spirit of the language rules clearly allow the access from > Foo > to Foo.t -- they are in the "same class". But at the VM level, > Foo and Foo are different classes, so the access from > Foo to a private member of Foo is disallowed. > > One reason that this increases the complexity, and not just the > number, > of accessibility bridges is that bridges are (currently) static > methods; > if they represent instance methods, we pass the receiver as the first > argument. For access between inner classes, this is fine, but when it > comes to access between specializations, this breeds new complexity -- > because the method signature of the accessor needs to be specialized > based on the type parameters of the receiver. This interaction means > the current static-accessor solution would need its own special, > ad-hoc > treatment in specialization, adding to the complexity of > specialization. > > More generally, this situation arises in any case where a single > logical > unit of encapsulation at the source level is split into multiple > runtime > classes (inner classes, specialization classes, synthetic helper > classes.) We propose to address this problem more generally, by > providing a mechanism where language compilers can indicate that > multiple runtime classes live in the same unit of encapsulation. 
> We do > so by (a) adding metadata to classes to indicate which classes > belong in > the same encapsulation unit and (b) relaxing some VM accessibility > rules > to bring them more in alignment with the language level rules. > > > Overview > -------- > > Our proposed strategy is to reify the relationship between classes > that > are members of the same _nest_. Nestmate-ness can then be > considered in > access control decisions (JVMS 5.4.4). > > Classes that derive from a common source class form a _nest_, and two > classes in the same nest are called _nestmates_. Nestmate-ness is an > equivalence relation (reflexive, symmetric, and transitive.) > Nestmates > of a class C include C's inner classes, synthetic classes generated as > part of translating C, and specializations thereof. > > Since nestmate-ness is an equivalence relation, it forms a partition > over classes, and we can nominate a canonical member for each > partition. > We nominate the "top" (outermost lexically enclosing) class in the > nest as the canonical member; this is the top-level source class from > which all other nestmates derive. > > This makes it easy to calculate nestmate-ness for two classes C > and D; C > and D are nestmates if their "top" class is the same. > > Example > ------- > > class Top { > class A { } > class B { } > } > > void genericMethod() { } > } > > When we compile this, we get: > Top.class // Top > Top$A.class // Inner class Top.A > Top$A$B.class // Inner class Top.A.B > Top$Any.class // Wildcard interface for Top > Top$A$Any.class // Wildcard interface for Top.A > Top$genericMethod.class // Holder class for generic method > > The explicit classes Top, Top.A, and Top.A.B, the synthetic $Any > classes, and the synthetic holder class for genericMethod, along with > all of their specializations, form a nest. The top member of this > nest > is Top. > > Since nestmates all derive from a common top-level class, they are by > definition in the same package and module. 
A class can be in only one > nest at once. > > > Runtime Representation > ---------------------- > > We represent nestmate-ness with two new attributes -- one in the top > member, which describes all the members of the nest, and one in each > member, which requests access to the nest. > > NestTop { > u2 name_index; > u4 length; > u2 child_count; > u2 childClazz[child_count]; > } > > NestChild { > u2 name_index; > u4 length; > u2 topClazz; > } > > If a class has a NestTop attribute, its nest top is itself. If a class > has a NestChild attribute, its nest top is the class named via > topClazz. > If a class is a specialization of another class, its nest top is the > nest top of the class for which it is a specialization. > > When loading a class with a NestChild attribute, the VM can verify > that > the requested nest permits it as a member, and reject the class if the > child and top do not agree. > > The NestTop attribute can enumerate all inner classes and synthetic > classes, but cannot enumerate all specializations thereof. When > creating > a specialization of a class, the VM records the specialization as > being > a member of whatever nest the template class was a member of. > > > Semantics > --------- > > The accessibility rules here are strictly additions; nestmate-ness > creates additional accessibility over and above the existing rules. > > Informally: > - A class can access the private members of its nestmates; > - A class can access protected members inherited by its nestmates. > > This is slightly broader than the language semantics (but still less > broad than what we do today with access bridges.) The static compiler > can continue to enforce the same rules, and the VM will allow these > accesses without bridges. (We could make the proposal match the > language semantics more closely at the cost of additional complexity, > but its not clear this is worthwhile.) 
> > For private access, we can add the following to 5.4.4: > - A class C may access a private member D.R if C and D are > nestmates. > > The rules for protected members are more complicated. 5.4.3.{2,3} > first > resolve the true owner of the member, and feed that to 5.4.4; this > process throws away some needed information. We would augment > 5.4.3.{2,3} as follows: > - When performing member resolution from class C on member D.R, we > remember both D (the target class) and E (the resolved class) and make > them both available to 5.4.4. > > We then adjust 5.4.4 accordingly, by adding: > - If R is protected, and C and D are nestmates, and E is > accessible to > D, then access is allowed. > > > Examples > -------- > > For private fields, we generate access bridges whenever an inner class > accesses a private member (field or method) of the enclosing class, or > of another inner class in the same nest. > > In the classes below, the accesses shown are all permitted by the > language spec (child to parent, sibling to sibling, sibling to > child of > sibling, etc), and the ones requiring access bridges are noted. > > class Foo { > public static Foo aFoo; > public static Inner1 aInner1; > public static Inner1.Inner2 aInner2; > public static Inner3 aInner3; > > private int foo; > > class Inner1 { > private int inner1; > > class Inner2 { > private int inner2; > } > > void m() { > int i = aFoo.foo // bridge > + aInner1.inner1 > + aInner2.inner2 // bridge > + aInner3.inner3; // bridge > } > } > > class Inner3 { > private int inner3; > > void m() { > int i = aFoo.foo // bridge > + aInner1.inner1 // bridge > + aInner2.inner2 // bridge > + aInner3.inner3; > } > } > } > > For protected members, the situation is more subtle. > > /* package p1 */ > public class Sup { > protected int pro; > } > > /* package p2 */ > public class Sub extends p1.Sup { > void test() { > ... pro ... //no bridge (invokespecial) > } > > class Inner { > void test() { > ... sub.pro ... 
// bridge generated in Sub
> }
> }
> }
>
> Here, the VM rules allow Sub to access protected members of Sup, but for accesses from Sub.Inner or Sibling to Sub.pro to succeed, Sub provides an access bridge (which effectively makes Sub.pro package-visible throughout package p2.)
>
> The rules outlined eliminate access bridges in all of these cases.
>
>
> Interaction with defineAnonymousClass
> -------------------------------------
>
> Nestmate-ness also potentially connects nicely with Unsafe.defineAnonymousClass. The intuitive notion of dAC is, when you load anonymous class C with a host class of H, that C is being "injected into" H -- access control decisions for C are made using H's credentials. With a formal notion of nestmateness, we can bring additional predictability to dAC by saying that C is injected into H's nest.

From brian.goetz at oracle.com Mon Feb 15 18:11:16 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 15 Feb 2016 13:11:16 -0500
Subject: Classes, specializations, and statics
Message-ID:

M3 leaves us in a position to check off one of the outstanding issues, which is that of specialization-specific statics.

Members of Java classes have historically been divided into static and instance members; static members are associated with a class, and instance members with an instance of a class.

When Java 5 extended the type system to support multiple TYPES that are represented by the same CLASS (e.g., ArrayList and ArrayList), we had a choice; treat static members as belonging to the CLASS, or as belonging to the TYPE. We chose the former, as that was consistent with the translation strategy of erasure, and also maximized compatibility with existing code. (When .NET did reified generics, they chose the opposite; Foo.staticMember and Foo.staticMember refer to different variables. That's also a valid choice.)
The following program would be sensitive to this distinction:

class Foo {
    static int count;

    public Foo() { ++count; }
}

new Foo();
new Foo();

In Java, this program would increment the common counter twice; under the alternative interpretation, the counters Foo.count and Foo.count would each be incremented once.

Statics in Java work the way they do, and we're not proposing we change that. However, once we break the assumption that all instantiations of a parameterized type are reference instantiations, we run into some issues with existing code idioms. What follows is a proposed generalization of static members in the spirit of Model 3.


The Problem
-----------

Java code frequently uses tricks like the following, that exploit the assumption of erasure:

// Cached instance of an empty collection
private static final Collection c = new EmptyCollection();

// Factory method that dispenses the cached empty collections, suitably cast
public static Collection emptyCollection() { return (Collection) c; }

The above trick works because of erasure; a Collection has the same representation as a Collection, Collection, etc, so we can freely cast it about with no loss of type safety. But once we anyfy emptyCollection(), we're now hosed; we can't cast a Collection to a Collection. This leaves us without a means of coding this common idiom, because static members currently are per-class, not per-instantiation.

Obviously, the above code must continue to mean what it means today. But we'd also like a means of extending the above idiom more broadly than erased generics.


Extending Statics to Specializations
------------------------------------

Our current model treats parameterizations of template classes like classes; anywhere in the bytecode that one can refer to a Constant_Class, one can refer to a Constant_ParameterizedType. (Whether they are actually classes, or more like "species", is an open question, but whatever they are, there's a way to write their name in the classfile.)
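Both behaviours described under "The Problem" (one static shared by all instantiations, and a single cached instance handed out at any element type) can be reproduced with today's erased generics. The following is a sketch only: the names `ErasedStaticsDemo`, `EMPTY`, and `emptyList`, and the choice of String and Integer as type arguments, are assumptions for the illustration and are not taken from the message.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only; shows current (erased) Java behaviour.
class ErasedStaticsDemo {
    // One static counter for the class, shared by every instantiation Foo<...>.
    static class Foo<T> {
        static int count;

        Foo() { ++count; }
    }

    // A single cached empty list, stored per class rather than per type argument.
    private static final List<?> EMPTY =
            Collections.unmodifiableList(new ArrayList<Object>());

    // Hands out the one cached instance at any element type.
    // The unchecked cast is harmless only because all instantiations share a representation.
    @SuppressWarnings("unchecked")
    static <T> List<T> emptyList() {
        return (List<T>) EMPTY;
    }

    public static void main(String[] args) {
        new Foo<String>();
        new Foo<Integer>();
        System.out.println(Foo.count); // prints 2: both constructors hit the same field

        List<String> strings = emptyList();
        List<Integer> ints = emptyList();
        // prints true: the two differently typed lists are the same object
        System.out.println((Object) strings == (Object) ints);
    }
}
```

The cast in `emptyList()` is exactly the step that stops being safe once an instantiation can have a different representation, which is the motivation for a per-specialization kind of static member.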
The existing prototype places static members of Foo on the erased species Class[Foo], and translates access to static member m of Foo as:

    xxxstatic Class[Foo].m

However, we are free to assign meaning to xxxstatic as applied to a member reference whose owner is a parameterized type. Suppose we extend the current set of member ownerships from { instance, static } to { instance, static, specialization }. We could then access a per-specialization member using xxxstatic on a member reference whose owner is a specialization.

The syntactic story is mostly a bikeshed; we'll need some token to indicate "per-specialization"; we'll use the silly token __SpecializationStatic for now.

The access story is simple: static members continue to only be able to reference other static members (and not class type variables); __SS members can access static members and other __SS members, as well as class type variables; instance members can reference static, __SS, and instance members.

The translation / classfile story is simple. Assume we have a spare flag bit (we can synthesize one) for ACC_SPECIALIZATION_STATIC (ACC_SS for short.) Static members are marked with ACC_STATIC; __SS members are marked with ACC_SS. Accesses to static members continue to be translated as xxxstatic Class[Foo].m; accesses to __SS members are translated as xxxstatic ParamType[Foo,params].m.

The specialization / runtime story is simple. Static members are treated as if they are restricted to the erased species (this is a natural choice, since Class[Foo] and ParamType[Foo, erased] describe the same class.) __SS members become static members on each parameterization. (Both of these are one-line changes to the existing specializer prototype.) TypeVar constants used in the signature / bodies of __SS members are specialized as usual, and just work.

Example:

class Collection {
    private __SS Collection emptyCollection = ...
    // ACC_SS field emptyCollection : ParamType[Collection, TypeVar[T]]

    private __SS Collection emptyCollection() { return emptyCollection; }
    ACC_SS emptyCollection()ParamType[Collection, TypeVar[T]] {
        getstatic ParamType[Collection, TypeVar[T]].emptyCollection : ParamType[Collection, TypeVar[T]]
        areturn
    }

When we specialize Collection, the field type, method return type, etc, will all collapse to Collection by the existing mechanisms.

From duncan.macgregor at ge.com Mon Feb 15 18:52:03 2016
From: duncan.macgregor at ge.com (MacGregor, Duncan (GE Energy Connections))
Date: Mon, 15 Feb 2016 18:52:03 +0000
Subject: Classes, specializations, and statics
In-Reply-To:
References:
Message-ID:

Thanks for posting this Brian, it feels like a good solution that does not require that specialised statics are tied to the type variable in any specific way.

I presume this means there's likely to be an __SS equivalent to "<clinit>" to handle the initialisation of the __SS members? The only pitfall I can see is the accidental initialisation of statics in that __SS "<clinit>", and I guess that can be covered by suitable compiler warnings.

Regards, Duncan.

On 15/02/2016, 18:11, "valhalla-spec-observers on behalf of Brian Goetz" wrote:

>M3 leaves us in a position to check off one of the outstanding issues,
>which is that of specialization-specific statics.
>
>Members of Java classes have historically been divided into static and
>instance members; static members are associated with a class, and
>instance members with an instance of a class.
>
>When Java 5 extended the type system to support multiple TYPES that are
>represented by the same CLASS (e.g., ArrayList and
>ArrayList), we had a choice; treat static members as belonging to
>the CLASS, or as belonging to the TYPE. We chose the former, as that was
>consistent with the translation strategy of erasure, and also maximized
>compatibility with existing code.
>(When .NET did reified generics, they
>chose the opposite; Foo.staticMember and Foo.staticMember
>refer to different variables. That's also a valid choice.)
>
>The following program would be sensitive to this distinction:
>
>class Foo {
>    static int count;
>
>    public Foo() { ++count; }
>}
>
>new Foo();
>new Foo();
>
>In Java, this program would increment the common counter twice; under the
>alternative interpretation, the counters Foo.count and
>Foo.count would each be incremented once.
>
>Statics in Java work the way they do, and we're not proposing we change
>that. However, once we break the assumption that all instantiations of a
>parameterized type are reference instantiations, we run into some issues
>with existing code idioms. What follows is a proposed generalization of
>static members in the spirit of Model 3.
>
>
>The Problem
>-----------
>
>Java code frequently uses tricks like the following, that exploit the
>assumption of erasure:
>
>// Cached instance of an empty collection
>private static final Collection c = new EmptyCollection();
>
>// Factory method that dispenses the cached empty collections, suitably
>casted
>public static Collection emptyCollection() { return (Collection)
>c; }
>
>The above trick works because of erasure; a Collection has the same
>representation as a Collection, Collection, etc, so we
>can freely cast it about with no loss of type safety. But once we anyfy
>emptyCollection(), we're now hosed; we can't cast a Collection to a
>Collection. This leaves us without a means of coding this common
>idiom, because static members currently are per-class, not
>per-instantiation.
>
>Obviously, the above code must continue to mean what it means today. But
>we'd also like a means of extending the above idiom more broadly than
>erased generics.
>
>
>Extending Statics to Specializations
>------------------------------------
>
>Our current model treats parameterizations of template classes like
>classes; anywhere in the bytecode that one can refer to a Constant_Class,
>one can refer to a Constant_ParameterizedType. (Whether they are
>actually classes, or more like "species", is an open question, but
>whatever they are, there's a way to write their name in the classfile.)
>
>The existing prototype places static members of Foo on the erased
>species Class[Foo], and translates access to static member m of Foo<T> as:
>
>    xxxstatic Class[Foo].m
>
>However, we are free to assign meaning to xxxstatic as applied to a
>member reference whose owner is a parameterized type. Suppose we extend
>the current set of member ownerships from { instance, static } to {
>instance, static, specialization }. We could then access a
>per-specialization member using xxxstatic on a member reference whose
>owner is a specialization.
>
>The syntactic story is mostly a bikeshed; we'll need some token to
>indicate "per-specialization"; we'll use the silly token
>__SpecializationStatic for now.
>
>The access story is simple: static members continue to only be able to
>reference other static members (and not class type variables); __SS
>members can access static members and other __SS members, as well as
>class type variables; instance members can reference static, __SS, and
>instance members.
>
>The translation / classfile story is simple. Assume we have a spare flag
>bit (we can synthesize one) for ACC_SPECIALIZATION_STATIC (ACC_SS for
>short.) Static members are marked with ACC_STATIC; __SS members are
>marked with ACC_SS. Accesses to static members continue to be translated
>as xxxstatic Class[Foo].m; accesses to __SS members are translated as
>xxxstatic ParamType[Foo,params].m.
>
>The specialization / runtime story is simple.
>Static members are treated
>as if they are restricted to the erased species (this is a natural
>choice, since Class[Foo] and ParamType[Foo, erased] describe the same
>class.) __SS members become static members on each parameterization.
>(Both of these are one-line changes to the existing specializer
>prototype.) TypeVar constants used in the signature / bodies of __SS
>members are specialized as usual, and just work.
>
>
>Example:
>
>class Collection {
>    private __SS Collection emptyCollection = ...
>    // ACC_SS field emptyCollection : ParamType[Collection, TypeVar[T]]
>
>    private __SS Collection emptyCollection() { return emptyCollection;
>}
>    ACC_SS emptyCollection()ParamType[Collection, TypeVar[T]] {
>        getstatic ParamType[Collection, TypeVar[T]].emptyCollection :
>ParamType[Collection, TypeVar[T]]
>        areturn
>    }
>
>When we specialize Collection, the field type, method return type,
>etc, will all collapse to Collection by the existing mechanisms.
>
>

From palo.marton at gmail.com Mon Feb 15 21:03:52 2016
From: palo.marton at gmail.com (Palo Marton)
Date: Mon, 15 Feb 2016 22:03:52 +0100
Subject: Classes, specializations, and statics
In-Reply-To:
References:
Message-ID:

Just one note on this issue: you also have to deal with this slightly different code:

class SomeOtherClassWithoutTParameter {
    private static Collection emptyCollection = ...
    private static Collection emptyCollection() { return emptyCollection; }
}

E.g. Collections.emptyList()

It can be handled by creating some fake parametrized inner class, but maybe there is a more elegant solution within your current model.

(see also this message: http://mail.openjdk.java.net/pipermail/valhalla-dev/2015-January/000812.html
in this old thread from Jan 2015: http://mail.openjdk.java.net/pipermail/valhalla-dev/2015-January/000802.html )

On Mon, Feb 15, 2016 at 7:11 PM, Brian Goetz wrote:

> M3 leaves us in a position to check off one of the outstanding issues,
> which is that of specialization-specific statics.
>
> Members of Java classes have historically been divided into static and
> instance members; static members are associated with a class, and instance
> members with an instance of a class.
>
> When Java 5 extended the type system to support multiple TYPES that are
> represented by the same CLASS (e.g., ArrayList and
> ArrayList), we had a choice; treat static members as belonging to
> the CLASS, or as belonging to the TYPE. We chose the former, as that was
> consistent with the translation strategy of erasure, and also maximized
> compatibility with existing code. (When .NET did reified generics, they
> chose the opposite; Foo.staticMember and Foo.staticMember
> refer to different variables. That's also a valid choice.)
>
> The following program would be sensitive to this distinction:
>
> class Foo {
>     static int count;
>
>     public Foo() { ++count; }
> }
>
> new Foo();
> new Foo();
>
> In Java, this program would increment the common counter twice; under the
> alternative interpretation, the counters Foo.count and
> Foo.count would each be incremented once.
>
> Statics in Java work the way they do, and we're not proposing we change
> that. However, once we break the assumption that all instantiations of a
> parameterized type are reference instantiations, we run into some issues
> with existing code idioms. What follows is a proposed generalization of
> static members in the spirit of Model 3.
>
>
> The Problem
> -----------
>
> Java code frequently uses tricks like the following, that exploit the
> assumption of erasure:
>
> // Cached instance of an empty collection
> private static final Collection c = new EmptyCollection();
>
> // Factory method that dispenses the cached empty collections, suitably
> casted
> public static Collection emptyCollection() { return (Collection)
> c; }
>
> The above trick works because of erasure; a Collection has the same
> representation as a Collection, Collection, etc, so we can
> freely cast it about with no loss of type safety.
> But once we anyfy
> emptyCollection(), we're now hosed; we can't cast a Collection to a
> Collection. This leaves us without a means of coding this common
> idiom, because static members currently are per-class, not
> per-instantiation.
>
> Obviously, the above code must continue to mean what it means today. But
> we'd also like a means of extending the above idiom more broadly than
> erased generics.
>
>
> Extending Statics to Specializations
> ------------------------------------
>
> Our current model treats parameterizations of template classes like
> classes; anywhere in the bytecode that one can refer to a Constant_Class,
> one can refer to a Constant_ParameterizedType. (Whether they are actually
> classes, or more like "species", is an open question, but whatever they
> are, there's a way to write their name in the classfile.)
>
> The existing prototype places static members of Foo on the erased
> species Class[Foo], and translates access to static member m of Foo<T>
> as:
>
>     xxxstatic Class[Foo].m
>
> However, we are free to assign meaning to xxxstatic as applied to a member
> reference whose owner is a parameterized type. Suppose we extend the
> current set of member ownerships from { instance, static } to { instance,
> static, specialization }. We could then access a per-specialization member
> using xxxstatic on a member reference whose owner is a specialization.
>
> The syntactic story is mostly a bikeshed; we'll need some token to
> indicate "per-specialization"; we'll use the silly token
> __SpecializationStatic for now.
>
> The access story is simple: static members continue to only be able to
> reference other static members (and not class type variables); __SS members
> can access static members and other __SS members, as well as class type
> variables; instance members can reference static, __SS, and instance
> members.
>
> The translation / classfile story is simple.
> Assume we have a spare flag
> bit (we can synthesize one) for ACC_SPECIALIZATION_STATIC (ACC_SS for
> short.) Static members are marked with ACC_STATIC; __SS members are marked
> with ACC_SS. Accesses to static members continue to be translated as
> xxxstatic Class[Foo].m; accesses to __SS members are translated as
> xxxstatic ParamType[Foo,params].m.
>
> The specialization / runtime story is simple. Static members are treated
> as if they are restricted to the erased species (this is a natural choice,
> since Class[Foo] and ParamType[Foo, erased] describe the same class.) __SS
> members become static members on each parameterization. (Both of these are
> one-line changes to the existing specializer prototype.) TypeVar constants
> used in the signature / bodies of __SS members are specialized as usual,
> and just work.
>
>
> Example:
>
> class Collection {
>     private __SS Collection emptyCollection = ...
>     // ACC_SS field emptyCollection : ParamType[Collection, TypeVar[T]]
>
>     private __SS Collection emptyCollection() { return emptyCollection; }
>     ACC_SS emptyCollection()ParamType[Collection, TypeVar[T]] {
>         getstatic ParamType[Collection, TypeVar[T]].emptyCollection :
> ParamType[Collection, TypeVar[T]]
>         areturn
>     }
>
> When we specialize Collection, the field type, method return type,
> etc, will all collapse to Collection by the existing mechanisms.
>
>
>

From forax at univ-mlv.fr Mon Feb 15 22:03:50 2016
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 15 Feb 2016 23:03:50 +0100 (CET)
Subject: Classes, specializations, and statics
In-Reply-To:
References:
Message-ID: <520344619.548286.1455573830067.JavaMail.zimbra@u-pem.fr>

The other solution is to allow static fields to be parameterized like methods.

private static final Collection EMPTY = new EmptyCollection();

public static Collection empty() {
    return EMPTY;
}

or perhaps it's just a different syntax of the same solution.
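
For readers who want to run the idiom under discussion, here is a minimal sketch in today's Java of the erased "cached empty collection" trick. The names `ErasedEmptyDemo`, `EmptyList`, and `emptyList` are hypothetical and chosen only for illustration; this is not code from the prototype. It shows that a plain static is one object per class, handed out under every parameterization via an unchecked cast.

```java
import java.util.AbstractList;
import java.util.List;

// Hypothetical illustration of the erased idiom; not taken from the Valhalla prototype.
final class ErasedEmptyDemo {
    // A trivially empty, immutable list implementation.
    private static final class EmptyList<E> extends AbstractList<E> {
        @Override public E get(int index) { throw new IndexOutOfBoundsException("empty list"); }
        @Override public int size() { return 0; }
    }

    // One cached instance per CLASS, regardless of how callers parameterize it.
    private static final List<?> EMPTY = new EmptyList<Object>();

    // The unchecked cast is safe only because the shared list is empty and immutable.
    @SuppressWarnings("unchecked")
    static <T> List<T> emptyList() {
        return (List<T>) EMPTY;
    }

    public static void main(String[] args) {
        List<String> strings = emptyList();
        List<Integer> numbers = emptyList();
        // Both references point at the same object: the static is per-class.
        System.out.println((Object) strings == (Object) numbers);
    }
}
```

The cast compiles and is harmless here because every reference parameterization shares one representation; that shared representation is exactly the assumption that stops holding once primitive or value parameterizations are specialized.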
Rémi

----- Original Message -----
> From: "Brian Goetz"
> To: valhalla-spec-experts at openjdk.java.net
> Sent: Monday, 15 February 2016 19:11:16
> Subject: Classes, specializations, and statics
>
> M3 leaves us in a position to check off one of the outstanding issues, which
> is that of specialization-specific statics.
>
> Members of Java classes have historically been divided into static and
> instance members; static members are associated with a class, and instance
> members with an instance of a class.
>
> When Java 5 extended the type system to support multiple TYPES that are
> represented by the same CLASS (e.g., ArrayList and
> ArrayList), we had a choice; treat static members as belonging to
> the CLASS, or as belonging to the TYPE. We chose the former, as that was
> consistent with the translation strategy of erasure, and also maximized
> compatibility with existing code. (When .NET did reified generics, they
> chose the opposite; Foo.staticMember and Foo.staticMember
> refer to different variables. That's also a valid choice.)
>
> The following program would be sensitive to this distinction:
>
> class Foo {
>     static int count;
>
>     public Foo() { ++count; }
> }
>
> new Foo();
> new Foo();
>
> In Java, this program would increment the common counter twice; under the
> alternative interpretation, the counters Foo.count and
> Foo.count would each be incremented once.
>
> Statics in Java work the way they do, and we're not proposing we change that.
> However, once we break the assumption that all instantiations of a
> parameterized type are reference instantiations, we run into some issues
> with existing code idioms. What follows is a proposed generalization of
> static members in the spirit of Model 3.
>
>
> The Problem
> -----------
>
> Java code frequently uses tricks like the following, that exploit the
> assumption of erasure:
>
> // Cached instance of an empty collection
> private static final Collection c = new EmptyCollection();
>
> // Factory method that dispenses the cached empty collections, suitably
> casted
> public static Collection emptyCollection() { return (Collection) c;
> }
>
> The above trick works because of erasure; a Collection has the same
> representation as a Collection, Collection, etc, so we can
> freely cast it about with no loss of type safety. But once we anyfy
> emptyCollection(), we're now hosed; we can't cast a Collection to a
> Collection. This leaves us without a means of coding this common
> idiom, because static members currently are per-class, not
> per-instantiation.
>
> Obviously, the above code must continue to mean what it means today. But
> we'd also like a means of extending the above idiom more broadly than erased
> generics.
>
>
> Extending Statics to Specializations
> ------------------------------------
>
> Our current model treats parameterizations of template classes like classes;
> anywhere in the bytecode that one can refer to a Constant_Class, one can
> refer to a Constant_ParameterizedType. (Whether they are actually classes,
> or more like "species", is an open question, but whatever they are, there's
> a way to write their name in the classfile.)
>
> The existing prototype places static members of Foo on the erased
> species Class[Foo], and translates access to static member m of Foo<T>
> as:
>
>     xxxstatic Class[Foo].m
>
> However, we are free to assign meaning to xxxstatic as applied to a member
> reference whose owner is a parameterized type. Suppose we extend the
> current set of member ownerships from { instance, static } to { instance,
> static, specialization }. We could then access a per-specialization member
> using xxxstatic on a member reference whose owner is a specialization.
>
> The syntactic story is mostly a bikeshed; we'll need some token to indicate
> "per-specialization"; we'll use the silly token __SpecializationStatic for
> now.
>
> The access story is simple: static members continue to only be able to
> reference other static members (and not class type variables); __SS members
> can access static members and other __SS members, as well as class type
> variables; instance members can reference static, __SS, and instance
> members.
>
> The translation / classfile story is simple. Assume we have a spare flag bit
> (we can synthesize one) for ACC_SPECIALIZATION_STATIC (ACC_SS for short.)
> Static members are marked with ACC_STATIC; __SS members are marked with
> ACC_SS. Accesses to static members continue to be translated as xxxstatic
> Class[Foo].m; accesses to __SS members are translated as xxxstatic
> ParamType[Foo,params].m.
>
> The specialization / runtime story is simple. Static members are treated as
> if they are restricted to the erased species (this is a natural choice,
> since Class[Foo] and ParamType[Foo, erased] describe the same class.) __SS
> members become static members on each parameterization. (Both of these are
> one-line changes to the existing specializer prototype.) TypeVar constants
> used in the signature / bodies of __SS members are specialized as usual, and
> just work.
>
>
> Example:
>
> class Collection {
>     private __SS Collection emptyCollection = ...
>     // ACC_SS field emptyCollection : ParamType[Collection, TypeVar[T]]
>
>     private __SS Collection emptyCollection() { return emptyCollection; }
>     ACC_SS emptyCollection()ParamType[Collection, TypeVar[T]] {
>         getstatic ParamType[Collection, TypeVar[T]].emptyCollection :
> ParamType[Collection, TypeVar[T]]
>         areturn
>     }
>
> When we specialize Collection, the field type, method return type, etc,
> will all collapse to Collection by the existing mechanisms.
>
>
>

From ali.ebrahimi1781 at gmail.com Tue Feb 16 08:10:21 2016
From: ali.ebrahimi1781 at gmail.com (Ali Ebrahimi)
Date: Tue, 16 Feb 2016 11:40:21 +0330
Subject: Classes, specializations, and statics
In-Reply-To: <520344619.548286.1455573830067.JavaMail.zimbra@u-pem.fr>
References: <520344619.548286.1455573830067.JavaMail.zimbra@u-pem.fr>
Message-ID:

Hi,

On Tue, Feb 16, 2016 at 1:33 AM, Remi Forax wrote:

> The other solution is to allow static field to be parameterized like
> methods.
>
> private static final Collection EMPTY = new
> EmptyCollection();
>
> public static Collection empty() {
>     return EMPTY;
> }
>
> or perhaps it's just a different syntax of the same solution.

If you mean this:

private static final Collection EMPTY = new EmptyCollection<>();

This does not work:

Collection c = Collections.empty(); //CCE

Or, if you mean static field overloading:

private static final Collection EMPTY = new EmptyCollection();

is equivalent to:

private static final Collection EMPTY = new EmptyCollection<>();
private static final Collection EMPTY = new EmptyCollection();
private static final Collection EMPTY = new EmptyCollection();
... for all val types...

I don't know how this can be cleanly handled.
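
As a point of comparison only, the "one static per parameterization" behaviour can be emulated in today's Java by keying a cache on a type token. The sketch below uses `java.lang.ClassValue`, which lazily computes and caches one value per `Class`. The names `PerTypeStatics` and `emptyFor` are hypothetical, and this emulation is an assumption made for illustration; it is not the mechanism proposed in this thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical emulation of per-parameterization statics with a type token.
final class PerTypeStatics {
    // ClassValue computes at most one cached value per Class key.
    private static final ClassValue<List<?>> EMPTY_BY_TYPE = new ClassValue<List<?>>() {
        @Override
        protected List<?> computeValue(Class<?> type) {
            // A fresh unmodifiable wrapper, so each key gets its own distinct object.
            return Collections.unmodifiableList(new ArrayList<Object>());
        }
    };

    // The cast is unchecked; it is safe here because the list is empty and unmodifiable.
    @SuppressWarnings("unchecked")
    static <T> List<T> emptyFor(Class<T> type) {
        return (List<T>) EMPTY_BY_TYPE.get(type);
    }

    public static void main(String[] args) {
        // Same key, same instance; different keys, different instances.
        System.out.println(emptyFor(String.class) == emptyFor(String.class));
        System.out.println((Object) emptyFor(String.class) == (Object) emptyFor(Integer.class));
    }
}
```

The explicit `Class<T>` argument is the cost of doing this in library code: the caller has to pass the key that a per-specialization static would get implicitly from the owner's type arguments.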
One possible way: transfer the empty field to EmptyCollection

public static Collection empty() {
    return EmptyCollection.EMPTY;
}

But this would work only if EMPTY is private and not referenced from outside. So we cannot move this:

public static final Collection EMPTY = new EmptyCollection<>();

I think there are enough weapons here:

public static final Collection EMPTY = new EmptyCollection<>(); //don't touch

public static Collection empty() {
    return EMPTY;
}

public static Collection empty() {
    return EmptyCollection.EMPTY;
}

--
Best Regards,
Ali Ebrahimi

From john.r.rose at oracle.com Tue Feb 16 23:30:17 2016
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 16 Feb 2016 15:30:17 -0800
Subject: Nestmates
In-Reply-To: <56BF4A99.5020404@oracle.com>
References: <569FE666.9070103@oracle.com> <201602122201.u1CM1BYo011221@d01av05.pok.ibm.com> <56BF4A99.5020404@oracle.com>
Message-ID: <8CA20EFE-BAB6-41B9-9138-B558F253B13F@oracle.com>

On Feb 13, 2016, at 7:24 AM, Brian Goetz wrote:
>
>
> On 2/12/2016 5:04 PM, Bjorn B Vardal wrote:
>> - The Top<->Child handshake only needs to happen when the Child is loaded (which will load Top as a dependency), and access request from Child1 to Child2 is reduced to Child1->nestTop == Child2->nestTop. This means that we can fail immediately if the handshake fails during class loading, i.e. it should not be postponed until a private access request fails. Do you agree?
>
> I think we have some options here:
> - We could fail fast, rejecting the class.
> - We could simply load the class into a new nest containing only itself; access control (in both directions) that would depend on nestmate-ness would fail later.
>
> I think the choice depends on whether we expect to see failures here solely because of attacks / broken compilers, or whether we can imagine reasonable situations where such a condition could happen through separate compilation.

The decision also influences when the Top class is loaded.
If we fail fast, the Top is loaded as a prerequisite to loading the Child (much as loading the Child's supers is a prerequisite). If we fail lazily, then Top can be loaded the first time somebody tries a cross-nest access, but not until then. There are other orderings, but these two are the simplest to implement. I think overall the fail-fast option is the easiest to implement. But it does create a potentially surprising failure mode, which is not analogous to the failure modes for accessing other restricted names (protected, package, module). In those other cases, there is no need to load anything (or fail in the attempt) in order to answer an access control question. I like the idea that access control decisions are made on already-loaded data; it simplifies the model.

- John

From john.r.rose at oracle.com Tue Feb 16 23:40:17 2016
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 16 Feb 2016 15:40:17 -0800
Subject: Nestmates
In-Reply-To: <8CA20EFE-BAB6-41B9-9138-B558F253B13F@oracle.com>
References: <569FE666.9070103@oracle.com> <201602122201.u1CM1BYo011221@d01av05.pok.ibm.com> <56BF4A99.5020404@oracle.com> <8CA20EFE-BAB6-41B9-9138-B558F253B13F@oracle.com>
Message-ID:

P.S. Bottom line: Fail-fast is easier to get right, and therefore is preferable, unless there is (as Brian points out) some use case which benefits from the lazy checks.

I think that separately compiled nests are a likely use-case, but it would *not* be a problem if malformed nests failed to load. It would be a faster indication of a bug in the compile-time logic, which had fouled the nest.

> On Feb 16, 2016, at 3:30 PM, John Rose wrote:
>
> On Feb 13, 2016, at 7:24 AM, Brian Goetz
> wrote:
>>
>>
>> On 2/12/2016 5:04 PM, Bjorn B Vardal wrote:
>>> - The Top<->Child handshake only needs to happen when the Child is loaded (which will load Top as a dependency), and access request from Child1 to Child2 is reduced to Child1->nestTop == Child2->nestTop.
>>> This means that we can fail immediately if the handshake fails during class loading, i.e. it should not be postponed until a private access request fails. Do you agree?
>>
>> I think we have some options here:
>> - We could fail fast, rejecting the class.
>> - We could simply load the class into a new nest containing only itself; access control (in both directions) that would depend on nestmate-ness would fail later.
>>
>> I think the choice depends on whether we expect to see failures here solely because of attacks / broken compilers, or whether we can imagine reasonable situations where such a condition could happen through separate compilation.
>
> The decision also influences when the Top class is loaded. If we fail fast, the Top is loaded as a prerequisite to loading the Child (much as loading the Child's supers is a prerequisite). If we fail lazy, then Top can be loaded the first time somebody tries a cross-nest access, but not until then. There are other orderings, but these two are the simplest to implement. I think overall the fail-fast option is the easiest to implement. But it does create a potentially surprising failure mode, which is not analogous to the failure modes for accessing other restricted names (protected, package, module). In those other cases, there is no need to load anything (or fail in the attempt) in order to answer an access control question. I like the idea that access control decisions are made on already-loaded data; it simplifies the model.
>
> - John

From peter.levart at gmail.com Wed Feb 17 17:03:17 2016
From: peter.levart at gmail.com (Peter Levart)
Date: Wed, 17 Feb 2016 18:03:17 +0100
Subject: Classes, specializations, and statics
In-Reply-To:
References:
Message-ID: <56C4A7D5.4010801@gmail.com>

Hi Brian,

On 02/15/2016 07:11 PM, Brian Goetz wrote:
> Example:
>
> class Collection {
>     private __SS Collection emptyCollection = ...
>     // ACC_SS field emptyCollection : ParamType[Collection, TypeVar[T]]
>
>     private __SS Collection emptyCollection() { return
> emptyCollection; }
>     ACC_SS emptyCollection()ParamType[Collection, TypeVar[T]] {
>         getstatic ParamType[Collection, TypeVar[T]].emptyCollection :
> ParamType[Collection, TypeVar[T]]
>         areturn
>     }
>
> When we specialize Collection, the field type, method return
> type, etc, will all collapse to Collection by the existing
> mechanisms.

This would work if the emptyCollection was actually empty and immutable, but could you do the following:

class Collection {
    private __SS Collection collection = new ArrayList();

    public __SS Collection collection() { return collection; }
}

And then in code:

Collection<String> cs = Collection.collection();
Collection<Number> cn = Collection.collection();

cs.add("abc");
Number n = cn.iterator().next();

If cs and cn hold the same instance, we have introduced heap corruption without compilation warnings.

So I suppose in the language you could only access the __SS members in the following way:

Collection.collection();
Collection.collection();
Collection.collection();
Collection.collection();
...

but not:

Collection.collection();
Collection.collection();
...

Like .class literals in the prototype.

Regards, Peter

From brian.goetz at oracle.com Wed Feb 17 17:09:03 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 17 Feb 2016 12:09:03 -0500
Subject: Model 3 classfile design document
In-Reply-To: <56A25E41.6040508@oracle.com>
References: <56A25E41.6040508@oracle.com>
Message-ID: <56C4A92F.8000107@oracle.com>

Having discussed the classfile representation and sketched out some plausibility arguments about how the VM can efficiently manage specialization, let's step back and look at what this means for the language (both Java and other languages.)

Type -> Class mapping. With erased generics, all parameterizations Foo<T> map to a single class Foo.
In the Model 3 model, the classfile for Foo is essentially a template; we can request parameterizations of Foo via the ParamType constant (the Class constant Class[Foo] becomes retconned to mean ParamType[Foo, erased].)

Reflection. In the current prototype, distinct specializations of Foo are distinct classes; each will respond with distinct .getClass() results. We don't yet have a means to express that they are different "species" of Foo; instead each gets its own class mirror. Reflective operations like Foo.class.getName() currently yield ugly results. Lots of open questions here.

Reification. The question on everyone's mind will be: are we "finally getting reified generics"? And the answer is: sort of. (This question also comes with a lot of baggage; there are a lot of people who assume that erasure is somehow "smelly" and therefore bad, and so of course reification must be better. But erasure is a pragmatic compromise, and the alternative is not always better. Let's try and leave the baggage at the door for now.)

To add to the confusion, not everyone means the same thing by "reified generics". To some, reification means "types are checked at runtime"; to others, it may merely mean "types are reflectively available at runtime." Even within the first category, there's a range of what sort of type checking we might mean, since the VM type system may not be exactly the same type system as the language-level type system -- and for good reason. (What if we ask for a reified ArrayList & Serializable>? Do we get runtime subtype checking for wildcards and intersections every time we try to put something in this List? Would we even want that? Are we sure such checks are decidable?)

In Model 3, specialization is clearly a form of reification; when we specialize ArrayList to E=int, the backing store is an int[], and therefore we get all the type checking that entails. We can clearly layer additional support for reflectively exposing the bindings of type parameters in a number of ways.
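
The two runtime behaviours contrasted above can be observed in today's Java, where generics over references are erased but arrays carry their element type at runtime. The following is a small sketch under that framing only; the class and method names are hypothetical, and it illustrates existing Java semantics rather than anything specific to the Model 3 prototype.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of erased generics versus runtime-checked arrays.
final class ErasureVsReification {
    // Erasure: every reference parameterization shares one runtime class.
    static boolean sameRuntimeClass() {
        return new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass();
    }

    // Erasure: nothing checks the element type on insertion (heap pollution).
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean erasedListAcceptsWrongElement() {
        List<String> strings = new ArrayList<>();
        List raw = strings;
        raw.add(42); // an Integer goes into a List<String> unnoticed
        return strings.size() == 1;
    }

    // Arrays are reified: the VM checks every store against the real element type.
    static boolean arrayRejectsWrongElement() {
        Object[] objects = new String[1];
        try {
            objects[0] = 42; // boxed Integer stored into a String[]
            return false;
        } catch (ArrayStoreException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(erasedListAcceptsWrongElement());
        System.out.println(arrayRejectsWrongElement());
    }
}
```

The array case is the closest existing analogue to "types are checked at runtime"; the list case shows what the absence of such a check looks like in practice.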
The Model 3 classfile design explicitly admits both reified and erased generics at the VM level, by allowing a concrete type descriptor *or* the 'erased' token as a type parameter to a ParameterizedType. (Note that 'erased' is not a type, it is merely an allowed type parameterization -- similar to wildcards in the Java language.) There is nothing in the classfile design that encodes the rule "reference parameterizations are erased"; that's the choice of the language compiler. In this way, we can consider any non-erased parameterization to be reified; a ParamType[ArrayList, LString] will throw ArrayStoreException at runtime if you try to cram something other than a String into it.

So, does that mean generics are reified? Sort of... For multiple reasons (including, but not limited to, compatibility), the current plan is for the Java language to continue to use erasure for reference parameterizations of generics. But other languages are free to use full reification where it suits them (and if their Java interop requirements let them.) If someone uses reflection to reflect over a List and ask for its type parameter, it will come back as "erased" (reflection has to support this answer anyway, if only for compatibility with legacy code.)

So the punchline is, at the Java language level, generics are erased *and* reified; generics over references are erased (as they are today) and generics over values are reified. I suspect people will be about as jarred by this as they were by erasure in the first place; I expect we'll get some degree of "You idiots, you ran 99 yards only to fumble the ball on the 1 yard line." But looking past this (which is mostly the above-mentioned baggage), the model seems sound enough; existing reference generics work as they always have, and new value generics work "better" (in that there are additional things you can do with them.)
In fact, it gives us a chance to be more honest about erasure, because "erased" can appear as a first-class member of the programming model. I believe many of the complaints about erasure stem from the fact that it is inevitably a surprise when you first discover it.

On 1/22/2016 11:52 AM, Brian Goetz wrote:
> Please find a document here:
>
> http://cr.openjdk.java.net/~briangoetz/valhalla/eg-attachments/model3-01.html
>
>
> that describes our current thinking for evolving the classfile format
> to clearly and efficiently represent parametric polymorphism. The
> early concepts of this approach were outlined in my talk at JVMLS last
> year; this represents a refinement of those ideas, and a reasonable
> "stake in the ground" description of what seems the most sensible way
> to balance preserving parametric information in the classfile without
> imposing excessive runtime costs for loading specializations.

From brian.goetz at oracle.com Wed Feb 17 17:48:54 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 17 Feb 2016 12:48:54 -0500
Subject: Classes, specializations, and statics
In-Reply-To: <56C4A7D5.4010801@gmail.com>
References: <56C4A7D5.4010801@gmail.com>
Message-ID: <56C4B286.9050200@oracle.com>

Nice catch. Yes, this would cause heap pollution. As would the following Java 5 code:

private Collection collection = new ArrayList();
public Collection c() { return (Collection) collection; }

The trouble is, the unchecked warning is in the library implementation, not the user code. In the case of an immutable collection, we happily suppress the warning knowing that everything is safe. If we were returning a mutable collection, it would be unsafe to suppress the warning, and doing so would be a library bug.

We have several ways out of this hole; one is to restrict invocation, as you suggest. Another is to add some unchecked warnings.
(Another possible path is to treat signatures of erased __SS members, when accessed from outside, as if they contained capture variables.)

On 2/17/2016 12:03 PM, Peter Levart wrote:
> Hi Brian,
>
> On 02/15/2016 07:11 PM, Brian Goetz wrote:
>> Example:
>>
>> class Collection {
>>     private __SS Collection emptyCollection = ...
>>     // ACC_SS field emptyCollection : ParamType[Collection, TypeVar[T]]
>>
>>     private __SS Collection emptyCollection() { return
>> emptyCollection; }
>>     ACC_SS emptyCollection()ParamType[Collection, TypeVar[T]] {
>>         getstatic ParamType[Collection, TypeVar[T]].emptyCollection :
>> ParamType[Collection, TypeVar[T]]
>>         areturn
>>     }
>>
>> When we specialize Collection, the field type, method return
>> type, etc, will all collapse to Collection by the existing
>> mechanisms.
>
> This would work if the emptyCollection was actually empty and
> immutable, but could you do the following:
>
> class Collection {
>     private __SS Collection collection = new ArrayList();
>
>     public __SS Collection collection() { return collection; }
> }
>
>
> And then in code:
>
> Collection<String> cs = Collection.collection();
> Collection<Number> cn = Collection.collection();
>
> cs.add("abc");
> Number n = cn.iterator().next();
>
> If cs and cn hold the same instance, we have introduced heap
> corruption without compilation warnings.
>
> So I suppose in language you could only access the __SS members in the
> following way:
>
> Collection.collection();
> Collection.collection();
> Collection.collection();
> Collection.collection();
> ...
>
> but not:
>
> Collection.collection();
> Collection.collection();
> ...
>
>
> Like .class literals in the prototype.
> > Regards, Peter > > From palo.marton at gmail.com Wed Feb 17 18:25:16 2016 From: palo.marton at gmail.com (Palo Marton) Date: Wed, 17 Feb 2016 19:25:16 +0100 Subject: Classes, specializations, and statics In-Reply-To: <56C4B286.9050200@oracle.com> References: <56C4A7D5.4010801@gmail.com> <56C4B286.9050200@oracle.com> Message-ID: Another option is dissalow using T in static field declaration, but allow use of T.erased, like this: class Collection { private __SS Collection emptyCollection = ? private __SS Collection emptyCollection() { return (Collection)emptyCollection; // warning } } On Wed, Feb 17, 2016 at 6:48 PM, Brian Goetz wrote: > Nice catch. Yes, this would cause heap pollution. As would the following > Java 5 code: > > private Collection collection = new ArrayList(); > public Collection c() { return (Collection) collection; } > > The trouble is, the unchecked warning is in the library implementation, > not the user code. In the case of an immutable collection, we happily > suppress the warning knowing that everything is safe. If we were returning > a mutable collection, it would be unsafe to suppress the warning, and doing > so would be a library bug. > > We have several ways out of this hole; one is to restrict invocation, as > you suggest. Another is to add some unchecked warnings. (Another possible > path is to treat signatures of erased __SS members, when accessed from > outside, as if they contained capture variables.) > > > > > > On 2/17/2016 12:03 PM, Peter Levart wrote: > >> Hi Brian, >> >> On 02/15/2016 07:11 PM, Brian Goetz wrote: >> >>> Example: >>> >>> class Collection { >>> private __SS Collection emptyCollection = ? 
>>> // ACC_SS field emptyCollection : ParamType[Collection, TypeVar[T]] >>> >>> private __SS Collection emptyCollection() { return >>> emptyCollection; } >>> ACC_SS emptyCollection()ParamType[Collection, TypeVar[T]] { >>> getstatic ParamType[Collection, TypeVar[T]].emptyCollection : >>> ParamType[Collection, TypeVar[T]]] >>> areturn >>> } >>> >>> When we specialize Collection, the field type, method return type, >>> etc, will all collapse to Collection by the existing mechanisms. >>> >> >> This would work if the emptyCollection was actually empty and immutable, >> but could you do the following: >> >> class Collection { >> private __SS Collectioncollection = new ArrayList(); >> >> public __SS Collection collection() { return collection; } >> } >> >> >> And then in code: >> >> Collection cs = Collection.collection(); >> Collection cn = Collection.collection(); >> >> cs.add("abc"); >> Number n = cn.iterator().next(); >> >> If cs and cn hold the same instance, we have introduced heap corruption >> without compilation warnings. >> >> So I suppose in language you could only access the _SS members in the >> following way: >> >> Collection.collection(); >> Collection.collection(); >> Collection.collection(); >> Collection.collection(); >> ... >> >> but not: >> >> Collection.collection(); >> Collection.collection(); >> ... >> >> >> Like .class literals in the prototype. >> >> Regards, Peter >> >> >> > From peter.levart at gmail.com Wed Feb 17 19:32:05 2016 From: peter.levart at gmail.com (Peter Levart) Date: Wed, 17 Feb 2016 20:32:05 +0100 Subject: Classes, specializations, and statics In-Reply-To: References: <56C4A7D5.4010801@gmail.com> <56C4B286.9050200@oracle.com> Message-ID: <56C4CAB5.30800@gmail.com> On 02/17/2016 07:25 PM, Palo Marton wrote: > Another option is dissalow using T in static field declaration, but > allow use of T.erased, like this: > > class Collection { > private __SS Collection emptyCollection = ? 
> > private __SS Collection emptyCollection() { > return (Collection)emptyCollection; // warning > } > } > > On Wed, Feb 17, 2016 at 6:48 PM, Brian Goetz > wrote: > > Nice catch. Yes, this would cause heap pollution. As would the > following Java 5 code: > > private Collection collection = new ArrayList(); > public Collection c() { return (Collection) > collection; } > > The trouble is, the unchecked warning is in the library > implementation, not the user code. In the case of an immutable > collection, we happily suppress the warning knowing that > everything is safe. If we were returning a mutable collection, it > would be unsafe to suppress the warning, and doing so would be a > library bug. > > We have several ways out of this hole; one is to restrict > invocation, as you suggest. Another is to add some unchecked > warnings. (Another possible path is to treat signatures of erased > __SS members, when accessed from outside, as if they contained > capture variables.) > There are other problematic situations, like accessing _SS members from instance members: public class Foo { private _SS List list = new ArrayList<>(); public void add(T element) { list.add(element); } public T get(int i) { return list.get(i); } } Foo fooStr = new Foo<>(); Foo fooNum = new Foo<>(); fooStr.add("abc"); Number n = fooNum.get(0); ...or, by analogy, accessing _SS members from generic methods. Regards, Peter > > > > > > On 2/17/2016 12:03 PM, Peter Levart wrote: > > Hi Brian, > > On 02/15/2016 07:11 PM, Brian Goetz wrote: > > Example: > > class Collection { > private __SS Collection emptyCollection = ? 
> // ACC_SS field emptyCollection : ParamType[Collection, > TypeVar[T]] > > private __SS Collection emptyCollection() { return > emptyCollection; } > ACC_SS emptyCollection()ParamType[Collection, TypeVar[T]] { > getstatic ParamType[Collection, > TypeVar[T]].emptyCollection : ParamType[Collection, > TypeVar[T]]] > areturn > } > > When we specialize Collection, the field type, method > return type, etc, will all collapse to Collection by > the existing mechanisms. > > > This would work if the emptyCollection was actually empty and > immutable, but could you do the following: > > class Collection { > private __SS Collectioncollection = new ArrayList(); > > public __SS Collection collection() { return collection; } > } > > > And then in code: > > Collection cs = Collection.collection(); > Collection cn = Collection.collection(); > > cs.add("abc"); > Number n = cn.iterator().next(); > > If cs and cn hold the same instance, we have introduced heap > corruption without compilation warnings. > > So I suppose in language you could only access the _SS members > in the following way: > > Collection.collection(); > Collection.collection(); > Collection.collection(); > Collection.collection(); > ... > > but not: > > Collection.collection(); > Collection.collection(); > ... > > > Like .class literals in the prototype. > > Regards, Peter > > > > From peter.levart at gmail.com Wed Feb 17 20:15:25 2016 From: peter.levart at gmail.com (Peter Levart) Date: Wed, 17 Feb 2016 21:15:25 +0100 Subject: Nestmates In-Reply-To: <56BF4A99.5020404@oracle.com> References: <569FE666.9070103@oracle.com> <201602122201.u1CM1BYo011221@d01av05.pok.ibm.com> <56BF4A99.5020404@oracle.com> Message-ID: <56C4D4DD.5050309@gmail.com> Hi, I still think there is an elegant symmetric configuration possible... On 02/13/2016 04:24 PM, Brian Goetz wrote: > > > On 2/12/2016 5:04 PM, Bjorn B Vardal wrote: >> >> 1. 
The Top<->Child handshake only needs to happen when the Child is >> loaded (which will load Top as a dependency), and access request >> from Child1 to Child2 is reduced to Child1->nestTop == >> Child2->nestTop. This means that we can fail immediately if the >> handshake fails during class loading, i.e. it should not be >> postponed until a private access request fails. Do you agree? >> > > I think we have some options here: > - We could fail fast, rejecting the class. > - We could simply load the class into a new nest containing only > itself; access control (in both directions) that would depend on > nestmate-ness would fail later. > > I think the choice depends on whether we expect to see failures here > solely because of attacks / broken compilers, or whether we can > imagine reasonable situations where such a condition could happen > through separate compilation. > >> 1. >> 2. The proposal assumes that nest mates are always derived from the >> same source file. This can be enforced by the Java compiler, but >> is it verifiable by the JVM? Both the source file attributes and >> class name can be set to whatever we want, which makes it >> undesirable for verification purposes. The question really has >> two sides: >> 1. Do nest mates have to be from the same source file? >> 2. If so, how do we verify it? >> > > In Java, this will likely be true, but I can imagine how other > languages would use this to assemble a nest from multiple separate > files. So I don't think we need to claim they must come from the same > file, nor enforce it-- we only need enforce the integrity of the > NestXxx attributes. > >> 1. >> 1. >> 2. Building on question 2, the solution appears to be that nest >> mates must be loaded by the same class loader. If not, someone >> can load their own class with the same name as a class from some >> nest, using a child class loader, which will pass the handshake, >> effectively giving the custom class complete access to that nest. >> > > Yes. 
Same loader, same package, same module, same protection domain. > These all seem reasonable constraints here. If the constraint is that nestmates can only come from the same module, then the verification need not be based on class names. Suppose javac generates a random nest id for each nest (say 128 bit UUID). Two classes are nest-mates if they belong to the same module *and* share the same nest id. Spoofing nest-id *and* packaging the class into the same module would be equally difficult as spoofing the class-name and packaging the class into the same module, wouldn't it? Module system already shields from introducing arbitrary classes into arbitrary modules, so the trust within a module is already established. This would also make it simple to define additional classes dynamically (classical classes - not anonymous) to join the nest or enable separate compilation. For example, if only (implicitly trusted) code from the same module could define classes in that module dynamically, only (implicitly trusted) code from the same module could add nest mates to the nest(s) from that module. Regards, Peter > >> -- >> Bj?rn V?rdal >> >> ----- Original message ----- >> From: Brian Goetz >> Sent by: "valhalla-spec-experts" >> >> To: valhalla-spec-experts at openjdk.java.net >> Cc: >> Subject: Nestmates >> Date: Wed, Jan 20, 2016 2:57 PM >> This topic is at the complete opposite end of the spectrum from >> topics >> we've been discussing so far. It's mostly an implementation >> story, and >> of particular interest to the compiler and VM implementers here. >> >> >> Background >> ---------- >> >> Since Java 1.1, the rules for accessibility when inner classes are >> involved at the language level are not fully aligned with those >> at the >> VM level. 
In particular, private and protected access from and >> to inner >> classes is stricter in the VM than in the language, meaning that in >> these cases, the static compiler emits an access bridge (access$000) >> which effectively downgrades the accessed member's accessibility to >> package. >> >> Access bridges have some disadvantages. They're ugly, but that's >> not a >> really big deal. They're imprecise; they allow wider-than-necessary >> access to the member. Again, this is not a huge deal on its own. >> But >> the real problem is the complexity of the compiler implementation >> when >> we add generic specialization to the story. >> >> Specialization adds a new category of cross-class accesses that are >> allowed at the language level but not at the VM level, which would >> dramatically increase the need for, and complexity of, accessibility >> bridges. For example: >> >> class Foo { >> private T t; >> >> void m(Foo foo) { >> int i = foo.t; >> } >> } >> >> Now we execute: >> >> Foo fl = ... >> Foo fi = ... >> fl.m(fi) >> >> The spirit of the language rules clearly allow the access from >> Foo >> to Foo.t -- they are in the "same class". But at the VM level, >> Foo and Foo are different classes, so the access from >> Foo to a private member of Foo is disallowed. >> >> One reason that this increases the complexity, and not just the >> number, >> of accessibility bridges is that bridges are (currently) static >> methods; >> if they represent instance methods, we pass the receiver as the first >> argument. For access between inner classes, this is fine, but >> when it >> comes to access between specializations, this breeds new >> complexity -- >> because the method signature of the accessor needs to be specialized >> based on the type parameters of the receiver. This interaction means >> the current static-accessor solution would need its own special, >> ad-hoc >> treatment in specialization, adding to the complexity of >> specialization. 
>> >> More generally, this situation arises in any case where a single >> logical >> unit of encapsulation at the source level is split into multiple >> runtime >> classes (inner classes, specialization classes, synthetic helper >> classes.) We propose to address this problem more generally, by >> providing a mechanism where language compilers can indicate that >> multiple runtime classes live in the same unit of encapsulation. >> We do >> so by (a) adding metadata to classes to indicate which classes >> belong in >> the same encapsulation unit and (b) relaxing some VM >> accessibility rules >> to bring them more in alignment with the language level rules. >> >> >> Overview >> -------- >> >> Our proposed strategy is to reify the relationship between >> classes that >> are members of the same _nest_. Nestmate-ness can then be >> considered in >> access control decisions (JVMS 5.4.4). >> >> Classes that derive from a common source class form a _nest_, and two >> classes in the same nest are called _nestmates_. Nestmate-ness is an >> equivalence relation (reflexive, symmetric, and transitive.) >> Nestmates >> of a class C include C's inner classes, synthetic classes >> generated as >> part of translating C, and specializations thereof. >> >> Since nestmate-ness is an equivalence relation, it forms a partition >> over classes, and we can nominate a canonical member for each >> partition. >> We nominate the "top" (outermost lexically enclosing) class in the >> nest as the canonical member; this is the top-level source class from >> which all other nestmates derive. >> >> This makes it easy to calculate nestmate-ness for two classes C >> and D; C >> and D are nestmates if their "top" class is the same. 
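The nestmate test described above (two classes are nestmates when their "top" class is the same) can be tried out on a current JDK. The sketch below is not part of this proposal: it assumes JDK 11 or later and code compiled for that release, where Class.getNestHost() reports a class's nest host. The names NestDemo, Inner, and sameNest are made up for illustration.

```java
// Sketch only; assumes JDK 11+ (Class.getNestHost). All names are illustrative.
final class NestDemo {
    private int secret = 42;

    static final class Inner {
        // Reads a private field of the enclosing class.
        static int read(NestDemo top) {
            return top.secret;
        }
    }

    // Two classes are treated as nestmates when they report the same nest host,
    // mirroring the "same top class" rule in the text.
    static boolean sameNest(Class<?> c, Class<?> d) {
        return c.getNestHost() == d.getNestHost();
    }

    public static void main(String[] args) {
        System.out.println(sameNest(NestDemo.class, Inner.class));
        System.out.println(sameNest(Inner.class, String.class));
        System.out.println(Inner.read(new NestDemo()));
    }
}
```

Here the "top" of Inner is NestDemo, while String belongs to a different nest, so only the first pair counts as nestmates.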
>> >> Example >> ------- >> >> class Top { >> class A { } >> class B { } >> } >> >> void genericMethod() { } >> } >> >> When we compile this, we get: >> Top.class // Top >> Top$A.class // Inner class Top.A >> Top$A$B.class // Inner class Top.A.B >> Top$Any.class // Wildcard interface for Top >> Top$A$Any.class // Wildcard interface for Top.A >> Top$genericMethod.class // Holder class for generic method >> >> The explicit classes Top, Top.A, and Top.A.B, the synthetic $Any >> classes, and the synthetic holder class for genericMethod, along with >> all of their specializations, form a nest. The top member of >> this nest >> is Top. >> >> Since nestmates all derive from a common top-level class, they are by >> definition in the same package and module. A class can be in >> only one >> nest at once. >> >> >> Runtime Representation >> ---------------------- >> >> We represent nestmate-ness with two new attributes -- one in the top >> member, which describes all the members of the nest, and one in each >> member, which requests access to the nest. >> >> NestTop { >> u2 name_index; >> u4 length; >> u2 child_count; >> u2 childClazz[child_count]; >> } >> >> NestChild { >> u2 name_index; >> u4 length; >> u2 topClazz; >> } >> >> If a class has a NestTop attribute, its nest top is itself. If a >> class >> has a NestChild attribute, its nest top is the class named via >> topClazz. >> If a class is a specialization of another class, its nest top is the >> nest top of the class for which it is a specialization. >> >> When loading a class with a NestChild attribute, the VM can >> verify that >> the requested nest permits it as a member, and reject the class >> if the >> child and top do not agree. >> >> The NestTop attribute can enumerate all inner classes and synthetic >> classes, but cannot enumerate all specializations thereof. 
When >> creating >> a specialization of a class, the VM records the specialization as >> being >> a member of whatever nest the template class was a member of. >> >> >> Semantics >> --------- >> >> The accessibility rules here are strictly additions; nestmate-ness >> creates additional accessibility over and above the existing rules. >> >> Informally: >> - A class can access the private members of its nestmates; >> - A class can access protected members inherited by its nestmates. >> >> This is slightly broader than the language semantics (but still less >> broad than what we do today with access bridges.) The static >> compiler >> can continue to enforce the same rules, and the VM will allow these >> accesses without bridges. (We could make the proposal match the >> language semantics more closely at the cost of additional complexity, >> but its not clear this is worthwhile.) >> >> For private access, we can add the following to 5.4.4: >> - A class C may access a private member D.R if C and D are >> nestmates. >> >> The rules for protected members are more complicated. >> 5.4.3.{2,3} first >> resolve the true owner of the member, and feed that to 5.4.4; this >> process throws away some needed information. We would augment >> 5.4.3.{2,3} as follows: >> - When performing member resolution from class C on member D.R, we >> remember both D (the target class) and E (the resolved class) and >> make >> them both available to 5.4.4. >> >> We then adjust 5.4.4 accordingly, by adding: >> - If R is protected, and C and D are nestmates, and E is >> accessible to >> D, then access is allowed. >> >> >> Examples >> -------- >> >> For private fields, we generate access bridges whenever an inner >> class >> accesses a private member (field or method) of the enclosing >> class, or >> of another inner class in the same nest. 
>> >> In the classes below, the accesses shown are all permitted by the >> language spec (child to parent, sibling to sibling, sibling to >> child of >> sibling, etc), and the ones requiring access bridges are noted. >> >> class Foo { >> public static Foo aFoo; >> public static Inner1 aInner1; >> public static Inner1.Inner2 aInner2; >> public static Inner3 aInner3; >> >> private int foo; >> >> class Inner1 { >> private int inner1; >> >> class Inner2 { >> private int inner2; >> } >> >> void m() { >> int i = aFoo.foo // bridge >> + aInner1.inner1 >> + aInner2.inner2 // bridge >> + aInner3.inner3; // bridge >> } >> } >> >> class Inner3 { >> private int inner3; >> >> void m() { >> int i = aFoo.foo // bridge >> + aInner1.inner1 // bridge >> + aInner2.inner2 // bridge >> + aInner3.inner3; >> } >> } >> } >> >> For protected members, the situation is more subtle. >> >> /* package p1 */ >> public class Sup { >> protected int pro; >> } >> >> /* package p2 */ >> public class Sub extends p1.Sup { >> void test() { >> ... pro ... //no bridge (invokespecial) >> } >> >> class Inner { >> void test() { >> ... sub.pro ... // bridge generated in Sub >> } >> } >> } >> >> Here, the VM rules allow Sub to access protected members of Sup, >> but for >> accesses from Sub.Inner or Sibling to Sub.pro to succeed, Sub >> provides >> an access bridge (which effectively makes Sub.pro package-visible >> throughout package p2.) >> >> The rules outlined eliminate access bridges in all of these cases. >> >> >> Interaction with defineAnonymousClass >> ------------------------------------- >> >> Nestmate-ness also potentially connects nicely with >> Unsafe.defineAnonymousClass. The intuitive notion of dAC is, >> when you >> load anonymous class C with a host class of H, that C is being >> "injected >> into" H -- access control decisions for C are made using H's >> credentials. 
With a formal notion of nestmateness, we can bring
>> additional predictability to dAC by saying that C is injected
>> into H's
>> nest.
>>
>>
>

From john.r.rose at oracle.com Thu Feb 18 18:10:23 2016
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 18 Feb 2016 10:10:23 -0800
Subject: Nestmates
In-Reply-To: <56C4D4DD.5050309@gmail.com>
References: <569FE666.9070103@oracle.com> <201602122201.u1CM1BYo011221@d01av05.pok.ibm.com> <56BF4A99.5020404@oracle.com> <56C4D4DD.5050309@gmail.com>
Message-ID: <9B232D44-4565-4C35-8AF6-4405403A35BF@oracle.com>

On Feb 17, 2016, at 12:15 PM, Peter Levart wrote:
>
> Suppose javac generates a random nest id for each nest (say 128 bit UUID). Two classes are nest-mates if they belong to the same module *and* share the same nest id.

There are two parts to this proposal:

1. New naming convention for nests, based on UUIDs. This is a new concept in the JVM, and would require new infrastructure to manage (generate, transcode, verify, reflect, debug). That means new bugs and new attack surfaces. In the absence of a decisive benefit, it's better to reuse existing name spaces, and (in particular) the JVM's type name dictionary.

2. Unidirectional links. The UUID, being a pure identity with no content, does not contain a list of its nestlings. The nestlings point to the nest (via the UUID). Any class can inject itself into a nest (in the same package) simply by mentioning the appropriate UUID. Unidirectional linkage means that there is no way to enumerate a nest. This complicates some optimizations (based on sealed types). Security and seal-ability of nests are reduced to that of packages. PRIVATE becomes just an alias for default-scope access control.

Sorry, but neither part of this is appealing to me, compared with the current proposal.

--
John

From brian.goetz at oracle.com Fri Feb 19 00:54:53 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 18 Feb 2016 19:54:53 -0500
Subject: Classes, specializations, and statics
In-Reply-To: <201602182155.u1ILtbE6013161@d01av03.pok.ibm.com>
References: <201602182155.u1ILtbE6013161@d01av03.pok.ibm.com>
Message-ID: <56C667DD.6020601@oracle.com>

> Based on the example above, I think we need to be more explicit about
> how the <clinit> method is handled.
> There are really two different sets of statics that need to be handled
> by the class initialization:
> A) common statics (shared across all instantiations)
> B) specialized statics
> In addition to the statics, there is also common (and maybe
> specialized?) code that is run as part of <clinit>.

There is a reasonable model to collapse these back into one concept; treat "common statics" as specialized statics on the all-erased parameterization, with a clause that restricts them to that parameterization. Not clear whether we actually want to represent it that way or not, but it's a useful mental model that doesn't require the creation of a third thing. (Since Class[Foo] and ParamType[Foo,erased*] describe the same class, this is also fully binary compatible with existing classes.)

Which means we can do a similar thing with <clinit>, if we want. I'll wave my hands because we've not yet talked much about conditional members, but it basically looks like this:

    () { /* common static init code */
         /* specializable init code */ }

    () { /* specializable init code */ }

Or not.

> Where will the initialization code for both kinds of statics be? The
> existing <clinit> method?

We have two choices:
 - have a new block that gets run once per specialization, and keep <clinit>
 - merge the two as above, exploiting planned support for conditional members

Either way, as you say, we have to ensure that the common init runs exactly once.

> When using *static, are we only discussing {get,put}?
Or is this also
> proposing invokestatic changes to allow specialized static methods?

Methods too.

> All of the technical details aside, is this something we really want
> to expose to the users? They're going to have a hard time
> understanding why Foo (or Foo while Foo & Foo share the erased version.

I think this is mostly a matter of coming up with the right syntax, which makes it clear that statics can be per-class or per-specialization. There is a whole pile of related specialization-related syntax issues; I'll try to get them all in one place.

From simon at ochsenreither.de Sat Feb 20 19:01:27 2016
From: simon at ochsenreither.de (Simon Ochsenreither)
Date: Sat, 20 Feb 2016 20:01:27 +0100 (CET)
Subject: Classes, specializations, and statics
In-Reply-To:
References:
Message-ID: <1747570325.78929.1455994887523.JavaMail.open-xchange@srv005.service.ps-server.net>

I'm very concerned about this proposal due to the mental overhead it imposes on unsuspecting users. While I agree that certain things would be impossible or inconvenient without this proposal, I think the disadvantages severely outweigh any benefits.

The way erasure was done allowed developers to write code without considering or needing to know the inner workings of how generics were implemented, except for the occasional "why can't I have overloaded methods foo(List) and foo(List)?" Developers had instance-level members and class-level members.

This proposal forces every developer to open up the engine cover and understand all the implications of introducing another level between instance-level members and class-level members (and all the interactions with javac's ongoing erasure of reference types, which means sometimes these specialization-level members collapse into class-level members (reference types) and sometimes they don't (value types)).

C# introduced something like this in C# 2 and I think from an ergonomic, end-user perspective it has been a failure.
Considering that Java needs to support both "static" modes for all eternity, and that, unlike C#, Java still supports accessing static members from instances, this will be a huge mess.

From a Scala POV, I have no idea how this could be integrated properly, given that the language has vastly simplified rules and a clear distinction of where members need to live. It would probably be one of those things which would end up with no native support in the language, turning it from "new feature" into "legacy stuff only supported for Java interop" right upon release.

Cheers,

Simon

From brian.goetz at oracle.com Tue Feb 23 00:23:16 2016
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 22 Feb 2016 19:23:16 -0500
Subject: Classes, specializations, and statics
In-Reply-To: <201602222111.u1MLBcpn021902@d03av01.boulder.ibm.com>
References: <201602222111.u1MLBcpn021902@d03av01.boulder.ibm.com>
Message-ID: <56CBA674.9030301@oracle.com>

It's possible that there could be multiple "ssinit" methods, each restricted to specific parameterizations (just like any other restricted method), but in general, the "ssinit" method can be specialized just like any other method. So what I envision (in the absence of initialization of conditional members) is possibly two such methods: one that is specializable (corresponding to _SS members) and one that is not, restricted to the erased parameterization (corresponding to traditional statics).

On 2/22/2016 4:11 PM, Bjorn B Vardal wrote:
> I think we're on the same page regarding specialized .
> - The JVM will be handed multiple partial methods, and the
> specializer will take care of selecting the appropriate for
> each specialization.
> - The erased will contain the non-specialized static
> initialization code, which ensures that it only runs once.
> - The erased will always run before the first specialization
> .
> - The Java syntax is still up for discussion.
> > I think this is mostly a matter of coming up with the right syntax, > which makes it clear that statics can be per-class or > per-specialization. There are a whole pile of related > specialization-related syntax issues, I'll try to get them all in one > place. > I don't think the problem will be to make it clear that statics can be > per-class or per-specialization, but rather why some parameterizations > (which to the user are synonymous with specializations) don't appear > to have specialized statics. Do we want to put erasure in the face of > users like this? It seems better to let the users deal purely with > parameterizations, and we let specialization and erasure be > implementation details. > -- > Bj?rn V?rdal > > ----- Original message ----- > From: Brian Goetz > To: Bjorn B Vardal/Ottawa/IBM at IBMCA, > valhalla-spec-experts at openjdk.java.net > Cc: > Subject: Re: Classes, specializations, and statics > Date: Thu, Feb 18, 2016 7:55 PM > > >> Based on the example above, I think we need to be more explicit >> about how the method is handled. >> There are really two different sets of statics that need to be >> handled by the class initialization: >> A) common statics (shared across all instantiations) >> B) specialized statics >> In addition to the statics, there is also common (and maybe >> specialized?) code that is run as part of . > > There is a reasonable model to collapse these back into one > concept; treat "common statics" as specialized statics on the > all-erased parameterization, with a clause that restricts > them to that parameterization. Not clear whether we actually want > to represent it that way or not, but its a useful mental model > that doesn't require the creation of a third thing. (Since > Class[Foo] and ParamType[Foo,erased*] describe the same class, > this is also fully binary compatible with existing classes.) > > Which means we can do a similar thing with , if we want. 
> I'll wave my hands because we've not yet talked much about > conditional members, but it basically looks like this: > > > () { /* common static init code */ > /* specializable init code */ } > > () { /* specializable init code */ } > > Or not. >> Where will the initialization code for both kinds of statics be? >> The existing method? > > We have two choices: > - have a new block that gets run once per > specialization, and keep > - merge the two as above, exploiting planned support for > conditional members > > Either way, as you say, we have to ensure that the common init > runs exactly once. >> When using *static, are we only discussing {get,put}? Or is this >> also proposing invokestatic changes to allow specialized static >> methods? > > Methods too. >> All of the technical details aside, is this something we really >> want to expose to the users? They're going to have a hard time >> understanding why Foo (or Foo> statics while Foo & Foo share the erased version. > > I think this is mostly a matter of coming up with the right > syntax, which makes it clear that statics can be per-class or > per-specialization. There are a whole pile of related > specialization-related syntax issues, I'll try to get them all in > one place. > > From forax at univ-mlv.fr Tue Feb 23 19:56:55 2016 From: forax at univ-mlv.fr (Remi Forax) Date: Tue, 23 Feb 2016 20:56:55 +0100 (CET) Subject: Classes, specializations, and statics In-Reply-To: <56CBA674.9030301@oracle.com> References: <201602222111.u1MLBcpn021902@d03av01.boulder.ibm.com> <56CBA674.9030301@oracle.com> Message-ID: <1395529931.1029351.1456257415180.JavaMail.zimbra@u-pem.fr> I wonder if it's not better to have a class like ThreadLocal or ClassValue that represents a constant that can be different depending on the specialization. 
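As a rough illustration of the kind of holder suggested here, the sketch below uses the existing java.lang.ClassValue, which lazily computes and caches one value per Class. Using a Class token as a stand-in for a specialization is an assumption, since no specialization-keyed API exists; PerTypeConstantDemo, DEFAULT_LABEL, and INIT_COUNT are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: ClassValue is keyed by Class<?>, used here as a stand-in for
// a per-specialization key (a hypothetical notion; no such API exists).
final class PerTypeConstantDemo {
    // Counts how often the initializer runs.
    static final AtomicInteger INIT_COUNT = new AtomicInteger();

    // Lazily computes one value per Class key, loosely analogous to a
    // static whose value differs per specialization.
    static final ClassValue<String> DEFAULT_LABEL = new ClassValue<String>() {
        @Override
        protected String computeValue(Class<?> type) {
            INIT_COUNT.incrementAndGet();
            return "empty<" + type.getSimpleName() + ">";
        }
    };

    public static void main(String[] args) {
        System.out.println(DEFAULT_LABEL.get(int.class));
        System.out.println(DEFAULT_LABEL.get(String.class));
        System.out.println(DEFAULT_LABEL.get(int.class)); // served from the cache
        System.out.println(INIT_COUNT.get());
    }
}
```

In single-threaded use, repeated lookups for the same key return the cached value, so the initializer behaves much like a once-per-key static initializer.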
Rémi

----- Original Message -----
> From: "Brian Goetz"
> To: "Bjorn B Vardal"
> Cc: valhalla-spec-experts at openjdk.java.net
> Sent: Tuesday, 23 February 2016 01:23:16
> Subject: Re: Classes, specializations, and statics
>
> It's possible that there could be multiple "ssinit" methods, each
> restricted to specific parameterizations (just like any other restricted
> method), but in general, the "ssinit" method can be specialized just
> like any other method. So what I envision (in the absence of
> initialization of conditional members) is possibly two such methods; one
> that is specializable (corresponding to _SS members) and one that is
> not, restricted to the erased parameterization (corresponding to
> traditional statics.)
>
>
>
> On 2/22/2016 4:11 PM, Bjorn B Vardal wrote:
> > I think we're on the same page regarding specialized .
> > - The JVM will be handed multiple partial methods, and the
> > specializer will take care of selecting the appropriate for
> > each specialization.
> > - The erased will contain the non-specialized static
> > initialization code, which ensures that it only runs once.
> > - The erased will always run before the first specialization
> > .
> > - The Java syntax is still up for discussion.
> > > I think this is mostly a matter of coming up with the right syntax,
> > which makes it clear that statics can be per-class or
> > per-specialization. There are a whole pile of related
> > specialization-related syntax issues, I'll try to get them all in one
> > place.
> > I don't think the problem will be to make it clear that statics can be
> > per-class or per-specialization, but rather why some parameterizations
> > (which to the user are synonymous with specializations) don't appear
> > to have specialized statics. Do we want to put erasure in the face of
> > users like this? It seems better to let the users deal purely with
> > parameterizations, and we let specialization and erasure be
> > implementation details.
> > --
> > Bjørn Vårdal
> >
> > ----- Original message -----
> > From: Brian Goetz
> > To: Bjorn B Vardal/Ottawa/IBM at IBMCA,
> > valhalla-spec-experts at openjdk.java.net
> > Cc:
> > Subject: Re: Classes, specializations, and statics
> > Date: Thu, Feb 18, 2016 7:55 PM
> >
> >
> >> Based on the example above, I think we need to be more explicit
> >> about how the method is handled.
> >> There are really two different sets of statics that need to be
> >> handled by the class initialization:
> >> A) common statics (shared across all instantiations)
> >> B) specialized statics
> >> In addition to the statics, there is also common (and maybe
> >> specialized?) code that is run as part of .
> >
> > There is a reasonable model to collapse these back into one
> > concept; treat "common statics" as specialized statics on the
> > all-erased parameterization, with a clause that restricts
> > them to that parameterization. Not clear whether we actually want
> > to represent it that way or not, but it's a useful mental model
> > that doesn't require the creation of a third thing. (Since
> > Class[Foo] and ParamType[Foo,erased*] describe the same class,
> > this is also fully binary compatible with existing classes.)
> >
> > Which means we can do a similar thing with , if we want.
> > I'll wave my hands because we've not yet talked much about
> > conditional members, but it basically looks like this:
> >
> >
> > () { /* common static init code */
> >      /* specializable init code */ }
> >
> > () { /* specializable init code */ }
> >
> > Or not.
> >> Where will the initialization code for both kinds of statics be?
> >> The existing method?
> >
> > We have two choices:
> > - have a new block that gets run once per
> >   specialization, and keep
> > - merge the two as above, exploiting planned support for
> >   conditional members
> >
> > Either way, as you say, we have to ensure that the common init
> > runs exactly once.
> >> When using *static, are we only discussing {get,put}? Or is this
> >> also proposing invokestatic changes to allow specialized static
> >> methods?
> >
> > Methods too.
> >> All of the technical details aside, is this something we really
> >> want to expose to the users? They're going to have a hard time
> >> understanding why Foo (or Foo
> >> statics while Foo & Foo share the erased version.
> >
> > I think this is mostly a matter of coming up with the right
> > syntax, which makes it clear that statics can be per-class or
> > per-specialization. There are a whole pile of related
> > specialization-related syntax issues, I'll try to get them all in
> > one place.
> >
> >
> >
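
The ordering constraint discussed in the quoted thread (common init runs
exactly once and before any specialized init; specialized init runs once per
specialization) can be modelled roughly as follows. This is only a sketch
under assumptions: parameterizations are plain strings, "init code" is a log
entry, and StaticInitModel and ensureInitialized are hypothetical names, not
anything proposed for the classfile format or the VM.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the ordering discussed in the thread; it does not
// reflect any actual VM or classfile mechanism.
final class StaticInitModel {
    // Records the order in which the modelled initializers run.
    static final List<String> LOG = new ArrayList<>();
    private static boolean commonInitDone = false;
    private static final Set<String> initialized = new HashSet<>();

    // Called on first use of a parameterization, e.g. "Foo<int>".
    static synchronized void ensureInitialized(String parameterization) {
        // Common (per-class) init: at most once, before any specialized init.
        if (!commonInitDone) {
            commonInitDone = true;
            LOG.add("common init");
        }
        // Specialized init: once per distinct parameterization.
        if (initialized.add(parameterization)) {
            LOG.add("specialized init: " + parameterization);
        }
    }

    public static void main(String[] args) {
        ensureInitialized("Foo<int>");
        ensureInitialized("Foo<long>");
        ensureInitialized("Foo<int>"); // already initialized, logs nothing
        LOG.forEach(System.out::println);
    }
}
```

Whether the two steps live in one merged method or in two separate ones is
exactly the open choice in the thread; the model is agnostic on that point.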