Nestmates
John Rose
john.r.rose at oracle.com
Tue Feb 2 20:35:40 UTC 2016
On Jan 31, 2016, at 11:52 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>
>> First, I think it might make sense to share the use cases we have for nestmates in Kotlin (those that will work out):
>> - same as Java: nested/inner classes
>> - multi-file classes (this is how we emulate free functions that are logically direct members of a package in Kotlin)
>
> Nestmates should handle these cases. Essentially, we are redefining the logical “class” boundary to span multiple physical classes.
The term "nest" strongly suggests inner classes, which is the original, 20-year-old motivating use case for extending private barriers beyond single class-files.
But the current proposal is more flexible, as Andrey has noticed. Specifically, it is decoupled from the syntax of Java inner/nested classes. This makes it easier to implement correctly, which is appropriate for a run-time access control mechanism. It is appropriate for the JVM to define simple trust areas (package, nest, module, class-loader) without coupling them too closely to language semantics. (ACC_PROTECTED is an exception. It would seem to be difficult to use for anything except Java and Java-like languages.)
If it helps to think about it, we can shift the metaphor from "nestmate" to "friend". In that case, the "top class" is (let us say) really the "privacy club" that a bunch of "privacy friends" belong to. (Or "relative/family", etc., etc.) We are talking about the JVM connecting the control of privacy with a well-defined partition or equivalence class, as represented (securely) by the proposed bidirectional attribute links. (The set of nest-tops is a section[1].) The advantage of "top class" is that it makes clear how we intend for the JLS to map to the JVM. The slight disadvantage is that it might make the unwary fixate on that one use case. But, hey, a little Java-centricity doesn't hurt here.
[1]: https://en.wikipedia.org/wiki/Section_(category_theory)
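As a rough illustration of why the links need to be bidirectional, here is a minimal Java sketch that models the two attributes as plain lookup tables. Everything in it is an assumption made for illustration: the class NestLinks, its methods, and the intruder class "Evil" are hypothetical, and treating an unconfirmed NestChild claim as a singleton nest is just one possible policy (a real JVM might reject the class instead).

```java
import java.util.Map;
import java.util.Set;

// Toy model of the proposed attributes; not a class-file parser.
final class NestLinks {
    // Class name -> the nest-top it names in its NestChild attribute.
    private final Map<String, String> nestChild;
    // Nest-top name -> the members it lists in its NestTop attribute.
    private final Map<String, Set<String>> nestTop;

    NestLinks(Map<String, String> nestChild, Map<String, Set<String>> nestTop) {
        this.nestChild = nestChild;
        this.nestTop = nestTop;
    }

    // The C, D, E nest sketched later in this message, plus a hypothetical
    // intruder "Evil" that names C$NestTop$ without being listed by it.
    static NestLinks example() {
        return new NestLinks(
            Map.of("C", "C$NestTop$", "D", "C$NestTop$",
                   "E", "C$NestTop$", "Evil", "C$NestTop$"),
            Map.of("C$NestTop$", Set.of("C", "D", "E")));
    }

    // Representative of the equivalence class that cls belongs to.
    String topOf(String cls) {
        if (nestTop.containsKey(cls)) return cls;   // cls is itself a nest-top
        String claimed = nestChild.get(cls);
        if (claimed == null) return cls;            // no attribute: singleton nest
        Set<String> listed = nestTop.get(claimed);
        // Assumption: a claim the top does not confirm is simply ignored.
        boolean confirmed = listed != null && listed.contains(cls);
        return confirmed ? claimed : cls;
    }

    // Two classes are nestmates when they share the same validated top.
    boolean areNestmates(String a, String b) {
        return topOf(a).equals(topOf(b));
    }
}
```

With NestLinks.example(), C and E come out as nestmates, while Evil does not become a nestmate of C just by naming the same top, because the top does not list it back.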
For a multi-file trust unit you might separately generate the file which defines the "privacy club" (or represents the equivalence class) when you generate the multiple files that comprise the privacy group. You might even re-generate the "privacy club" file if you incrementally compile new files for the same trust unit. (This would work best if the control file contains nothing except the NestTop attribute and its associated constants.) The JVM doesn't care about compilation policy as long as the details are settled before the first file of the trust unit is loaded.
And of course this mechanism can be useful even for languages which don't support class nesting (or use cases which don't need nesting). I'm thinking of Scala case-classes here, as well as Kotlin sealed classes (if you subtract the nesting constraint), and JVM enforcement of sealed interfaces.
>
>> Now, there's one issue that does not seem entirely clear to me: does this proposal imply making nested classes truly private? It does not mention allowing ACC_PRIVATE on classes, so I'm not sure whether this was intended.
>> In any case it would make sense, I think. I haven't given it much thought yet, but we could probably legalize the ACC_PRIVATE flag on classes that have a NestChild entry, and check that they are only accessed from their nestmates, right?
>
> We haven't thought about this too deeply, but it does seem within the spirit of the proposal.
My take: If we define a rigorous, reliable nest-mate relation in the JVM, we can also start to mark classes ACC_PRIVATE and enforce the narrower access. I think this is desirable, as a way to harden clusters of inner classes within large packages. Similar (though much less important) point for ACC_PROTECTED.
Synthetic classes (generated as "helpers" for a particular compilation unit) should also be marked private. For example, if the NestTop file is separated (for whatever reason) from the syntactically top-level class of a class nest, it can be marked both synthetic and private.
class C { class D { } private class E { } }
==>
classfile C { NestChild(C$NestTop$) }
classfile D { NestChild(C$NestTop$) }
classfile E { NestChild(C$NestTop$); access_flags(ACC_PRIVATE) }
classfile C$NestTop$ { NestTop(C, D, E); access_flags(ACC_PRIVATE+ACC_SYNTHETIC) }
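To make the narrower access concrete, here is a minimal Java sketch of the class-level check this would imply. It is only a sketch under stated assumptions: ClassAccessSketch, Info, and canAccessClass are hypothetical names, the nest-top is assumed to have been validated already, and modules and class loaders are ignored. The flag values are the usual class-file ones.

```java
// Sketch of a class-level access check with ACC_PRIVATE honored on classes.
final class ClassAccessSketch {
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_PRIVATE = 0x0002;

    // Hypothetical summary of what the check needs to know about a class.
    static final class Info {
        final String pkg;      // package name
        final String nestTop;  // validated nest-top (assumed already checked)
        final int flags;       // access flags

        Info(String pkg, String nestTop, int flags) {
            this.pkg = pkg;
            this.nestTop = nestTop;
            this.flags = flags;
        }
    }

    static boolean canAccessClass(Info from, Info target) {
        if ((target.flags & ACC_PRIVATE) != 0) {
            // Private class: nestmates only.
            return from.nestTop.equals(target.nestTop);
        }
        if ((target.flags & ACC_PUBLIC) != 0) {
            return true;
        }
        // Default access: same package (module rules ignored here).
        return from.pkg.equals(target.pkg);
    }
}
```

In the C/D/E example above, D could reach the private class E because both name C$NestTop$, while an unrelated class in the same package could not.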
>
>> (This would only work with Java's every-which-way treatment of access between the nested classes: in Kotlin, for example, nested classes can not access private members of their enclosing class, but such extra restrictions don't seem to be a security concern, because all these classes are in the same compilation unit anyway.)
>
> Right. For example, the rules in Java for protected are somewhat more restricted than for private, and the proposed VM rules are more permissive, but the language compiler can always enforce stricter rules.
The every-which-way aspect is looser than many language-specific rules, but it is not too loose for a useful JVM-level view of enforceable run-time privacy. It is loose enough to serve as a carrier for a range of language-specific conventions, and tight enough to enforce (at run-time, not compile-time) a useful set of security invariants.
(Personally, from a language design standpoint, I think enforcing privacy boundaries *within* a single compilation unit makes for useless effort and silly speed-bumps for coders. But that's an esthetic thing Java and Kotlin can agree to differ on.)
Should nestmate relations be allowed across package boundaries? And what about modules and class loaders? (Compare C++ friend relations across namespaces.) I think the answer should be "no" at least at first. (We could relax to some form of "yes" later.) The reason is the JVM's model for access control is intended to be easy to understand and to implement correctly. (Put down that hand, ACC_PROTECTED!) Part of the simplicity is that a package (and a module) is a unit of encapsulation that can only be broken by sharing secrets, which (when it happens) is pretty obvious, as it requires explicit code to be written. Allowing a semi-invisible nestmate attribute to create a "wormhole" of access between packages or modules would be a cute tool for creating shared-secret design patterns, but a more explicit one would be just as workable, and easier to observe in source code when it happens. (Language compilers can sugar it up however they want.) For modules, the mechanism of qualified exports gives a large-scale tool to implement this pattern. Capabilities (e.g., lambdas or Lookup or MHs) give more precise tools. There is no need for nestmates to provide yet another tool, at the cost of disturbing the simple nesting relation of nest/private < package/default < module/public.
In a nutshell, the JVM class loader should enforce the structural constraint that NestTop and NestChild attributes should only refer to classes with the same package prefix as the containing class.
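A minimal Java sketch of that structural check follows, assuming class names in internal form (for example "p/C$NestTop$"). NestPackageCheck and its methods are hypothetical names, and IllegalArgumentException is only a placeholder for whatever error the JVM would actually raise. The sketch compares names only; a real run-time package also depends on the defining class loader.

```java
import java.util.List;

// Sketch of the same-package constraint on NestTop/NestChild entries.
final class NestPackageCheck {
    // Package prefix of an internal name; "" for the unnamed package.
    static String packageOf(String internalName) {
        int slash = internalName.lastIndexOf('/');
        return slash < 0 ? "" : internalName.substring(0, slash);
    }

    static boolean samePackage(String containing, String referenced) {
        return packageOf(containing).equals(packageOf(referenced));
    }

    // Rejects an attribute that names a class outside the containing
    // class's package. The exception type is a placeholder.
    static void checkNestAttribute(String containing, List<String> referenced) {
        for (String name : referenced) {
            if (!samePackage(containing, name)) {
                throw new IllegalArgumentException(
                    containing + " refers to " + name + " in another package");
            }
        }
    }
}
```

Note that a subpackage counts as a different package here: "p/C" and "p/q/D" do not share a package prefix in this sense.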
Another fine point: Should a single class file be allowed to have both a NestTop and a NestChild attribute? Answer: No. It is no burden on static compilers to "roll up" the whole list from a complicated hierarchy into one place. (What if there are 10,000 types in a nest? In that case we need to get on the ball and expand constant pool sizes. *But* the fancy expanded constant pool would *only* exist in the nest-top file, so a more ad hoc solution could be created just for the NestTop attribute, if we get there. That would be preferable to allowing a degree of freedom (nested nests) that would in practice almost never be used.)
— John
P.S. Brian, thank you for creating this proposal and circulating it. I've agitated for normalizing nestmate relations in just about every Java release, and am glad to see we have reached the tipping point. And, a separate top-class attribute is far better than where I thought we'd end up: walking the InnerClasses attribute.