Nestmates
Brian Goetz
brian.goetz at oracle.com
Sat Feb 13 15:24:09 UTC 2016
On 2/12/2016 5:04 PM, Bjorn B Vardal wrote:
>
> 1. The Top<->Child handshake only needs to happen when the Child is
> loaded (which will load Top as a dependency), and access request
> from Child1 to Child2 is reduced to Child1->nestTop ==
> Child2->nestTop. This means that we can fail immediately if the
> handshake fails during class loading, i.e. it should not be
> postponed until a private access request fails. Do you agree?
>
I think we have some options here:
- We could fail fast, rejecting the class.
- We could simply load the class into a new nest containing only
itself; access control (in both directions) that would depend on
nestmate-ness would fail later.
I think the choice depends on whether we expect to see failures here
solely because of attacks / broken compilers, or whether we can imagine
reasonable situations where such a condition could happen through
separate compilation.
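To make the two options concrete, here is a minimal sketch in plain Java. It is a toy model, not VM code: NestHandshake, Policy, and resolveNestTop are hypothetical names, and the map stands in for whatever the NestTop attributes of the candidate tops declare.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the Top<->Child handshake; all names are hypothetical.
class NestHandshake {
    enum Policy { FAIL_FAST, SINGLETON_NEST }

    // declaredChildren maps a nest top's name to the children its NestTop attribute lists.
    // Returns the name of the nest top the child ends up with.
    static String resolveNestTop(String child, String claimedTop,
                                 Map<String, Set<String>> declaredChildren,
                                 Policy policy) {
        Set<String> children = declaredChildren.get(claimedTop);
        boolean agreed = children != null && children.contains(child);
        if (agreed) {
            return claimedTop;
        }
        if (policy == Policy.FAIL_FAST) {
            // Option 1: reject the class at load time.
            throw new IllegalStateException(child + " is not listed by " + claimedTop);
        }
        // Option 2: the class becomes the top of a nest containing only itself.
        return child;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> tops = Collections.singletonMap(
                "Top", new HashSet<>(Arrays.asList("Top$A", "Top$A$B")));
        System.out.println(resolveNestTop("Top$A", "Top", tops, Policy.FAIL_FAST));
        System.out.println(resolveNestTop("Evil", "Top", tops, Policy.SINGLETON_NEST));
    }
}
```

Under FAIL_FAST a mismatched child is rejected when it is loaded; under SINGLETON_NEST it loads but shares a nest with nobody, so any access that depends on nestmate-ness fails later.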
> 2. The proposal assumes that nest mates are always derived from the
> same source file. This can be enforced by the Java compiler, but
> is it verifiable by the JVM? Both the source file attributes and
> class name can be set to whatever we want, which makes it
> undesirable for verification purposes. The question really has two
> sides:
> 1. Do nest mates have to be from the same source file?
> 2. If so, how do we verify it?
>
In Java, this will likely be true, but I can imagine how other languages
would use this to assemble a nest from multiple separate files. So I
don't think we need to claim they must come from the same file, nor
enforce it; we only need to enforce the integrity of the NestXxx attributes.
> 3. Building on question 2, the solution appears to be that nest mates
> must be loaded by the same class loader. If not, someone can load
> their own class with the same name as a class from some nest,
> using a child class loader, which will pass the handshake,
> effectively giving the custom class complete access to that nest.
>
Yes. Same loader, same package, same module, same protection domain.
These all seem reasonable constraints here.
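One way to picture these constraints is as a check over runtime properties that reflection already exposes. The sketch below assumes Java 9 or later (for getPackageName and getModule); NestConstraints and sameRuntimeContext are hypothetical names, and a real VM would presumably perform this on its internal class metadata rather than through reflection.

```java
import java.util.Objects;

// Sketch only: compares the four properties listed above for two loaded classes.
class NestConstraints {
    static boolean sameRuntimeContext(Class<?> a, Class<?> b) {
        return a.getClassLoader() == b.getClassLoader()                      // same loader
            && a.getPackageName().equals(b.getPackageName())                 // same package (Java 9+)
            && a.getModule() == b.getModule()                                // same module (Java 9+)
            && Objects.equals(a.getProtectionDomain(), b.getProtectionDomain());
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeContext(java.util.Map.class, java.util.Map.Entry.class));
        System.out.println(sameRuntimeContext(String.class, java.util.List.class));
    }
}
```

A class defined under the same name by a child loader would fail the first comparison, which is the attack described in the question.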
> --
> Bjørn Vårdal
>
> ----- Original message -----
> From: Brian Goetz <brian.goetz at oracle.com>
> Sent by: "valhalla-spec-experts"
> <valhalla-spec-experts-bounces at openjdk.java.net>
> To: valhalla-spec-experts at openjdk.java.net
> Cc:
> Subject: Nestmates
> Date: Wed, Jan 20, 2016 2:57 PM
> This topic is at the complete opposite end of the spectrum from topics
> we've been discussing so far. It's mostly an implementation story, and
> of particular interest to the compiler and VM implementers here.
>
>
> Background
> ----------
>
> Since Java 1.1, the rules for accessibility when inner classes are
> involved at the language level are not fully aligned with those at the
> VM level. In particular, private and protected access from and to inner
> classes is stricter in the VM than in the language, meaning that in
> these cases, the static compiler emits an access bridge (access$000)
> which effectively downgrades the accessed member's accessibility to
> package.
>
> Access bridges have some disadvantages. They're ugly, but that's not a
> really big deal. They're imprecise; they allow wider-than-necessary
> access to the member. Again, this is not a huge deal on its own. But
> the real problem is the complexity of the compiler implementation when
> we add generic specialization to the story.
>
> Specialization adds a new category of cross-class accesses that are
> allowed at the language level but not at the VM level, which would
> dramatically increase the need for, and complexity of, accessibility
> bridges. For example:
>
>     class Foo<any T> {
>         private T t;
>
>         void m(Foo<int> foo) {
>             int i = foo.t;
>         }
>     }
>
> Now we execute:
>
>     Foo<long> fl = ...
>     Foo<int> fi = ...
>     fl.m(fi)
>
> The spirit of the language rules clearly allows the access from Foo<long>
> to Foo<int>.t -- they are in the "same class". But at the VM level,
> Foo<int> and Foo<long> are different classes, so the access from
> Foo<long> to a private member of Foo<int> is disallowed.
>
> One reason that this increases the complexity, and not just the number,
> of accessibility bridges is that bridges are (currently) static methods;
> if they represent instance methods, we pass the receiver as the first
> argument. For access between inner classes, this is fine, but when it
> comes to access between specializations, this breeds new complexity --
> because the method signature of the accessor needs to be specialized
> based on the type parameters of the receiver. This interaction means
> the current static-accessor solution would need its own special, ad-hoc
> treatment in specialization, adding to the complexity of specialization.
>
> More generally, this situation arises in any case where a single logical
> unit of encapsulation at the source level is split into multiple runtime
> classes (inner classes, specialization classes, synthetic helper
> classes). We propose to address this problem more generally, by
> providing a mechanism where language compilers can indicate that
> multiple runtime classes live in the same unit of encapsulation. We do
> so by (a) adding metadata to classes to indicate which classes belong in
> the same encapsulation unit and (b) relaxing some VM accessibility rules
> to bring them more in alignment with the language level rules.
>
>
> Overview
> --------
>
> Our proposed strategy is to reify the relationship between classes that
> are members of the same _nest_. Nestmate-ness can then be considered in
> access control decisions (JVMS 5.4.4).
>
> Classes that derive from a common source class form a _nest_, and two
> classes in the same nest are called _nestmates_. Nestmate-ness is an
> equivalence relation (reflexive, symmetric, and transitive). Nestmates
> of a class C include C's inner classes, synthetic classes generated as
> part of translating C, and specializations thereof.
>
> Since nestmate-ness is an equivalence relation, it forms a partition
> over classes, and we can nominate a canonical member for each partition.
> We nominate the "top" (outermost lexically enclosing) class in the
> nest as the canonical member; this is the top-level source class from
> which all other nestmates derive.
>
> This makes it easy to calculate nestmate-ness for two classes C and D;
> C and D are nestmates if their "top" class is the same.
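As a rough illustration of the comparison described above, assuming a toy Klass type (hypothetical, not a VM data structure) in which every runtime class records its nest top:

```java
// Toy model: each runtime class records its nest top; a top points at itself.
final class Klass {
    final String name;
    final Klass nestTop;

    // Creates a nest top: its nest top is itself.
    Klass(String name) {
        this.name = name;
        this.nestTop = this;
    }

    // Creates a member (inner class, synthetic class, or specialization)
    // that joins the nest of an existing class.
    Klass(String name, Klass nestmate) {
        this.name = name;
        this.nestTop = nestmate.nestTop;
    }

    // C and D are nestmates exactly when their tops are the same class.
    boolean isNestmateOf(Klass other) {
        return this.nestTop == other.nestTop;
    }

    public static void main(String[] args) {
        Klass top = new Klass("Top");
        Klass a = new Klass("Top$A", top);
        Klass b = new Klass("Top$A$B", a);
        System.out.println(a.isNestmateOf(b));
        System.out.println(new Klass("Other").isNestmateOf(b));
    }
}
```

Because every member stores the same top reference, the check is a single identity comparison regardless of how deeply the classes are nested.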
>
> Example
> -------
>
>     class Top<any T> {
>         class A<any U> {
>             class B<V> { }
>         }
>
>         <any T> void genericMethod() { }
>     }
>
> When we compile this, we get:
>
>     Top.class                // Top
>     Top$A.class              // Inner class Top.A
>     Top$A$B.class            // Inner class Top.A.B
>     Top$Any.class            // Wildcard interface for Top
>     Top$A$Any.class          // Wildcard interface for Top.A
>     Top$genericMethod.class  // Holder class for generic method
>
> The explicit classes Top, Top.A, and Top.A.B, the synthetic $Any
> classes, and the synthetic holder class for genericMethod, along with
> all of their specializations, form a nest. The top member of this nest
> is Top.
>
> Since nestmates all derive from a common top-level class, they are by
> definition in the same package and module. A class can be in only one
> nest at once.
>
>
> Runtime Representation
> ----------------------
>
> We represent nestmate-ness with two new attributes -- one in the top
> member, which describes all the members of the nest, and one in each
> member, which requests access to the nest.
>
>     NestTop {
>         u2 name_index;
>         u4 length;
>         u2 child_count;
>         u2 childClazz[child_count];
>     }
>
>     NestChild {
>         u2 name_index;
>         u4 length;
>         u2 topClazz;
>     }
>
> If a class has a NestTop attribute, its nest top is itself. If a class
> has a NestChild attribute, its nest top is the class named via topClazz.
> If a class is a specialization of another class, its nest top is the
> nest top of the class for which it is a specialization.
>
> When loading a class with a NestChild attribute, the VM can verify that
> the requested nest permits it as a member, and reject the class if the
> child and top do not agree.
>
> The NestTop attribute can enumerate all inner classes and synthetic
> classes, but cannot enumerate all specializations thereof. When creating
> a specialization of a class, the VM records the specialization as being
> a member of whatever nest the template class was a member of.
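A minimal sketch of decoding the NestTop layout above from raw attribute bytes. NestTopReader is a hypothetical name; the sketch assumes that length counts only the bytes after the first six, as with existing class-file attributes, and it returns unresolved constant-pool indices rather than class names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Sketch: reads the proposed NestTop structure; constant-pool resolution is omitted.
class NestTopReader {
    static int[] readChildIndices(byte[] attribute) {
        // DataInputStream reads big-endian values, matching the class-file format.
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(attribute))) {
            in.readUnsignedShort();                    // name_index (u2), not used here
            int length = in.readInt();                 // length (u4)
            int childCount = in.readUnsignedShort();   // child_count (u2)
            // Assumption: length covers child_count plus the u2 entries that follow.
            if (length != 2 + 2 * childCount) {
                throw new IllegalArgumentException("inconsistent NestTop length: " + length);
            }
            int[] children = new int[childCount];
            for (int i = 0; i < childCount; i++) {
                children[i] = in.readUnsignedShort();  // childClazz[i] (u2)
            }
            return children;
        } catch (IOException e) {
            // Truncated input surfaces as EOFException from the reads above.
            throw new IllegalArgumentException("malformed NestTop attribute", e);
        }
    }
}
```

A NestChild reader would follow the same pattern with a single topClazz index in place of the counted array.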
>
>
> Semantics
> ---------
>
> The accessibility rules here are strictly additions; nestmate-ness
> creates additional accessibility over and above the existing rules.
>
> Informally:
> - A class can access the private members of its nestmates;
> - A class can access protected members inherited by its nestmates.
>
> This is slightly broader than the language semantics (but still less
> broad than what we do today with access bridges). The static compiler
> can continue to enforce the same rules, and the VM will allow these
> accesses without bridges. (We could make the proposal match the
> language semantics more closely at the cost of additional complexity,
> but it's not clear this is worthwhile.)
>
> For private access, we can add the following to 5.4.4:
> - A class C may access a private member D.R if C and D are nestmates.
>
> The rules for protected members are more complicated. 5.4.3.{2,3} first
> resolve the true owner of the member, and feed that to 5.4.4; this
> process throws away some needed information. We would augment
> 5.4.3.{2,3} as follows:
> - When performing member resolution from class C on member D.R, we
> remember both D (the target class) and E (the resolved class) and make
> them both available to 5.4.4.
>
> We then adjust 5.4.4 accordingly, by adding:
> - If R is protected, and C and D are nestmates, and E is accessible to
> D, then access is allowed.
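The two added rules can be sketched as a single predicate. This is a toy model: NestAccess, Kind, and allowedByNest are hypothetical names, nest identity is represented by the name of the nest top, and the boolean eAccessibleToD stands in for the existing 5.4.4 check of E from D.

```java
// Toy model of the proposed additions to 5.4.4; not VM code.
// C = accessing class, D = target class named in the reference,
// E = class in which resolution actually found member R.
class NestAccess {
    enum Kind { PRIVATE, PROTECTED }

    // Returns true if the access is granted by one of the nestmate rules.
    // A false result only means these rules do not apply; the existing
    // rules would still be consulted separately.
    static boolean allowedByNest(Kind r, String nestTopOfC, String nestTopOfD,
                                 boolean eAccessibleToD) {
        boolean nestmates = nestTopOfC.equals(nestTopOfD);
        switch (r) {
            case PRIVATE:
                return nestmates;
            case PROTECTED:
                return nestmates && eAccessibleToD;
            default:
                return false;
        }
    }
}
```

Since the rules are purely additive, a caller would combine this with the existing checks using a logical OR.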
>
>
> Examples
> --------
>
> For private members, we currently generate access bridges whenever an
> inner class accesses a private member (field or method) of the enclosing
> class, or of another inner class in the same nest.
>
> In the classes below, the accesses shown are all permitted by the
> language spec (child to parent, sibling to sibling, sibling to child of
> sibling, etc.), and the ones requiring access bridges are noted.
>
>     class Foo {
>         public static Foo aFoo;
>         public static Inner1 aInner1;
>         public static Inner1.Inner2 aInner2;
>         public static Inner3 aInner3;
>
>         private int foo;
>
>         class Inner1 {
>             private int inner1;
>
>             class Inner2 {
>                 private int inner2;
>             }
>
>             void m() {
>                 int i = aFoo.foo          // bridge
>                       + aInner1.inner1
>                       + aInner2.inner2    // bridge
>                       + aInner3.inner3;   // bridge
>             }
>         }
>
>         class Inner3 {
>             private int inner3;
>
>             void m() {
>                 int i = aFoo.foo          // bridge
>                       + aInner1.inner1    // bridge
>                       + aInner2.inner2    // bridge
>                       + aInner3.inner3;
>             }
>         }
>     }
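A compact, compilable variant of the private case can show whether a given compiler emits bridges. Outer, Inner, and syntheticMethodCount are names invented for this sketch; the count it prints depends on the compiler and target release, so no particular value is assumed here.

```java
import java.lang.reflect.Method;

// One private field read from an inner class, plus a reflective count of
// compiler-generated (synthetic) methods on the enclosing class.
class Outer {
    private int secret = 42;

    class Inner {
        int read() {
            return secret;   // private access across two runtime classes
        }
    }

    // Counts synthetic methods declared on Outer, such as access$000.
    static int syntheticMethodCount() {
        int count = 0;
        for (Method m : Outer.class.getDeclaredMethods()) {
            if (m.isSynthetic()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(new Outer().new Inner().read());
        System.out.println("synthetic methods on Outer: " + syntheticMethodCount());
    }
}
```

A compiler that relies on bridges should report at least one synthetic method here; one that relies on VM-level nestmate access would be expected to report none.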
>
> For protected members, the situation is more subtle.
>
>     /* package p1 */
>     public class Sup {
>         protected int pro;
>     }
>
>     /* package p2 */
>     public class Sub extends p1.Sup {
>         void test() {
>             ... pro ...         // no bridge (invokespecial)
>         }
>
>         class Inner {
>             void test() {
>                 ... sub.pro ... // bridge generated in Sub
>             }
>         }
>     }
>
> Here, the VM rules allow Sub to access protected members of Sup, but for
> accesses from Sub.Inner or Sibling to Sub.pro to succeed, Sub provides
> an access bridge (which effectively makes Sub.pro package-visible
> throughout package p2).
>
> The rules outlined eliminate access bridges in all of these cases.
>
>
> Interaction with defineAnonymousClass
> -------------------------------------
>
> Nestmate-ness also potentially connects nicely with
> Unsafe.defineAnonymousClass. The intuitive notion of dAC is, when you
> load anonymous class C with a host class of H, that C is being "injected
> into" H -- access control decisions for C are made using H's
> credentials. With a formal notion of nestmateness, we can bring
> additional predictability to dAC by saying that C is injected into H's
> nest.
>
>