RI update: division of bridging responsibility between VM and compiler
Brian Goetz
brian.goetz at oracle.com
Thu May 2 14:25:18 PDT 2013
I like the idea of a "missing bridge detection tool." This seems
entirely tractable. This also connects with another idea that was
discussed earlier, namely, a migration tool to tell you which defaults
were added to which interfaces from one version of a JAR to another, so
that subclassers can identify which new methods they want to override.
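
A minimal sketch of the "which defaults were added" half of such a tool, under loud assumptions: the two versions of the interface are modelled as two hypothetical nested interfaces (ShapeV1, ShapeV2) in one source file, and the comparison uses reflection on loaded classes. A real tool would more likely read class files straight out of the two JARs without loading them.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class AddedDefaultsSketch {
    // Hypothetical "old" and "new" versions of the same library interface.
    interface ShapeV1 {
        double area();
    }

    interface ShapeV2 {
        double area();
        default String describe() { return "shape with area " + area(); }
        default double scaledArea(double factor) { return area() * factor; }
    }

    /** Collects "name(params)returnType" strings for every default method visible on iface. */
    static Set<String> defaultSignatures(Class<?> iface) {
        Set<String> signatures = new TreeSet<>();
        for (Method m : iface.getMethods()) {
            if (m.isDefault()) {
                String params = Arrays.stream(m.getParameterTypes())
                        .map(Class::getName)
                        .collect(Collectors.joining(","));
                signatures.add(m.getName() + "(" + params + ")" + m.getReturnType().getName());
            }
        }
        return signatures;
    }

    /** Defaults present in newVersion but absent from oldVersion. */
    static Set<String> addedDefaults(Class<?> oldVersion, Class<?> newVersion) {
        Set<String> added = defaultSignatures(newVersion);
        added.removeAll(defaultSignatures(oldVersion));
        return added;
    }

    public static void main(String[] args) {
        // Lists the defaults a subclasser might want to review after upgrading.
        addedDefaults(ShapeV1.class, ShapeV2.class).forEach(System.out::println);
    }
}
```

Keying on name, parameter types and return type (rather than name alone) matters here, since the bridge cases in this thread are exactly those where two methods differ only in return type.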
That said, I wouldn't be too scared about the cases under which missing
bridges show up; they are likely to be pretty rare. (The possibility has
existed since Java 5 and it doesn't happen much, though it will be
somewhat more likely with default methods.) A lot of things have to
align to make it happen. Note that missing bridges don't arise when one
interface covariantly extends another; they require merging two interface
methods from different maintenance domains.
What I'd like to do is to tag bridge methods with attributes identifying
which method is being bridged to which other method, so the VM can look
at the method attributes and say "Oh, this is just a 'symlink' to this
other method", and possibly optimize away the bridge stack frame if it
can. One of the worst things about the bridge implementation we got
stuck with in Java 5 is that bridges are basically opaque to the VM;
ACC_BRIDGE carries no semantics, so the VM has to assume that the
method is ordinary bytecode. Even if we don't act on that now, it
opens the door to future optimizations. For example, if your VM is
implemented so that vtable slots are populated by method handles, you
could insert bridgee.asType(bridge) directly into the vtable for the
bridge, rather than the trampoline generated by javac. But only if you
know the semantics, and for that, we need a classfile attribute.
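
To make the bridgee.asType(bridge) idea concrete, here is a small sketch using the public java.lang.invoke API. It only illustrates the adaptation step; it says nothing about how a VM would actually populate vtable slots. The class C and the choice of T0 = Object, T1 = String are assumptions made for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class BridgeAsTypeSketch {
    // Stand-ins for the T1 <: T0 hierarchy: T0 = Object, T1 = String (assumed for this sketch).
    static class C {
        // The bridgee: the method the user actually wrote, with descriptor m()T1.
        public String m() { return "C.m()T1"; }
    }

    /** Builds a handle with the bridge's type, (C)Object, by adapting the bridgee. */
    static MethodHandle bridgeHandle() {
        try {
            MethodHandle bridgee = MethodHandles.lookup()
                    .findVirtual(C.class, "m", MethodType.methodType(String.class));
            MethodType bridgeType = MethodType.methodType(Object.class, C.class);
            // asType performs the return-type widening a javac trampoline would do by hand.
            return bridgee.asType(bridgeType);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Invokes the adapted handle the way a caller of m()T0 would. */
    static Object callThroughBridge(C receiver) {
        try {
            return bridgeHandle().invoke(receiver);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(bridgeHandle().type());
        System.out.println(callThroughBridge(new C()));
    }
}
```

The adapted handle carries no extra bytecode of its own, which is the property that would let a VM skip the bridge frame if it knew which method was bridged to which.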
On 5/2/2013 4:56 PM, Daniel Heidinga wrote:
> We're on board with this proposal.
>
> After working through the pattern catalog examples, there are some
> concerns around whether users will be able to detect when they are
> missing bridges. It is common, especially when building large
> applications, to take separately compiled libraries and use them without
> recompiling (think: no full app recompile when taking a security update
> of a particular library). There needs to be some tooling that can
> indicate that a bridge method is missing from an application.
>
> The Java compiler (javac) must know how to determine if a bridge is
> required. Can we use javac's internals to analyze a classpath and
> determine if the classes on the classpath are missing bridges? Either a
> new bridge analyzer executable or a javac option (-analyzeBridges?) to
> do the detection would allow users to find potential issues and to
> understand their application's behaviour.
>
> Ending up in a world where the answer is "recompile again to be safe" is
> bad for the Java ecosystem.
>
> --Dan
>
> From: Brian Goetz <brian.goetz at oracle.com>
> To: "lambda-spec-experts at openjdk.java.net"
>     <lambda-spec-experts at openjdk.java.net>
> Date: 04/15/2013 01:16 PM
> Subject: RI update: division of bridging responsibility between VM and compiler
> Sent by: lambda-spec-experts-bounces at openjdk.java.net
>
> ------------------------------------------------------------------------
>
> As you may recall, adding default methods requires that the VM get
> involved in default method inheritance, because it is an explicit goal
> for the addition of an interface method with a default to be a
> binary-compatible change. We've had an implementation of default
> inheritance in the VM for quite a while. The basic inheritance
> algorithm was really easy to implement; it built on top of existing
> vtable building in a straightforward and well-defined way.
>
> Some time back, we identified some cases where pushing default
> inheritance into the VM seemed to necessitate pushing bridge method
> generation into the VM as well. We have had an implementation of
> this in the VM for a while too. But this is a much bigger change and
> we're not as comfortable with it -- it pushes the details of the generic
> type system into the VM, and risks exposing Java-language-specific type
> system details to classes generated by other language compilers.
>
> At one point, we were convinced we had no choice. But since then, there
> were some simplifications in the definition of overriding with respect
> to defaults (specifically, outlawing abstract-default conflicts rather
> than silently merging them), and it turns out that this eliminates a
> number of the examples that led us to believe we had no choice in this
> matter. (Specifically, to land in a corner case, it now requires a
> bridge-requiring merge between a class and an interface; can't happen
> any more with two interfaces.) After having spent some time trying to
> specify what the invoke{virtual,interface,special} semantics might be in
> a VM-bridged world -- with the hopes that this would be step 1 along the
> path of eventually moving all bridging out of the static compiler (where
> it clearly does not belong, and is basically pure technical debt left
> over from generics) -- we're getting more comfortable with the corner
> cases that we'd have without VM bridging. Indeed, most of them are
> analogous to corner cases we already have today and would continue to
> have tomorrow under separate compilation with ordinary classes.
>
> Instead, we're now pursuing a path where we generate bridges into
> interfaces (since we can do that now) using an algorithm very similar to
> what we do with class bridges. We may need to supplement
> compiler-generated bridges with additional classfile attributes that
> the VM might act on to avoid these anomalies; this is currently being
> explored.
>
> This offers a significant reduction in complexity. We can rip out all
> existing bridge-related code from the VM, and do default inheritance using
> the simple "same erased signature" overriding the VM has always done.
> We can also rip out all generic analysis, including verification of
> generic signatures, though we might have to add back processing of
> additional classfile attributes and potentially use those to modify the
> behavior of inheritance (details TBD). And this keeps the generic type
> system in javac, eliminating risks of interference with other language
> inheritance semantics.
>
>
> BRIEF NOTATION BREAK
> --------------------
>
> When we were discussing how to specify default inheritance, we invented
> a notation where we wrote things like:
>
> Cc(Id(Ja))
>
> and wrote separate compilation examples as:
>
> Cc(Id(Ja)) -> Cc(Id(Jd))
>
> Which was much easier to reason about, and less ambiguity-prone, than
> writing the classes out longhand. Decoder chart:
>
> A, B: concrete or abstract classes
> C: concrete class to be instantiated
> I, J, K: interfaces
>
> In this world, like in FD, there's one method, named "m", with no
> arguments. Classes or interfaces have some extra letters after them to
> describe how m is declared:
>
> C -- no declaration of m
> Cc -- m() declared in C as concrete
> Ca, Ia -- m() declared in C or I as abstract
> Id -- m() declared in I as default
> Cm -- m() is declared in C as either abstract or concrete
>
> We now extend this notation with indicators describing covariant
> overrides, imagining a linear hierarchy of types T2 <: T1 <: T0:
>
> Cc0 -- m() declared in C as returning T0
> Cc1 -- m() declared in C as returning T1
>
> Supertypes are written in parentheses:
>
> Cc(Id(Jd))
>
> means that C extends I and I extends J.
>
> Separate compilation is written as:
>
> Cc(Id(Ja)) -> Cc(Id(Jd))
>
> Since only J is changed, only J is assumed to be recompiled.
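
For readers new to the notation, this is one way Cc(Id(Ja)) -> Cc(Id(Jd)) could be written out longhand in Java. The holder classes Before and After, the String return type and the method bodies are assumptions of this sketch; they exist only so both states fit in one source file.

```java
class NotationSketch {
    // Initial state: Cc(Id(Ja))
    static class Before {
        interface J { String m(); }                                         // Ja: abstract in J
        interface I extends J {
            @Override default String m() { return "I.m"; }                  // Id: default in I
        }
        static class C implements I {
            @Override public String m() { return "C.m"; }                   // Cc: concrete in C
        }
    }

    // Final state: Cc(Id(Jd)); only J differs, so only J would be recompiled.
    static class After {
        interface J { default String m() { return "J.m"; } }                // Jd: default in J
        interface I extends J {
            @Override default String m() { return "I.m"; }                  // Id: unchanged
        }
        static class C implements I {
            @Override public String m() { return "C.m"; }                   // Cc: unchanged
        }
    }

    public static void main(String[] args) {
        // C declares m() itself, so the change to J does not alter what runs.
        System.out.println(new Before.C().m());
        System.out.println(new After.C().m());
    }
}
```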
>
>
> MOTIVATING EXAMPLE
> ------------------
>
> Here's a problem we have today (and which the path we'd been pursuing
> would not have fixed for 8):
>
> Cc1(A) -> Cc1(Ac0)
>
> (This is a "contravariant underride.") This means we go from:
>
> abstract class A { }
> class C <: A { T1 m() { } }
>
> to
>
> abstract class A { T0 m() { } }
> class C <: A { T1 m() { } }
>
> without recompiling C.
>
> What will happen at runtime is:
>
> m()T1 -> C
> m()T0 -> A
>
> whereas with a global recompile, we would get:
>
> m()T1 -> C
> m()T0 -> C
>
> Note that:
> - This problem exists today and has existed since Java 5
> - Would get no better under the "default VM bridging" plan
> - No one seems particularly bothered by this long-standing issue.
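
The "global recompile" outcome above (both m()T1 and m()T0 landing in C) can be observed with reflection. This is a sketch of the consistently compiled final state Cc1(Ac0) only, with T0 = Object and T1 = String chosen as stand-ins; the separately compiled outcome needs two compilation steps and cannot be reproduced from a single source file.

```java
import java.lang.reflect.Method;

class CovariantBridgeSketch {
    // Final state Cc1(Ac0), compiled together. T0 = Object, T1 = String (assumed).
    abstract static class A {
        Object m() { return "A.m()T0"; }
    }

    static class C extends A {
        @Override String m() { return "C.m()T1"; }
    }

    /** Counts the compiler-generated bridge methods named m declared in C. */
    static int bridgeCount() {
        int count = 0;
        for (Method method : C.class.getDeclaredMethods()) {
            if (method.getName().equals("m") && method.isBridge()) {
                count++;
            }
        }
        return count;
    }

    /** Calls m() through the m()T0 descriptor, as a client compiled against A would. */
    static Object callThroughA() {
        A a = new C();
        return a.m();
    }

    public static void main(String[] args) {
        System.out.println("bridges named m in C: " + bridgeCount());
        System.out.println("call via A: " + callThroughA());
    }
}
```

Had C been compiled before A gained m(), C's class file would contain no such bridge, and the call through A would reach A's implementation instead.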
>
>
> Now consider the defender analogue of this example:
>
> Cc1(I) -> Cc1(Id0)
>
> m()T1 -> C
> m()T0 -> I
>
> Is this any worse than the previous version? For default methods, we
> say "classes that don't override this method will get the default, which
> by definition meets the contract of I." A moldy class file that had no
> idea that its m()T1 declaration was overriding an as-yet-unborn m()T0
> in a supertype could well be described as "not overriding the method."
> In which case they get the default. This does not seem so bad, or any
> worse than many other similar separate compilation scenarios today.
>
> Turning it around, if we handled this case but not the class-based
> version of the same issue, might that not even be weirder?
>
> Note also that with the decision to rule out abstract-default conflicts
> (i.e., outlawing K(Ia,Jd)), the set of possible bad cases is reduced a
> lot; many of the scary examples came from that space.
>
>
> INTERFACE BRIDGES
> -----------------
>
> We anticipate that (consistently compiled) interface hierarchies like
>
> Id1(Jd0)
>
> will be common. (Consider a method like Collection.immutable(), which
> might be covariantly overridden by List.immutable()). So, to support
> consistently compiled hierarchies like this (that is, I and J updated
> together) without forcing a recompile of concrete classes implementing
> I, the compiler could generate a bridge in I redirecting m()T0 to m()T1,
> with suitable cast, which is the highest point in the hierarchy where we
> can determine a bridge is needed. In a consistently compiled world,
> this is all that is needed.
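
A sketch of the consistently compiled Id1(Jd0) hierarchy, with T0 = Object and T1 = String as stand-ins and all names chosen for illustration. The class C declares nothing, yet a call through J's m()T0 descriptor is expected to reach I's default; the loop in main prints what the compiler emitted into I so the bridge can be inspected on whatever javac is in use.

```java
import java.lang.reflect.Method;

class InterfaceBridgeSketch {
    // Id1(Jd0): I covariantly overrides J's default. T0 = Object, T1 = String (assumed).
    interface J {
        default Object m() { return "J.m()T0"; }
    }

    interface I extends J {
        @Override default String m() { return "I.m()T1"; }
    }

    // C declares no m() at all, so any redirection from m()T0 to m()T1 must come from I.
    static class C implements I { }

    /** Calls m() through the m()T0 descriptor, as a client compiled against J would. */
    static Object callThroughJ() {
        J j = new C();
        return j.m();
    }

    public static void main(String[] args) {
        System.out.println("call via J: " + callThroughJ());
        for (Method method : I.class.getDeclaredMethods()) {
            System.out.println(method.getReturnType().getSimpleName()
                    + " m(), bridge=" + method.isBridge());
        }
    }
}
```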
>
> But we don't live in a consistently compiled world. So we must make
> some allowance for what might happen in a separately compiled world. The
> current scheme of only compiling bridges into the class where the
> bridgee lives helps reduce certain separate compilation artifacts. I
> think we should probably continue doing this, so that class bridges
> will, at times, override interface bridges. There does not seem to be
> harm in this, and it changes fewer things, and eliminates some risk vectors.
>
> (Ultimately the problem is that compiler bridges suffer from "premature
> bytecode". When the compiler generates a bridge, it is trying to reify
> the notion of "method m()T1 was known to override method m()T0 at
> compile time", but this is opaque to the VM, which can only slavishly
> propagate the bridge through subclass vtables as if it were code written
> by the user. If, instead of bridges (or in addition to them), the
> compiler generated a class attribute of the form "I believe that m()T1
> overrides m()T0", the VM could act on that information directly, and
> this might buy us out of some of the worst possible problems.)
>
>
> WORST CASE SCENARIO
> -------------------
>
> The cases above are not terrible because the program continues to link
> after separate compilation and even does something vaguely
> justifiable. Here's a worse scenario (relevant humor break:
> http://www.youtube.com/watch?v=_W-qxpN2oEI).
>
> Cc1(Bc0(Ac0)) -> Cc1(Bc1(Ac0))
>
> If the implementation in C does:
>
> super.m()
>
> one gets a StackOverflowError. This happens because when we invoke
> C.m(), we are really invoking C.m()T1. C.m()T1 invokes B.m()T0 via
> invokespecial, thinking that it is invoking the parent implementation.
> But really B.m()T0 is a bridge for B.m()T1, that invokes B.m()T1 with
> invokevirtual. But B.m()T1 is overridden by C.m()T1, and so the
> invokevirtual is dispatched there. Which is where we started, so we
> ping-pong between C.m()T1 and B.m()T1 until we fall off the stack.
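
The ping-pong can be simulated in a single source file by giving the two descriptors distinct names. In this sketch m0() stands for m()T0 and m1() stands for m()T1, and the bridges are written by hand; that renaming is an assumption made so the stale and recompiled classes can coexist, not something javac produces.

```java
class BridgePingPongSketch {
    // m0() models descriptor m()T0; m1() models descriptor m()T1 (renamed for the simulation).
    static class A {
        Object m0() { return "A.m()T0"; }
    }

    // Recompiled B, state Bc1: the real method is m1(); m0() is now only a bridge.
    static class B extends A {
        @Override Object m0() { return m1(); }   // bridge: virtual call to m()T1
        String m1() { return "B.m()T1"; }
    }

    // Stale C, compiled when B still declared m()T0 as a real method.
    static class C extends B {
        @Override Object m0() { return m1(); }   // C's own bridge
        @Override String m1() {
            // Models the stale invokespecial of B.m()T0 that "super.m()" compiled to.
            return "C+" + super.m0();
        }
    }

    /** Returns true if calling m()T1 on a C instance runs off the end of the stack. */
    static boolean overflows() {
        try {
            new C().m1();
            return false;
        } catch (StackOverflowError expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("StackOverflowError observed: " + overflows());
    }
}
```

B on its own behaves normally; the loop appears only when a subclass both overrides m()T1 and still routes its super call through the slot that has since become a bridge.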
>
> Again, note that (a) we already have this problem since Java 5 and (b)
> the complex solution we were pursuing would not have fixed it for 8.
> But this is definitely worse than the problems above, and we want to not
> widen this hole.
>
> We need to explore further what kinds of separate compilation anomalies
> with bridges in interfaces might cause similar problems.
>
>
> EXHAUSTIVE PATTERN CATALOG
> --------------------------
>
> Dan did a nearly-exhaustive catalog of inheritance scenarios. The
> question is, do we find any of these anomalies so bad (worse than
> existing anomalies) that we cannot live with them? On review, none of
> them seem any worse than the pain of bridge methods under separate
> compilation we've been living with for years.
>
> They are annotated with what happens:
>
> 0: Description of the behavior of an invocation on an instance of C,
> targeting the descriptor of index 0.
> 0*: Behavior inconsistent with a full compilation of the final state.
>
> The following cases are not considered:
> - Illegal hierarchies, in either the initial or final state
> - Redundant extra classes/interfaces that have no effect on the outcome
> - Redundant permutations of 'implements' clauses
> - Final states that require recompiling C
>
> =====
> Linear inheritance (one ancestor, two methods)
>
> ---
>
> Cc1(A) -> Cc1(Ac0)
>
> 0*: Inherited from A
> 1: Declared in C
>
> ---
>
> Cc1(I) -> Cc1(Id0)
>
> 0*: Inherited default from I
> 1: Declared in C
>
> =====
> Linear inheritance (two ancestors, no method in C)
>
> ---
>
> C(Bc1(A)) -> C(Bc1(Ac0))
>
> 0*: Inherited from A
> 1: Inherited from B
>
> ---
>
> C(B(Ac0)) -> C(Bc1(Ac0))
>
> 0: Inherited bridge from B
> 1: Inherited from B
>
> ---
>
> C(B(A)) -> C(Bc1(Ac0))
>
> 0: Inherited bridge from B
> 1: Inherited from B
>
> ---
>
> C(Ac1(I)) -> C(Ac1(Id0))
>
> 0*: Inherited default from I
> 1: Inherited from A
>
> ---
>
> C(A(Id0)) -> C(Ac1(Id0))
>
> 0: Inherited bridge from A
> 1: Inherited from A
>
> ---
>
> C(A(I)) -> C(Ac1(Id0))
>
> 0: Inherited bridge from A
> 1: Inherited from A
>
> ---
>
> C(Id1(J)) -> C(Id1(Jd0))
>
> 0*: Inherited default from J
> 1: Inherited default from I
>
> ---
>
> C(I(Jd0)) -> C(Id1(Jd0))
>
> 0: Inherited bridge from I
> 1: Inherited default from I
>
> ---
>
> C(I(J)) -> C(Id1(Jd0))
>
> 0: Inherited bridge from I
> 1: Inherited default from I
>
> =====
> Linear inheritance (two ancestors, method in C)
>
> ---
>
> Cc2(B(Am0)) -> Cc2(Bc1(Am0))
>
> 0: Bridge in C
> 1*: Inherited from B
> 2: Declared in C
>
> ---
>
> Cc2(Bm1(A)) -> Cc2(Bm1(Ac0))
>
> 0*: Inherited from A
> 1: Bridge in C
> 2: Declared in C
>
> ---
>
> Cc2(B(A)) -> Cc2(Bc1(Ac0))
>
> 0*: Inherited bridge from B
> 1*: Inherited from B
> 2: Declared in C
>
> ---
>
> Cc2(A(Im0)) -> Cc2(Ac1(Im0))
>
> 0: Bridge in C
> 1*: Inherited from A
> 2: Declared in C
>
> ---
>
> Cc2(Am1(I)) -> Cc2(Am1(Id0))
>
> 0*: Inherited from I
> 1: Bridge in C
> 2: Declared in C
>
> ---
>
> Cc2(A(I)) -> Cc2(Ac1(Id0))
>
> 0*: Inherited bridge from A
> 1*: Inherited from A
> 2: Declared in C
>
> ---
>
> Cc2(J(Im0)) -> Cc2(Jd1(Im0))
>
> 0: Bridge in C
> 1*: Inherited default from J
> 2: Declared in C
>
> ---
>
> Cc2(Jm1(I)) -> Cc2(Jm1(Id0))
>
> 0*: Inherited default from I
> 1: Bridge in C
> 2: Declared in C
>
> ---
>
> Cc2(J(I)) -> Cc2(Jd1(Id0))
>
> 0*: Inherited bridge from J
> 1*: Inherited default from J
> 2: Declared in C
>
> =====
> Independent branches (no method in C)
>
> ---
>
> C(Ac1, I) -> C(Ac1, Id0)
>
> 0*: Inherited default from I
> 1: Inherited from A
>
> ---
>
> C(A, Id0) -> C(Ac1, Id0)
>
> 0*: Inherited default from I
> 1: Inherited from A
>
> ---
>
> C(A, I) -> C(Ac1, Id0)
>
> 0*: Inherited default from I
> 1: Inherited from A
>
> =====
> Independent branches (method in C)
>
> ---
>
> Cc2(Am0, I) -> Cc2(Am0, Id1)
>
> 0: Bridge in C
> 1*: Inherited default from I
> 2: Declared in C
>
> ---
>
> Cc2(A, Im1) -> Cc2(Ac0, Im1)
>
> 0*: Inherited from A
> 1: Bridge in C
> 2: Declared in C
>
> ---
>
> Cc2(A, I) -> Cc2(Ac0, Id1)
>
> 0*: Inherited from A
> 1*: Inherited default from I
> 2: Declared in C
>
> ---
>
> Cc2(Im0, J) -> Cc2(Im0, Jd1)
>
> 0: Bridge in C
> 1*: Inherited default from J
> 2: Declared in C
>
> ---
>
> Cc2(I, J) -> Cc2(Id0, Jd1)
>
> 0*: Inherited default from I
> 1*: Inherited default from J
> 2: Declared in C
>
> =====
> Diamond branches (no method in C)
>
> ---
>
> C(A(Id0), J(Id0)) -> C(Ac1(Id0), J(Id0))
>
> 0: Inherited bridge from A
> 1: Inherited from A
>
> ---
>
> C(A(Id0), J(Id0)) -> C(A(Id0), Jd1(Id0))
>
> 0: Inherited bridge from J
> 1: Inherited default from J
>
> ---
>
> C(A(Id0), J(Id0)) -> C(Ac2(Id0), Jd1(Id0))
>
> 0: Inherited bridge from A (beats new bridge in J)
> 1*: Inherited default from J
> 2: Inherited from A
>
> ---
>
> C(J(Id0), K(Id0)) -> C(Jd1(Id0), K(Id0))
>
> 0: Inherited bridge from J
> 1: Inherited default from J
>
> ---
>
> C(Ac2(Im0), J(Im0)) -> C(Ac2(Im0), Jd1(Im0))
>
> 0: Inherited bridge from A (beats new bridge in J)
> 1*: Inherited default from J
> 2: Inherited from A
>
> ---
>
> C(A(Im0), Jd1(Im0)) -> C(Ac2(Im0), Jd1(Im0))
>
> 0: Inherited bridge from A (beats old bridge in J)
> 1*: Inherited default from J
> 2: Inherited from A
>
> =====
> Diamond branches (method in C)
>
> ---
>
> Cc2(A(Im0), J(Im0)) -> Cc2(Ac1(Im0), J(Im0))
>
> 0: Bridge in C
> 1*: Inherited from A
> 2: Declared in C
>
> ---
>
> Cc2(A(Im0), J(Im0)) -> Cc2(A(Im0), Jd1(Im0))
>
> 0: Bridge in C
> 1*: Inherited default from J
> 2: Declared in C
>
> ---
>
> Cc3(A(Im0), J(Im0)) -> Cc3(Ac1(Im0), Jd2(Im0))
>
> 0: Bridge in C
> 1*: Inherited from A
> 2*: Inherited default from J
> 3: Declared in C
>
> ---
>
> Cc2(J(Im0), K(Im0)) -> Cc2(Jd1(Im0), K(Im0))
>
> 0: Bridge in C
> 1*: Inherited default from J
> 2: Declared in C
>
> ---
>
> Cc3(J(Im0), K(Im0)) -> Cc3(Jd1(Im0), Kd2(Im0))
>
> 0: Bridge in C
> 1*: Inherited default from J
> 2*: Inherited default from K
> 3: Declared in C
>
> ---
>
> Cc3(Am1(Im0), J(Im0)) -> Cc3(Am1(Im0), Jd2(Im0))
>
> 0: Bridge in C
> 1: Bridge in C
> 2*: Inherited default from J
> 3: Declared in C
>
> ---
>
> Cc3(A(Im0), Jm2(Im0)) -> Cc3(Ac1(Im0), Jm2(Im0))
>
> 0: Bridge in C
> 1*: Inherited from A
> 2: Bridge in C
> 3: Declared in C
>
> ---
>
> Cc3(Jm1(Im0), K(Im0)) -> Cc3(Jm1(Im0), Kd2(Im0))
>
> 0: Bridge in C
> 1: Bridge in C
> 2*: Inherited default from K
> 3: Declared in C
>
>
>