RI update: division of bridging responsibility between VM and compiler

Remi Forax forax at univ-mlv.fr
Wed May 8 11:39:46 PDT 2013


It would be great if we can keep the VM from being too tied to Java 
semantics (especially the generics part).

I have a question: do we have an idea of which bridge methods need to 
be included in java.util classes because we have added default methods to 
several interfaces?

As I currently understand the problem, I don't think there will be much 
trouble for code that uses the JDK.
But for code like the Guava collections, which use JDK interfaces and are 
used in application code, it will be more complex.
That said, I don't know if Guava introduces new interfaces that use 
covariant return types or bounded generics.

Other comments inline.

On 04/15/2013 06:52 PM, Brian Goetz wrote:
> As you may recall, adding default methods requires that the VM get 
> involved in default method inheritance, because it is an explicit goal 
> for the addition of an interface method with a default to be a 
> binary-compatible change.  We've had an implementation of default 
> inheritance in the VM for quite a while.  The basic inheritance 
> algorithm was really easy to implement; it built on top of existing 
> vtable building in a straightforward and well-defined way.
>
> Some time back, we identified some cases where pushing default 
> inheritance into the VM seemed to necessitate pushing bridge method 
> generation into the VM as well.  We also have had an implementation of 
> this in the VM for a while too.  But, this is a much bigger change and 
> we're not as comfortable with it -- it pushes the details of the 
> generic type system into the VM, and risks exposing 
> Java-language-specific type system details to classes generated by 
> other language compilers.
>
> At one point, we were convinced we had no choice.  But since then, 
> there were some simplifications in the definition of overriding with 
> respect to defaults (specifically, outlawing abstract-default 
> conflicts rather than silently merging them), and it turns out that 
> this eliminates a number of the examples that led us to believe we had 
> no choice in this matter.  (Specifically, to land in a corner case, it 
> now requires a bridge-requiring merge between a class and an 
> interface; can't happen any more with two interfaces.)  After having 
> spent some time trying to specify what the 
> invoke{virtual,interface,special} semantics might be in a VM-bridged 
> world -- with the hopes that this would be step 1 along the path of 
> eventually moving all bridging out of the static compiler (where it 
> clearly does not belong, and is basically pure technical debt left 
> over from generics) -- we're getting more comfortable with the corner 
> cases that we'd have without VM bridging. Indeed, most of them are 
> analogous to corner cases we already have today and would continue to 
> have tomorrow under separate compilation with ordinary classes.
>
> Instead, we're now pursuing a path where we generate bridges into 
> interfaces (since we can do that now) using an algorithm very similar 
> to what we do with class bridges.  We may need to extend the technique 
> of compiler-generated bridges with generating additional classfile 
> attributes that the VM might act on to avoid these anomalies, 
> currently being explored.

I don't understand the last sentence.
The bridges are needed by the VM, so what's the point of having the VM 
ignore them?

>
> This offers a significant reduction in complexity.  We can rip out all 
> existing bridge-related code from VM, and do default inheritance using 
> the simple "same erased signature" overriding the VM has always done. 
> Can rip out all generic analysis, including verification of generic 
> signatures.  Though might have to add back processing of additional 
> classfile attributes and potentially use those to modify the behavior 
> of inheritance, details TBD.  And, this keeps the generic type system 
> in javac, eliminating risks of interference with other language 
> inheritance semantics.
>
>
> BRIEF NOTATION BREAK
> --------------------
>
> When we were discussing how to specify default inheritance, we 
> invented a notation where we wrote things like:
>
>   Cc(Id(Ja))
>
> and wrote separate compilation examples as:
>
>   Cc(Id(Ja)) -> Cc(Id(Jd))
>
> Which was much easier to reason about, and less ambiguity-prone, than 
> writing the classes out longhand.  Decoder chart:
>
> A, B: concrete or abstract classes
> C: concrete class to be instantiated
> I, J, K: interfaces
>
> In this world, like in FD, there's one method, named "m", with no 
> arguments.  Classes or interfaces have some extra letters after them 
> to describe how m is declared:
>
> C -- no declaration of m
> Cc -- m() declared in C as concrete
> Ca, Ia -- m() declared in C or I as abstract
> Id -- m() declared in I as default
> Cm -- m() is declared in C as either abstract or concrete
>
> We now extend this notation with indicators describing covariant 
> overrides, imagining a linear hierarchy of types T2 <: T1 <: T0:
>
> Cc0 -- m() declared in C as returning T0
> Cc1 -- m() declared in C as returning T1
>
> Supertypes are written in parentheses:
>
> Cc(Id(Jd))
>
> means that C extends I and I extends J.
>
> Separate compilation is written as:
>
> Cc(Id(Ja)) -> Cc(Id(Jd))
>
> Since only J is changed, only J is assumed to be recompiled.
>
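
For readers new to the notation, here is one way Cc(Id(Ja)) could be
written out longhand. This is only a sketch: the names NotationSketch
and call() and the String return type are assumptions added for
illustration (the notation itself says only that m takes no arguments).

```java
final class NotationSketch {
    // Ja: m() declared in J as abstract
    interface J {
        String m();
    }

    // Id: m() declared in I as default; I extends J
    interface I extends J {
        @Override
        default String m() {
            return "I";
        }
    }

    // Cc: m() declared in C as concrete; C extends (implements) I
    static class C implements I {
        @Override
        public String m() {
            return "C";
        }
    }

    // Invokes m() on an instance of C through the topmost type J.
    static String call() {
        J j = new C();
        return j.m();
    }

    public static void main(String[] args) {
        System.out.println(call());
    }
}
```

Here call() yields "C", since the concrete declaration in C wins over
the default in I. The separate-compilation form Cc(Id(Ja)) -> Cc(Id(Jd))
would then correspond to giving J.m() a default body and recompiling
only J.
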
>
> MOTIVATING EXAMPLE
> ------------------
>
> Here's a problem we have today (and which the path we'd been pursuing 
> would not have fixed for 8):
>
> Cc1(A) -> Cc1(Ac0)
>
> (This is a "contravariant underride.")  This means we go from:
>
> abstract class A { }
> class C <: A { T1 m() { } }
>
> to
>
> abstract class A { T0 m() { } }
> class C <: A { T1 m() { } }
>
> without recompiling C.
>
> What will happen at runtime is:
>
> m()T1 -> C
> m()T0 -> A
>
> whereas with a global recompile, we would get:
>
> m()T1 -> C
> m()T0 -> C
>
> Note that:
>  - This problem exists today and has existed since Java 5
>  - Would get no better under the "default VM bridging" plan
>  - No one seems particularly bothered by this long-standing issue.
>
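
The two dispatch tables above can be reproduced with a toy model of
method lookup by erased descriptor. This is a sketch under loud
assumptions: Klass, declare() and resolve() are hypothetical names, and
the model only walks a superclass chain looking for an exact descriptor
match. It is not how any real VM builds vtables.

```java
import java.util.HashMap;
import java.util.Map;

final class UnderrideSketch {
    /** Toy class file: methods are keyed by erased descriptor. */
    static final class Klass {
        final String name;
        final Klass superclass;
        // descriptor -> name of the class whose code ends up running
        final Map<String, String> methods = new HashMap<>();

        Klass(String name, Klass superclass) {
            this.name = name;
            this.superclass = superclass;
        }

        /** Declares a method (or a bridge to one) in this class. */
        Klass declare(String descriptor) {
            methods.put(descriptor, name);
            return this;
        }

        /** Walks up the superclass chain; only exact descriptors match. */
        String resolve(String descriptor) {
            for (Klass k = this; k != null; k = k.superclass) {
                String owner = k.methods.get(descriptor);
                if (owner != null) {
                    return owner;
                }
            }
            throw new NoSuchMethodError(descriptor);
        }
    }

    /** Cc1(A) -> Cc1(Ac0): A is recompiled, C is the stale class file. */
    static Klass staleC() {
        Klass a = new Klass("A", null).declare("m()T0");
        return new Klass("C", a).declare("m()T1");
    }

    /** Same final state, but C is recompiled and gains a bridge for m()T0. */
    static Klass recompiledC() {
        Klass a = new Klass("A", null).declare("m()T0");
        return new Klass("C", a).declare("m()T1").declare("m()T0");
    }

    public static void main(String[] args) {
        System.out.println("stale:      m()T0 -> " + staleC().resolve("m()T0"));
        System.out.println("recompiled: m()T0 -> " + recompiledC().resolve("m()T0"));
    }
}
```

In this model the stale C resolves m()T0 to A, while the recompiled C
resolves it to C, matching the two tables in the quoted text.
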
>
> Now consider the defender analogue of this example:
>
> Cc1(I) -> Cc1(Id0)
>
> m()T1 -> C
> m()T0 -> I
>
> Is this any worse than the previous version?  For default methods, we 
> say "classes that don't override this method will get the default, 
> which by definition meets the contract of I."  A moldy class file that 
> had no idea that its m()T1 declaration was overriding an 
> as-yet-unborn m()T0 in a supertype could well be described as "not 
> overriding the method." In which case they get the default.  This does 
> not seem so bad, or any worse than many other similar separate 
> compilation scenarios today.
>
> Turning it around, if we handled this case but not the class-based 
> version of the same issue, might that not even be weirder?
>
> Note also that with the decision to rule out abstract-default 
> conflicts (i.e., outlawing K(Ia,Jd)), the set of possible bad cases is 
> reduced a lot; many of the scary examples came from that space.
>
>
> INTERFACE BRIDGES
> -----------------
>
> We anticipate that (consistently compiled) interface hierarchies like
>
>   Id1(Jd0)
>
> will be common.  (Consider a method like Collection.immutable(), which 
> might be covariantly overridden by List.immutable()).  So, to support 
> consistently compiled hierarchies like this (that is, I and J updated 
> together) without forcing a recompile of concrete classes implementing 
> I, the compiler could generate a bridge in I redirecting m()T0 to 
> m()T1, with suitable cast, which is the highest point in the hierarchy 
> where we can determine a bridge is needed.  In a consistently compiled 
> world, this is all that is needed.
>
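
A consistently compiled Id1(Jd0) hierarchy can be sketched in source as
follows. The type names, the Object/String return types (standing in
for T0/T1) and the helper methods viaJ() and viaI() are assumptions for
illustration. The sketch shows only the observable behavior a bridge is
meant to guarantee; where the bridge physically lives is up to the
compiler.

```java
final class InterfaceBridgeSketch {
    // Jd0: default m() returning T0 (Object here)
    interface J {
        default Object m() {
            return "J";
        }
    }

    // Id1: covariant default override returning T1 (String here)
    interface I extends J {
        @Override
        default String m() {
            return "I";
        }
    }

    // C: no declaration of m()
    static class C implements I {
    }

    // The call site is compiled against J, so it targets the T0 descriptor.
    static Object viaJ() {
        J j = new C();
        return j.m();
    }

    // The call site is compiled against I, so it targets the T1 descriptor.
    static String viaI() {
        I i = new C();
        return i.m();
    }

    public static void main(String[] args) {
        System.out.println(viaJ() + " " + viaI());
    }
}
```

Both calls should end up in I's default, which is exactly the
redirection from m()T0 to m()T1 that the bridge provides.
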
> But we don't live in a consistently compiled world.  So we must make 
> some allowance for what might happen in a separately compiled world. 
> The current scheme of only compiling bridges into the class where the 
> bridgee lives helps reduce certain separate compilation artifacts.  I 
> think we should probably continue doing this, so that class bridges 
> will, at times, override interface bridges. There does not seem to be 
> harm in this, and it changes fewer things, and eliminates some risk 
> vectors.

I agree,

>
> (Ultimately the problem is that compiler bridges suffer from 
> "premature bytecode".  When the compiler generates a bridge, it is 
> trying to reify the notion of "method m()T1 was known to override 
> method m()T0 at compile time", but this is opaque to the VM, who can 
> only slavishly propagate the bridge through subclass vtables as if it 
> were code written by the user.  If, instead of bridges (or in addition 
> to), the compiler instead generated a class attribute of the form "I 
> believe that m()T1 overrides m()T0", the VM could act on that 
> information directly, and this might buy us out of some of the worst 
> possible problems.)
>
>
> WORST CASE SCENARIO
> -------------------
>
> The cases above are not terrible because the program continues to link 
> after separate compilation and even does something vaguely 
> justifiable.  Here's a worse scenario (relevant humor break: 
> http://www.youtube.com/watch?v=_W-qxpN2oEI).
>
>   Cc1(Bc0(Ac0)) -> Cc1(Bc1(Ac0))
>
> If the implementation in C does:
>
>   super.m()
>
> one gets a StackOverflowError.  This happens because when we invoke 
> C.m(), we are really invoking C.m()T1.  C.m()T1 invokes B.m()T0 via 
> invokespecial, thinking that it is invoking the parent implementation. 
> But really B.m()T0 is a bridge for B.m()T1, that invokes B.m()T1 with 
> invokevirtual.  But B.m()T1 is overridden by C.m()T1, and so the 
> invokevirtual is dispatched there.  Which is where we started, so we 
> ping-pong between C.m()T1 and B.m()T1 until we fall off the stack.
>
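
The loop can be imitated in plain Java source by spelling the two
descriptors as two differently named methods. This is a simulation, not
real separate compilation: m0() and m1() stand in for m()T0 and m()T1,
B.m0() is hand-written to do what a compiler-generated bridge would do,
and overflows() is a hypothetical helper added for the demonstration.

```java
final class BridgeLoopSketch {
    // Ac0: the original m()T0
    static class A {
        Object m0() {
            return "A";
        }
    }

    // New B (Bc1): the real method is now m()T1, and m()T0 is a bridge.
    static class B extends A {
        String m1() {
            return "B";
        }

        @Override
        Object m0() {
            return m1(); // bridge body: virtual call to m()T1
        }
    }

    // Stale C (Cc1): compiled when B still declared m()T0 directly, so
    // super.m() was bound to B.m()T0.
    static class C extends B {
        @Override
        String m1() {
            return (String) super.m0(); // non-virtual call to B.m()T0
        }
    }

    /** Returns true if calling C.m()T1 runs off the end of the stack. */
    static boolean overflows() {
        try {
            new C().m1();
            return false;
        } catch (StackOverflowError expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("overflow: " + overflows());
    }
}
```

C.m1() calls B.m0() non-virtually, B.m0() dispatches virtually back to
C.m1(), and the cycle repeats until the StackOverflowError.
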
> Again, note that (a) we already have this problem since Java 5 and (b) 
> the complex solution we were pursuing would not have fixed it for 8. 
> But this is definitely worse than the problems above, and we want to 
> not widen this hole.
>
> We need to explore further what kinds of separate compilation 
> anomalies with bridges in interfaces might cause similar problems.

I think the hole is still a little wider because with the existing 
problem, if you detect it you can simply avoid using the abstract 
class and it works. With default methods in interfaces, there is no such 
workaround.

Rémi

>
>
> EXHAUSTIVE PATTERN CATALOG
> --------------------------
>
> Dan did a nearly-exhaustive catalog of inheritance scenarios.  The 
> question is, do we find any of these anomalies so bad (worse than 
> existing anomalies) that we cannot live with them?  On review, none of 
> them seem any worse than the pain of bridge methods under separate 
> compilation we've been living with for years.
>
> They are annotated with what happens:
>
> 0: Description of the behavior of an invocation on an instance of C, 
> targeting the descriptor of index 0.
> 0*: Behavior inconsistent with a full compilation of the final state.
>
> The following cases are not considered:
> - Illegal hierarchies, in either the initial or final state
> - Redundant extra classes/interfaces that have no effect on the outcome
> - Redundant permutations of 'implements' clauses
> - Final states that require recompiling C
>
> =====
> Linear inheritance (one ancestor, two methods)
>
> ---
>
> Cc1(A) -> Cc1(Ac0)
>
> 0*: Inherited from A
> 1: Declared in C
>
> ---
>
> Cc1(I) -> Cc1(Id0)
>
> 0*: Inherited default from I
> 1: Declared in C
>
> =====
> Linear inheritance (two ancestors, no method in C)
>
> ---
>
> C(Bc1(A)) -> C(Bc1(Ac0))
>
> 0*: Inherited from A
> 1: Inherited from B
>
> ---
>
> C(B(Ac0)) -> C(Bc1(Ac0))
>
> 0: Inherited bridge from B
> 1: Inherited from B
>
> ---
>
> C(B(A)) -> C(Bc1(Ac0))
>
> 0: Inherited bridge from B
> 1: Inherited from B
>
> ---
>
> C(Ac1(I)) -> C(Ac1(Id0))
>
> 0*: Inherited default from I
> 1: Inherited from A
>
> ---
>
> C(A(Id0)) -> C(Ac1(Id0))
>
> 0: Inherited bridge from A
> 1: Inherited from A
>
> ---
>
> C(A(I)) -> C(Ac1(Id0))
>
> 0: Inherited bridge from A
> 1: Inherited from A
>
> ---
>
> C(Id1(J)) -> C(Id1(Jd0))
>
> 0*: Inherited default from J
> 1: Inherited default from I
>
> ---
>
> C(I(Jd0)) -> C(Id1(Jd0))
>
> 0: Inherited bridge from I
> 1: Inherited default from I
>
> ---
>
> C(I(J)) -> C(Id1(Jd0))
>
> 0: Inherited bridge from I
> 1: Inherited default from I
>
> =====
> Linear inheritance (two ancestors, method in C)
>
> ---
>
> Cc2(B(Am0)) -> Cc2(Bc1(Am0))
>
> 0: Bridge in C
> 1*: Inherited from B
> 2: Declared in C
>
> ---
>
> Cc2(Bm1(A)) -> Cc2(Bm1(Ac0))
>
> 0*: Inherited from A
> 1: Bridge in C
> 2: Declared in C
>
> ---
>
> Cc2(B(A)) -> Cc2(Bc1(Ac0))
>
> 0*: Inherited bridge from B
> 1*: Inherited from B
> 2: Declared in C
>
> ---
>
> Cc2(A(Im0)) -> Cc2(Ac1(Im0))
>
> 0: Bridge in C
> 1*: Inherited from A
> 2: Declared in C
>
> ---
>
> Cc2(Am1(I)) -> Cc2(Am1(Id0))
>
> 0*: Inherited from I
> 1: Bridge in C
> 2: Declared in C
>
> ---
>
> Cc2(A(I)) -> Cc2(Ac1(Id0))
>
> 0*: Inherited bridge from A
> 1*: Inherited from A
> 2: Declared in C
>
> ---
>
> Cc2(J(Im0)) -> Cc2(Jd1(Im0))
>
> 0: Bridge in C
> 1*: Inherited default from J
> 2: Declared in C
>
> ---
>
> Cc2(Jm1(I)) -> Cc2(Jm1(Id0))
>
> 0*: Inherited default from I
> 1: Bridge in C
> 2: Declared in C
>
> ---
>
> Cc2(J(I)) -> Cc2(Jd1(Id0))
>
> 0*: Inherited bridge from J
> 1*: Inherited default from J
> 2: Declared in C
>
> =====
> Independent branches (no method in C)
>
> ---
>
> C(Ac1, I) -> C(Ac1, Id0)
>
> 0*: Inherited default from I
> 1: Inherited from A
>
> ---
>
> C(A, Id0) -> C(Ac1, Id0)
>
> 0*: Inherited default from I
> 1: Inherited from A
>
> ---
>
> C(A, I) -> C(Ac1, Id0)
>
> 0*: Inherited default from I
> 1: Inherited from A
>
> =====
> Independent branches (method in C)
>
> ---
>
> Cc2(Am0, I) -> Cc2(Am0, Id1)
>
> 0: Bridge in C
> 1*: Inherited default from I
> 2: Declared in C
>
> ---
>
> Cc2(A, Im1) -> Cc2(Ac0, Im1)
>
> 0*: Inherited from A
> 1: Bridge in C
> 2: Declared in C
>
> ---
>
> Cc2(A, I) -> Cc2(Ac0, Id1)
>
> 0*: Inherited from A
> 1*: Inherited default from I
> 2: Declared in C
>
> ---
>
> Cc2(Im0, J) -> Cc2(Im0, Jd1)
>
> 0: Bridge in C
> 1*: Inherited default from J
> 2: Declared in C
>
> ---
>
> Cc2(I, J) -> Cc2(Id0, Jd1)
>
> 0*: Inherited default from I
> 1*: Inherited default from J
> 2: Declared in C
>
> =====
> Diamond branches (no method in C)
>
> ---
>
> C(A(Id0), J(Id0)) -> C(Ac1(Id0), J(Id0))
>
> 0: Inherited bridge from A
> 1: Inherited from A
>
> ---
>
> C(A(Id0), J(Id0)) -> C(A(Id0), Jd1(Id0))
>
> 0: Inherited bridge from J
> 1: Inherited default from J
>
> ---
>
> C(A(Id0), J(Id0)) -> C(Ac2(Id0), Jd1(Id0))
>
> 0: Inherited bridge from A (beats new bridge in J)
> 1*: Inherited default from J
> 2: Inherited from A
>
> ---
>
> C(J(Id0), K(Id0)) -> C(Jd1(Id0), K(Id0))
>
> 0: Inherited bridge from J
> 1: Inherited default from J
>
> ---
>
> C(Ac2(Im0), J(Im0)) -> C(Ac2(Im0), Jd1(Im0))
>
> 0: Inherited bridge from A (beats new bridge in J)
> 1*: Inherited default from J
> 2: Inherited from A
>
> ---
>
> C(A(Im0), Jd1(Im0)) -> C(Ac2(Im0), Jd1(Im0))
>
> 0: Inherited bridge from A (beats old bridge in J)
> 1*: Inherited default from J
> 2: Inherited from A
>
> =====
> Diamond branches (method in C)
>
> ---
>
> Cc2(A(Im0), J(Im0)) -> Cc2(Ac1(Im0), J(Im0))
>
> 0: Bridge in C
> 1*: Inherited from A
> 2: Declared in C
>
> ---
>
> Cc2(A(Im0), J(Im0)) -> Cc2(A(Im0), Jd1(Im0))
>
> 0: Bridge in C
> 1*: Inherited default from J
> 2: Declared in C
>
> ---
>
> Cc3(A(Im0), J(Im0)) -> Cc3(Ac1(Im0), Jd2(Im0))
>
> 0: Bridge in C
> 1*: Inherited from A
> 2*: Inherited default from J
> 3: Declared in C
>
> ---
>
> Cc2(J(Im0), K(Im0)) -> Cc2(Jd1(Im0), K(Im0))
>
> 0: Bridge in C
> 1*: Inherited default from J
> 2: Declared in C
>
> ---
>
> Cc3(J(Im0), K(Im0)) -> Cc3(Jd1(Im0), Kd2(Im0))
>
> 0: Bridge in C
> 1*: Inherited default from J
> 2*: Inherited default from K
> 3: Declared in C
>
> ---
>
> Cc3(Am1(Im0), J(Im0)) -> Cc3(Am1(Im0), Jd2(Im0))
>
> 0: Bridge in C
> 1: Bridge in C
> 2*: Inherited default from J
> 3: Declared in C
>
> ---
>
> Cc3(A(Im0), Jm2(Im0)) -> Cc3(Ac1(Im0), Jm2(Im0))
>
> 0: Bridge in C
> 1*: Inherited from A
> 2: Bridge in C
> 3: Declared in C
>
> ---
>
> Cc3(Jm1(Im0), K(Im0)) -> Cc3(Jm1(Im0), Kd2(Im0))
>
> 0: Bridge in C
> 1: Bridge in C
> 2*: Inherited default from K
> 3: Declared in C


