Updated VM-bridges document

Brian Goetz brian.goetz at oracle.com
Mon Apr 8 01:18:43 UTC 2019


That’s just a typo.  There’s just one attribute.  


> On Apr 7, 2019, at 5:30 PM, Remi Forax <forax at univ-mlv.fr> wrote:
> 
> Hi Brian,
> wearing my JVM language implementor hat,
> the ForwardingBridge attribute is important because it lets us simulate conversions that are not co/contravariant for the VM but are from the language's point of view, such as between an int and a Long.
> 
> I don't see the point of having two attributes; given that ForwardingBridge supports a superset of what Forwarding supports, it seems to be a premature optimization to me.
> 
> The other thing is that a forwarding bridge should not use an adapter but a bootstrap method.
> Historically, indy used something similar to your adapter, but that design doesn't support sharing code, because the adapter code has to be present in the same class as the forwarding method.
> If you want to share code, the shared code may need access to the class containing the forwarding method, which is why you need a Lookup object, and the shared code may also want some specific arguments, aka the bootstrap arguments.
> 
> I think John has already proposed that we should support a bootstrap method that returns a MethodHandle instead of a CallSite, this is exactly what we need here.
> 
> regards,
> Rémi
> 
> ----- Original Message -----
>> From: "Brian Goetz" <brian.goetz at oracle.com>
>> To: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
>> Sent: Thursday, April 4, 2019 14:33:39
>> Subject: Updated VM-bridges document
> 
>> At the BUR meeting, we discussed reshuffling the dependency graph to do
>> forwarding+reversing bridges earlier, which has the effect of taking some
>> pressure off of the descriptor language.  Here’s an updated doc on
>> forwarding-reversing bridges in the VM.
>> 
>> I’ve dropped, for the time being, any discussion of replacing existing generic
>> bridges with this mechanism; we can revisit that later if it makes sense.
>> Instead, I’ve focused solely on the migration aspects.  I’ve also dropped any
>> mention of implementation strategy, and instead appealed to “as if” behavior.
>> 
>> 
>> ## From Bridges to Forwarders
>> 
>> In the Java 1.0 days, `javac` was little more than an "assembler" for
>> the classfile format, translating source code to bytecode in a mostly
>> 1:1 manner.  And, we liked it that way; the more predictable the
>> translation scheme, the more effective the runtime optimizations.
>> Even the major upgrade of Java 5 didn't significantly affect the
>> transparency of the resulting classfiles.
>> 
>> Over time, we've seen small divergences between the language model and
>> the classfile model, and each of these is a source of sharp edges.  In
>> Java 1.1 the addition of inner classes, and the mismatch between the
>> accessibility model in the language and the JVM (the language treated
>> a nest as a single entity; the JVM treats nest members as separate
>> classes) required _access bridges_ (`access$000` methods), which have
>> been the source of various issues over the years.  Twenty years later,
>> these methods were obviated by [_Nest-based Access Control_][jep181]
>> -- which represents the choice to align the VM model to the language
>> model, so these adaptation artifacts are no longer required.
>> 
>> In Java 5, while we were able to keep the translation largely stable
>> and transparent through the use of erasure, there was one point of
>> misalignment; several situations (covariant overrides, instantiated
>> generic supertypes) could give rise to the situation where two or more
>> method descriptors -- which the JVM treats as distinct methods -- are
>> treated by the language as if they correspond to the same method.  To
>> fool the VM, the compiler emits _bridge methods_ which forward
>> invocations from one signature to another.  And, as often happens when
>> we try to fool the VM, it ultimately has its revenge.
>> 
>> #### Example: covariant overrides
>> 
>> Java 5 introduced the ability to override a method but to provide a
>> more specific return type.  (Java 8 later extended this to bridges in
>> interfaces as well.)  For example:
>> 
>> ```{.java}
>> class Parent {
>>   Object m() { ... }
>> }
>> 
>> class Child extends Parent {
>>   @Override
>>   String m() { ... }
>> }
>> ```
>> 
>> `Parent` declares a method whose descriptor is `()Object`, and `Child`
>> declares a method with the same name whose descriptor is `()String`.
>> If we compiled this class in the obvious way, the method in `Child`
>> would not override the method in `Parent`, and anyone calling
>> `Parent.m()` would find themselves executing the wrong implementation.
>> 
>> The compiler addresses this by providing an additional implementation
>> of `m()`, whose descriptor is `()Object` (an actual override), marked
>> with `ACC_SYNTHETIC` and `ACC_BRIDGE`, whose body invokes `m()String`
>> (with `invokevirtual`), redirecting calls to the right implementation.
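
The bridge described above can be observed with reflection. What follows is only a minimal sketch: the wrapper class `CovariantBridgeDemo` and the helper `describe()` are hypothetical names introduced for illustration, and the method bodies are placeholders.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CovariantBridgeDemo {
    static class Parent {
        Object m() { return "parent"; }
    }

    static class Child extends Parent {
        @Override
        String m() { return "child"; }   // covariant return type
    }

    // Lists every method named "m" declared in Child as
    // "<return type>:<is it a bridge?>", sorted for a stable order.
    static List<String> describe() {
        List<String> out = new ArrayList<>();
        for (Method method : Child.class.getDeclaredMethods()) {
            if (method.getName().equals("m")) {
                out.add(method.getReturnType().getSimpleName() + ":" + method.isBridge());
            }
        }
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        // Child declares two methods named m: the bridge and the real one.
        System.out.println(describe());
        // A call through the Parent descriptor still reaches Child's body.
        Parent p = new Child();
        System.out.println(p.m());
    }
}
```

Compiled with `javac`, `Child` ends up with two methods named `m`: the one written in source, and a compiler-generated bridge returning `Object`.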
>> 
>> #### Example: generic substitution
>> 
>> A similar situation arises when we have a generic substitution with a
>> superclass.  For example:
>> 
>> ```{.java}
>> interface Parent<T> {
>>   void m(T x);
>> }
>> 
>> class Child implements Parent<String> {
>>   @Override
>>   public void m(String x) { ... }
>> }
>> ```
>> 
>> At the language level, it is clear that `Child::m` intends to override
>> `Parent::m`.  But the descriptor of `Parent::m` is `(Object)V`, and
>> the descriptor of `Child::m` is `(String)V`, so again a bridge is
>> needed.
>> 
>> Because the two signatures -- `m(Object)V` and `m(String)V` -- have
>> been "merged" in this manner, the compiler will prevent subclasses
>> from overriding the bridge signature, in order to maintain the
>> integrity of the bridging scheme.  (The first time you encounter an
>> error message informing you of an illegal override in this situation,
>> it can be extremely confusing!)
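
As an illustration of the adaptation a generic bridge performs, here is a small sketch that calls `m` through the erased descriptor, as a raw or out-of-date client would. The class `GenericBridgeDemo`, the `seen` field, and the helper `callThroughErasure` are hypothetical names for this example only.

```java
public class GenericBridgeDemo {
    interface Parent<T> {
        void m(T x);
    }

    static class Child implements Parent<String> {
        final StringBuilder seen = new StringBuilder();

        @Override
        public void m(String x) { seen.append(x); }
    }

    // Invokes m via the erased descriptor (Object)V. The call lands on the
    // compiler-generated bridge, which casts the argument to String before
    // forwarding to m(String).
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String callThroughErasure(Object arg) {
        Child child = new Child();
        Parent raw = child;
        try {
            raw.m(arg);
            return "ok:" + child.seen;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughErasure("hi"));
        System.out.println(callThroughErasure(42));
    }
}
```

The failing case shows that the cast lives in the bridge, not in the caller: the adaptation from `Object` to `String` is part of the bridge body.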
>> 
>> #### Anatomy of a bridge method
>> 
>> The bridge methods that are generated by the compiler today operate by
>> _forwarding_.  That is, a bridge method `m(X)` is always defined
>> relative to some other method `m(Y)`.  The body of a bridge method
>> pushes its arguments on the stack, adapts them from X to Y (widening,
>> casting, boxing, etc.), invokes `m(Y)` with `invokevirtual`, adapts
>> the return value from Y to X, and returns it.  Because the bridge
>> uses `invokevirtual`, it need only
>> be generated once, and invocations of the bridge may select a method
>> in a subclass.  (The bridge is generated at the "highest" place in the
>> inheritance hierarchy where the need for a bridge is identified,
>> which may be a class or an interface.)
>> 
>> #### Bridges are brittle
>> 
>> Bridges can be brittle under separate compilation (and, there was a
>> nontrivial bug tail initially.)  Separate compilation can move bridges
>> from where already-compiled code expects them to be to places it does
>> not expect them.  This can cause the wrong method body to be invoked,
>> or can cause "bridge loops" (resulting in `StackOverflowError`).
>> (These anomalies disappear if the entire hierarchy is consistently
>> recompiled; they are solely an artifact of inconsistent separate
>> compilation.)
>> 
>> The basic problem with bridge methods is that the language views the
>> two method descriptors as two faces of the same actual method, whereas
>> the JVM sees them as distinct methods.  (And, reflection also has to
>> participate in the charade.)
>> 
>> #### Limits of bridge methods
>> 
>> Bridge methods have worked well enough for the uses to which we've put
>> them, but there are a number of desirable scenarios where bridge
>> methods ultimately run out of gas.  These scenarios stem from various
>> forms of _migration_, and the desire to make these migrations
>> binary-compatible.
>> 
>> The problem of migration arises both from language evolution (Valhalla
>> aims to enable compatible migration from value-based classes to value
>> types, and from erased generics to specialized generics), as well as
>> from the ordinary evolution of libraries.
>> 
>> An example of the "ordinary migration" problem is the replacement of
>> the old `Date` classes with `LocalDateTime` and friends.  We can
>> easily add the new classes to the JDK, along with conversions to and
>> from the old types, but there are existing APIs that still deal in
>> `Date` -- and if we ever want to be able to deprecate the old
>> versions, we have to find a way to compatibly migrate APIs that deal
>> in `Date` to the new types.  (The extreme form of this is the
>> "Collections 2.0" problem; we could surely write a new Collections
>> library, but when nearly every API deals in `List`, unless we can
>> migrate these away, what would be the point?)
>> 
>> Migration scenarios like these pose two problems that bridge methods
>> cannot solve:
>> 
>> - **Fields.**  While we can often reroute method invocations with
>>   bridges, we have no similar mechanism for fields.  If a field
>>   signature changes (whether due to changes in the translation
>>   strategy, or changes in the API), there is no way to make this
>>   binary-compatible.
>> - **Overrides.**  Bridges allow us to reroute _invocations_ of
>>   methods, but not _overrides_ of methods.  If a method descriptor
>>   in a non-final class changes, but has subclasses in a separate
>>   maintenance domain that continue to use the old descriptor, what
>>   is intended to be an override may accidentally become an overload,
>>   or might override the bridge instead of the actual method.
>> 
>> #### Wildcards and polymorphic fields
>> 
>> A non-migration application for bridges that comes out of Valhalla is
>> _wildcards_.  For a class `C<T>` with a method `m(T)`, the wildcard
>> `C<?>` (the class type) has an abstract method `m(Object)`, which
>> needs to be implemented by each species type.  This is, effectively, a
>> bridge; the method `m(Object)` generated for the species adapts the
>> arguments and forwards to the "real" (`m(T)`) method.  While this
>> could be implemented using straightforward code generation in the
>> static compiler, it may be preferable to treat this as a bridge as
>> well.
>> 
>> More importantly, the same is true for fields; if `C<T>` has a field
>> of type `T`, then the wildcard `C<?>` will expose this field as if it
>> were of type `Object`.  This cannot be implemented using
>> straightforward code generation in the static compiler (without
>> undermining the promise of migration compatibility.)
>> 
>> ## Forwarding
>> 
>> In this document, we attempt to learn from the history of bridges, and
>> create a new mechanism -- _forwarders_ -- that works with the JVM
>> instead of against it.  This raises the level of expressivity of
>> classfiles and opens the possibility of greater laziness.  It is
>> possible that traditional bridging scenarios can eventually be handled
>> by forwarders too, but for purposes of this document, we will focus
>> exclusively on the migration scenarios.
>> 
>> A _forwarder_ is a non-abstract method that, instead of a `Code`
>> attribute, has a `Forwarding` attribute:
>> 
>> ```
>> Forwarding {
>>   u2 name;
>>   u4 length;
>>   u2 forwardeeDescriptor;
>> }
>> ```
>> 
>> Let's assume that forwarders have the `ACC_FORWARDER` and
>> `ACC_SYNTHETIC` bits (in reality we will likely overload
>> `ACC_BRIDGE`).
>> 
>> When compiling a method (concrete or abstract) that has been migrated
>> from an old descriptor to a new descriptor (such as migrating
>> `m(Object)V` to `m(String)V`), the compiler would generate an ordinary
>> method with the new descriptor, and a forwarder with the old
>> descriptor which forwards to the new descriptor.  This captures the
>> statement that there used to be a method called `m` with the old
>> descriptor, but it migrated to the new descriptor -- so that the JVM
>> can transparently adjust the behavior of clients and overriders that
>> were not aware of the migration.
>> 
>> #### Invocation of forwarders
>> 
>> Given a forwarder in a class with name `N` and descriptor `D` that
>> forwards to descriptor `E`, define `M` by:
>> 
>>   MethodHandle M = MethodHandles.lookup()
>>                                 .findVirtual(thisClass, N, E);
>> 
>> If the forwarder is _selected_ as the target of an `invokevirtual`,
>> the behavior should be _as if_ the caller invoked `M.asType(D)`, where
>> the arguments of `D` are adapted to their counterparts in `E`, and the
>> return type in `E` is adapted back to the return type in `D`.  (We may
>> wish to reduce the set of built-in adaptations to a smaller set than
>> those implemented by `MethodHandle::asType`, for simplicity, based on
>> requirements.)
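
The "as if" behavior above can be approximated today with the existing method handle API. This is a sketch under stated assumptions, not the proposed VM mechanism: the class `Target`, its method `m`, and the helper `callOld` are hypothetical, with `D` taken to be `(Object)Object` and `E` taken to be `(String)String`.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class ForwarderInvocationSketch {
    public static class Target {
        // The migrated method, with forwardee descriptor E = (String)String.
        public String m(String s) { return "new:" + s; }
    }

    // Calls m through the old descriptor D = (Object)Object, as a client
    // compiled before the migration would.
    static String callOld(Object arg) {
        try {
            MethodType e = MethodType.methodType(String.class, String.class);
            MethodType d = MethodType.methodType(Object.class, Object.class);
            MethodHandle m = MethodHandles.lookup().findVirtual(Target.class, "m", e);
            // findVirtual prepends the receiver, so D gets the receiver too.
            MethodHandle forwarder = m.asType(d.insertParameterTypes(0, Target.class));
            Object result = forwarder.invoke(new Target(), arg);
            return String.valueOf(result);
        } catch (ClassCastException ex) {
            return "ClassCastException";   // argument could not be adapted to E
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callOld("x"));
        System.out.println(callOld(42));
    }
}
```

Here `asType` inserts a cast from `Object` to `String` on the way in and a widening from `String` to `Object` on the way out, which is the pair of adaptations the text describes.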
>> 
>> Because forwarders exist for migration, we hope that over time,
>> callers will migrate from the old descriptor to the new, rendering
>> forwarders vestigial.  As a result, we may wish to defer as much of
>> the bridge generation logic as possible to first-selection time.
>> 
>> #### Forwarders for fields
>> 
>> The forwarding strategy can be applied to fields as well.  In this
>> case, the forwardee descriptor is that of a field descriptor, and the
>> behavior has the same semantics as adapting a target field accessor
>> method handle to the type of the bridge descriptor.  (If the forwarder
>> field is static, then the forwardee field should be static too.)
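
The field case can be sketched the same way, by adapting field accessor method handles. The class `Holder`, its `name` field, and the helper `writeThenRead` are hypothetical; the sketch assumes a field migrated from `Object` to `String`.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class FieldForwarderSketch {
    public static class Holder {
        public String name = "initial";   // the migrated field, now of type String
    }

    // Writes and then reads the field through accessors adapted to the old
    // field type, Object.
    static String writeThenRead(Object newValue) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle getter = lookup.findGetter(Holder.class, "name", String.class)
                .asType(MethodType.methodType(Object.class, Holder.class));
            MethodHandle setter = lookup.findSetter(Holder.class, "name", String.class)
                .asType(MethodType.methodType(void.class, Holder.class, Object.class));
            Holder h = new Holder();
            setter.invoke(h, newValue);    // old-style putfield: Object in
            Object seen = getter.invoke(h); // old-style getfield: Object out
            return String.valueOf(seen);
        } catch (ClassCastException ex) {
            return "ClassCastException";   // the write could not be adapted
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead("x"));
        System.out.println(writeThenRead(42));
    }
}
```

Reads always succeed in this direction, while writes can fail, which previews the asymmetry between the two directions of adaptation.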
>> 
>> #### Overriding of forwarders
>> 
>> Capturing forwarding information declaratively enables us to detect
>> when a class overrides a forwarder descriptor with a non-forwarder
>> (which indicates that the subclass is out of date with its supertypes)
>> and redirect the override to the actual method (with arguments and
>> return values adapted.)
>> 
>> Given a forwarder in a class `A` with name `N` and descriptor `D` that
>> forwards to descriptor `E`, suppose a subclass `B` overrides the
>> forwarder with `N(D)`.  Let `M` be the method handle that corresponds
>> to the `Code` attribute of `B.N(D)`.  We would like it to behave as if
>> `B` had instead specified a method `N(E)`, whose `Code` attribute
>> corresponded to `M.asType(E)`.
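
The reverse direction can be sketched as follows: an out-of-date override written against the old descriptor is viewed through the new one. The class `LegacyB` and the helper `callAsNew` are hypothetical, with `D = (Object)Object` and `E = (String)String`, and `findVirtual` stands in for "the method handle that corresponds to the `Code` attribute".

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class OverrideAdaptationSketch {
    // An out-of-date subclass body, still written against D = (Object)Object.
    public static class LegacyB {
        public Object n(Object o) {
            // Returns a non-String for the empty string, to show a failing case.
            return "".equals(o) ? (Object) Integer.valueOf(0) : "B:" + o;
        }
    }

    // Invokes the legacy body as if it had been declared with E = (String)String.
    static String callAsNew(String arg) {
        try {
            MethodHandle m = MethodHandles.lookup().findVirtual(
                LegacyB.class, "n", MethodType.methodType(Object.class, Object.class));
            MethodHandle asNew = m.asType(
                MethodType.methodType(String.class, LegacyB.class, String.class));
            return (String) asNew.invoke(new LegacyB(), arg);
        } catch (ClassCastException ex) {
            return "ClassCastException";   // legacy return value did not fit E
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callAsNew("x"));
        System.out.println(callAsNew(""));
    }
}
```

In this direction the arguments widen safely and the return value is the part that can fail, the mirror image of the invocation case.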
>> 
>> #### Additional adaptations
>> 
>> The uses we anticipate for L100 all can be done with `asType()`
>> adaptations (in fact, with a subset of `asType()` adaptations).
>> However, if we wish to support user-provided migrations (such as
>> migrating libraries that use `Date` to `LocalDateTime`) or migrate
>> complex JDK APIs such as `Stream`, we may need to provide additional
>> adaptation logic in the `ForwardingBridge` attribute.  Let's extend
>> the `Forwarding` attribute:
>> 
>> ```
>> Forwarding {
>>   u2 name;
>>   u4 length;
>>   u2 forwardeeDescriptor;
>>   u2 adapter;
>> }
>> ```
> 
> it's ForwardingBridge here ?
> 
>> 
>> where `adapter` is the constant pool index of a method handle whose
>> type is `(MethodHandle;MethodType;)MethodHandle;` (note that the
>> method handle for `MethodHandle::asType` has this shape).  If
>> `adapter` is zero, we use the built-in adaptations; if it is nonzero,
>> we use the referred-to method handle to adapt between the forwarder
>> and forwardee descriptors (in both directions).
>> 
>> #### Adaptation failures and limitations
>> 
>> Whatever adaptations we are prepared to do between forwarder and
>> forwardee, we need to be prepared to do them in both directions; if a
>> method `m(int)` is migrated to `m(long)`, invocation arguments will be
>> adapted `int` to `long`, but if overridden, we'll do the reverse
>> adaptation on the (out of date) overrider `m(int)`.  Given that most
>> adaptations are not between isomorphic domains, there will be cases in
>> one direction or the other that cannot be represented (`long` to
>> `int` is lossy; `Integer` to `int` can NPE; `Object` to `String` can
>> CCE.)
>> 
>> Our guidance is that adaptations should form a projection/embedding
>> pair; this gives us the nice property that we can repeat adaptations
>> with impunity (if the first adaptation doesn't fail, adapting back and
>> back again is guaranteed to be an identity.)  Even within this,
>> though, there are often multiple ways to implement the adaptation; an
>> embedding can throw on an out-of-range value, or it could pick an
>> in-range target and map to that.  So, for example, if we migrated
>> `Collection::size` to return `long`, for `int`-desiring clients, we
>> could clamp values greater than `Integer.MAX_VALUE` to
>> `Integer.MAX_VALUE`, rather than throwing -- and this would likely be
>> a better outcome for most
>> clients.  The choice of adaptation should ultimately be left to
>> metadata present at the declaration of the migrated method.
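
A clamping adaptation of this kind might look like the sketch below. It is written as a method with the `(MethodHandle, MethodType)MethodHandle` shape, but everything in it is an assumption made for illustration: the class `BigCollection`, the methods `clamp`, `adapt` and `oldSize` are hypothetical, and only the invocation direction (a `long` result seen by an `int`-desiring client) is covered.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class ClampAdapterSketch {
    public static class BigCollection {
        private final long size;
        public BigCollection(long size) { this.size = size; }
        public long size() { return size; }   // migrated forwardee, returns long
    }

    // Projection from long to int that saturates instead of throwing.
    static int clamp(long v) {
        return (int) Math.max(Integer.MIN_VALUE, Math.min(Integer.MAX_VALUE, v));
    }

    // Hypothetical user-provided adapter: clamps the long result, then lets
    // asType handle whatever remains.
    static MethodHandle adapt(MethodHandle target, MethodType forwarderType)
            throws ReflectiveOperationException {
        MethodHandle clamp = MethodHandles.lookup().findStatic(
            ClampAdapterSketch.class, "clamp",
            MethodType.methodType(int.class, long.class));
        return MethodHandles.filterReturnValue(target, clamp).asType(forwarderType);
    }

    // What a client of the old int-returning size() would observe.
    static int oldSize(long actualSize) {
        try {
            MethodHandle forwardee = MethodHandles.lookup().findVirtual(
                BigCollection.class, "size", MethodType.methodType(long.class));
            MethodHandle forwarder = adapt(
                forwardee, MethodType.methodType(int.class, BigCollection.class));
            return (int) forwarder.invoke(new BigCollection(actualSize));
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(oldSize(7L));
        System.out.println(oldSize(5_000_000_000L) == Integer.MAX_VALUE);
    }
}
```

Plain `asType` cannot express this case at all, since it does not perform narrowing primitive conversions such as `long` to `int`; that is one reason a user-supplied adaptation may be needed.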
>> 
>> #### Type checking and corner cases
>> 
>> A forwarder should always forward to a non-forwarder method (concrete
>> or abstract) _in the same class_.  (Because they are in the same
>> class, there is no chance that separate compilation can cause a
>> forwarder to point to another forwarder.)
>> 
>> In general, we expect that forwarders are only ever overridden by
>> non-forwarder methods (and then, only in out-of-date classfiles).
>> (This means that invocations that resolve to the forwarder will
>> generally select the forwarder.)
>> 
>> - If a forwarder method is overridden by another forwarder method,
>>   this is probably a result of a migration happening in a subclass
>>   and then later the same migration happens in a superclass.  We can
>>   let the override proceed.
>> - If a forwarder is overridden by a legacy bridge, we have a few bad
>>   choices.  We could accept the bridge (which would interfere with
>>   forwarding), or discard the bridge (which could cause other
>>   anomalies.)  If we leave existing bridge generation alone, this
>>   case is unlikely and accepting the bridge is probably a reasonable
>>   answer; if we migrate bridges to use forwarding, we'd probably
>>   want to err in the other direction.
>> - If a forwarder has a forwardee descriptor that is exactly the
>>   same as the forwarder, the forwarder should be discarded.  (These
>>   can arise from specialization situations.)
>> 
>> 
>> 
>> 
>> [jep181]: https://openjdk.java.net/jeps/181


