From brian.goetz at oracle.com Thu Apr 4 12:33:39 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 4 Apr 2019 08:33:39 -0400
Subject: Updated VM-bridges document
Message-ID:

At the BUR meeting, we discussed reshuffling the dependency graph to do forwarding+reversing bridges earlier, which has the effect of taking some pressure off of the descriptor language. Here's an updated doc on forwarding-reversing bridges in the VM.

I've dropped, for the time being, any discussion of replacing existing generic bridges with this mechanism; we can revisit that later if it makes sense. Instead, I've focused solely on the migration aspects. I've also dropped any mention of implementation strategy, and instead appealed to "as if" behavior.

## From Bridges to Forwarders

In the Java 1.0 days, `javac` was little more than an "assembler" for the classfile format, translating source code to bytecode in a mostly 1:1 manner. And, we liked it that way; the more predictable the translation scheme, the more effective the runtime optimizations. Even the major upgrade of Java 5 didn't significantly affect the transparency of the resulting classfiles.

Over time, we've seen small divergences between the language model and the classfile model, and each of these is a source of sharp edges. In Java 1.1, the addition of inner classes, together with the mismatch between the accessibility model in the language and in the JVM (the language treated a nest as a single entity; the JVM treats nest members as separate classes), required _access bridges_ (`access$000` methods), which have been the source of various issues over the years. Twenty years later, these methods were obviated by [_Nest-based Access Control_][jep181] -- which represents the choice to align the VM model to the language model, so these adaptation artifacts are no longer required.

In Java 5, while we were able to keep the translation largely stable and transparent through the use of erasure, there was one point of misalignment; several situations (covariant overrides, instantiated generic supertypes) could give rise to the situation where two or more method descriptors -- which the JVM treats as distinct methods -- are treated by the language as if they correspond to the same method. To fool the VM, the compiler emits _bridge methods_ which forward invocations from one signature to another. And, as often happens when we try to fool the VM, it ultimately has its revenge.

#### Example: covariant overrides

Java 5 introduced the ability to override a method but to provide a more specific return type. (Java 8 later extended this to bridges in interfaces as well.) For example:

```{.java}
class Parent {
    Object m() { ... }
}

class Child extends Parent {
    @Override
    String m() { ... }
}
```

`Parent` declares a method whose descriptor is `()Object`, and `Child` declares a method with the same name whose descriptor is `()String`. If we compiled this class in the obvious way, the method in `Child` would not override the method in `Parent`, and anyone calling `Parent.m()` would find themselves executing the wrong implementation.

The compiler addresses this by providing an additional implementation of `m()`, whose descriptor is `()Object` (an actual override), marked with `ACC_SYNTHETIC` and `ACC_BRIDGE`, whose body invokes `m()String` (with `invokevirtual`), redirecting calls to the right implementation.

#### Example: generic substitution

A similar situation arises when we have a generic substitution with a supertype. For example:

```{.java}
interface Parent<T> {
    void m(T x);
}

class Child implements Parent<String> {
    @Override
    public void m(String x) { ... }
}
```

At the language level, it is clear that `Child::m` intends to override `Parent::m`. But the descriptor of `Parent::m` is `(Object)V`, and the descriptor of `Child::m` is `(String)V`, so again a bridge is needed.

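The synthetic bridge described above can be observed through reflection. The following is a minimal sketch, not part of the original document: the class name `BridgeReflectionDemo` and the helpers `countBridges` and `callViaParent` are illustrative, and the method bodies are filled in only so the example compiles. It assumes a compiler that emits bridges as `javac` does.

```java
import java.lang.reflect.Method;

public class BridgeReflectionDemo {
    public static class Parent {
        public Object m() { return "parent"; }
    }

    public static class Child extends Parent {
        @Override
        public String m() { return "child"; }   // covariant return type
    }

    // Counts the declared methods that the compiler marked ACC_BRIDGE.
    public static int countBridges(Class<?> c) {
        int n = 0;
        for (Method method : c.getDeclaredMethods()) {
            if (method.isBridge()) {
                n++;
            }
        }
        return n;
    }

    // Invokes m() through the Parent type, i.e. via the ()Object descriptor.
    public static Object callViaParent() {
        Parent p = new Child();
        return p.m();
    }

    public static void main(String[] args) {
        // Child declares two methods named m: the real one and the bridge.
        for (Method method : Child.class.getDeclaredMethods()) {
            System.out.println(method.getReturnType().getSimpleName() + " " + method.getName()
                    + "() bridge=" + method.isBridge() + " synthetic=" + method.isSynthetic());
        }
        System.out.println(callViaParent());
    }
}
```

Reflection reports both methods, which is one concrete way in which reflection has to take part in keeping the two descriptors looking like one method.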
Because the two signatures -- `m(Object)V` and `m(String)V` -- have been "merged" in this manner, the compiler will prevent subclasses from overriding the bridge signature, in order to maintain the integrity of the bridging scheme. (The first time you encounter an error message informing you of an illegal override in this situation, it can be extremely confusing!)

#### Anatomy of a bridge method

The bridge methods that are generated by the compiler today operate by _forwarding_. That is, a bridge method `m(X)` is always defined relative to some other method `m(Y)`, and the body of a bridge method pushes its arguments on the stack, adapts them (widening, casting, boxing, etc.) from X to Y, invokes `m(Y)` with `invokevirtual`, adapts the return value from Y to X, and returns it. Because the bridge uses `invokevirtual`, it need only be generated once, and invocations of the bridge may select a method in a subclass. (The bridge is generated at the "highest" place in the inheritance hierarchy where the need for a bridge is identified, which may be a class or an interface.)

#### Bridges are brittle

Bridges can be brittle under separate compilation (and there was a nontrivial bug tail initially). Separate compilation can move bridges from where already-compiled code expects them to be to places where it does not expect them. This can cause the wrong method body to be invoked, or can cause "bridge loops" (resulting in `StackOverflowError`). (These anomalies disappear if the entire hierarchy is consistently recompiled; they are solely an artifact of inconsistent separate compilation.)

The basic problem with bridge methods is that the language views the two method descriptors as two faces of the same actual method, whereas the JVM sees them as distinct methods. (And, reflection also has to participate in the charade.)

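The argument adaptation performed by a bridge (see "Anatomy of a bridge method") can be made visible by calling through the erased descriptor with an argument of the wrong type. This is a small sketch under the assumption that the compiler emits the usual casting bridge; the names `BridgeAdaptationDemo` and `bridgeRejects` are illustrative only.

```java
public class BridgeAdaptationDemo {
    public interface Parent<T> {
        void m(T x);
    }

    public static class Child implements Parent<String> {
        public String last;

        @Override
        public void m(String x) { last = x; }
    }

    // Calls m through the erased (Object)V descriptor and reports whether the
    // cast inside the compiler-generated bridge rejected the argument.
    @SuppressWarnings({"rawtypes", "unchecked"})
    public static boolean bridgeRejects(Object arg) {
        Parent raw = new Child();
        try {
            raw.m(arg);            // selects the bridge m(Object)V in Child
            return false;
        } catch (ClassCastException e) {
            return true;           // the bridge's cast from Object to String failed
        }
    }

    public static void main(String[] args) {
        System.out.println(bridgeRejects("hello"));
        System.out.println(bridgeRejects(42));
    }
}
```

A `String` argument passes through the bridge to `m(String)`, while an `Integer` fails in the bridge before the real method is reached.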
#### Limits of bridge methods

Bridge methods have worked well enough for the uses to which we've put them, but there are a number of desirable scenarios where bridge methods ultimately run out of gas. These scenarios stem from various forms of _migration_, and the desire to make these migrations binary-compatible.

The problem of migration arises both from language evolution (Valhalla aims to enable compatible migration from value-based classes to value types, and from erased generics to specialized generics), as well as from the ordinary evolution of libraries.

An example of the "ordinary migration" problem is the replacement of the old `Date` classes with `LocalDateTime` and friends. We can easily add the new classes to the JDK, along with conversions to and from the old types, but there are existing APIs that still deal in `Date` -- and if we ever want to be able to deprecate the old versions, we have to find a way to compatibly migrate APIs that deal in `Date` to the new types. (The extreme form of this is the "Collections 2.0" problem; we could surely write a new Collections library, but when nearly every API deals in `List`, unless we can migrate these away, what would be the point?)

Migration scenarios like these pose two problems that bridge methods cannot solve:

 - **Fields.** While we can often reroute method invocations with bridges, we have no similar mechanism for fields. If a field signature changes (whether due to changes in the translation strategy, or changes in the API), there is no way to make this binary-compatible.
 - **Overrides.** Bridges allow us to reroute _invocations_ of methods, but not _overrides_ of methods. If a method descriptor in a non-final class changes, but has subclasses in a separate maintenance domain that continue to use the old descriptor, what is intended to be an override may accidentally become an overload, or might override the bridge instead of the actual method.

#### Wildcards and polymorphic fields

A non-migration application for bridges that comes out of Valhalla is _wildcards_. For a class `C<T>` with a method `m(T)`, the wildcard `C<?>` (the class type) has an abstract method `m(Object)`, which needs to be implemented by each species type. This is, effectively, a bridge; the method `m(Object)` generated for the species adapts the arguments and forwards to the "real" (`m(T)`) method. While this could be implemented using straightforward code generation in the static compiler, it may be preferable to treat this as a bridge as well.

More importantly, the same is true for fields; if `C<T>` has a field of type `T`, then the wildcard `C<?>` will expose this field as if it were of type `Object`. This cannot be implemented using straightforward code generation in the static compiler (without undermining the promise of migration compatibility).

## Forwarding

In this document, we attempt to learn from the history of bridges, and create a new mechanism -- _forwarders_ -- that works with the JVM instead of against it. This raises the level of expressivity of classfiles and opens the possibility of greater laziness. It is possible that traditional bridging scenarios can eventually be handled by forwarders too, but for purposes of this document, we will focus exclusively on the migration scenarios.

A _forwarder_ is a non-abstract method that, instead of a `Code` attribute, has a `Forwarding` attribute:

```
Forwarding {
    u2 name;
    u4 length;
    u2 forwardeeDescriptor;
}
```

Let's assume that forwarders have the `ACC_FORWARDER` and `ACC_SYNTHETIC` bits (in reality we will likely overload `ACC_BRIDGE`).

When compiling a method (concrete or abstract) that has been migrated from an old descriptor to a new descriptor (such as migrating `m(Object)V` to `m(String)V`), the compiler would generate an ordinary method with the new descriptor, and a forwarder with the old descriptor which forwards to the new descriptor.
This captures the statement that there used to be a method called `m` with the old descriptor, but it migrated to the new descriptor -- so that the JVM can transparently adjust the behavior of clients and overriders that were not aware of the migration.

#### Invocation of forwarders

Given a forwarder in a class with name `N` and descriptor `D` that forwards to descriptor `E`, define `M` by:

    MethodHandle M = MethodHandles.lookup()
                                  .findVirtual(thisClass, N, E);

If the forwarder is _selected_ as the target of an `invokevirtual`, the behavior should be _as if_ the caller invoked `M.asType(D)`, where the arguments of `D` are adapted to their counterparts in `E`, and the return type in `E` is adapted back to the return type in `D`. (We may wish to reduce the set of built-in adaptations to a smaller set than those implemented by `MethodHandle::asType`, for simplicity, based on requirements.)

Because forwarders exist for migration, we hope that over time, callers will migrate from the old descriptor to the new, rendering forwarders vestigial. As a result, we may wish to defer as much of the bridge generation logic as possible to first-selection time.

#### Forwarders for fields

The forwarding strategy can be applied to fields as well. In this case, the forwardee descriptor is a field descriptor, and the behavior has the same semantics as adapting a target field accessor method handle to the type of the forwarder's descriptor. (If the forwarder field is static, then the forwardee field should be static too.)

#### Overriding of forwarders

Capturing forwarding information declaratively enables us to detect when a class overrides a forwarder descriptor with a non-forwarder (which indicates that the subclass is out of date with its supertypes) and redirect the override to the actual method (with arguments and return values adapted).

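The "as if" rule for invoking forwarders can be emulated today with ordinary method handles. The sketch below is only an illustration under stated assumptions: there is no `Forwarding` attribute in any shipping JVM, so the forwarder is modeled by hand; the names `ForwarderInvocationSketch`, `Lib`, `methodForwarder`, `fieldForwarder` and `roundTrip` are hypothetical; and `D = (Object)V`, `E = (String)V` follow the migration example used earlier.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class ForwarderInvocationSketch {
    public static class Lib {
        public String seen;   // a field whose type is the "new" type

        // The migrated method: new descriptor E = (String)V.
        public void m(String x) { seen = x; }
    }

    // Emulates a forwarder with old descriptor D = (Object)V forwarding to E.
    public static MethodHandle methodForwarder() throws ReflectiveOperationException {
        MethodType e = MethodType.methodType(void.class, String.class);
        MethodType d = MethodType.methodType(void.class, Object.class);
        MethodHandle m = MethodHandles.lookup().findVirtual(Lib.class, "m", e);
        // findVirtual prepends the receiver, so D gets a leading Lib parameter too.
        return m.asType(d.insertParameterTypes(0, Lib.class));
    }

    // Emulates a field forwarder: a getter for the String field, viewed as
    // if the field had type Object.
    public static MethodHandle fieldForwarder() throws ReflectiveOperationException {
        MethodHandle getter = MethodHandles.lookup().findGetter(Lib.class, "seen", String.class);
        return getter.asType(MethodType.methodType(Object.class, Lib.class));
    }

    // Calls m through the old descriptor, then reads the field through the old type.
    public static Object roundTrip(Object arg) {
        try {
            Lib lib = new Lib();
            methodForwarder().invoke(lib, arg);
            return fieldForwarder().invoke(lib);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

In this emulation the adaptation is built eagerly; the document's suggestion of deferring the work to first-selection time would correspond to building the adapted handle lazily.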
Given a forwarder in a class `A` with name `N` and descriptor `D` that forwards to descriptor `E`, suppose a subclass `B` overrides the forwarder with `N(D)`. Let `M` be the method handle that corresponds to the `Code` attribute of `B.N(D)`. We would like it to behave as if `B` had instead specified a method `N(E)`, whose `Code` attribute corresponded to `M.asType(E)`.

#### Additional adaptations

The uses we anticipate for L100 can all be done with `asType()` adaptations (in fact, with a subset of `asType()` adaptations). However, if we wish to support user-provided migrations (such as migrating libraries that use `Date` to `LocalDateTime`) or migrate complex JDK APIs such as `Stream`, we may need to provide additional adaptation logic in the `ForwardingBridge` attribute. Let's extend the `Forwarding` attribute:

```
Forwarding {
    u2 name;
    u4 length;
    u2 forwardeeDescriptor;
    u2 adapter;
}
```

where `adapter` is the constant pool index of a method handle whose type is `(MethodHandle;MethodType;)MethodHandle;` (note that the method handle for `MethodHandle::asType` has this shape). If `adapter` is zero, we use the built-in adaptations; if it is nonzero, we use the referred-to method handle to adapt between the forwarder and forwardee descriptors (in both directions).

#### Adaptation failures and limitations

Whatever adaptations we are prepared to do between forwarder and forwardee, we need to be prepared to do them in both directions; if a method `m(int)` is migrated to `m(long)`, invocation arguments will be adapted from `int` to `long`, but if the method is overridden, we'll do the reverse adaptation on the (out-of-date) overrider `m(int)`. Given that most adaptations are not between isomorphic domains, there will be cases in one direction or the other that cannot be represented (`long` to `int` is lossy; `Integer` to `int` can NPE; `Object` to `String` can CCE).

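The failure modes listed above can be reproduced with `MethodHandle::asType` as it exists today. This is a sketch for illustration only; the class `AdaptationFailureSketch`, the target methods, and the helper `outcome` are hypothetical names. It relies on the documented behavior of `asType`, which supports widening but not narrowing between primitives, so the `long`-to-`int` direction is rejected when the adapter is created rather than at invocation time.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class AdaptationFailureSketch {
    // Hypothetical targets, one per adaptation discussed above.
    public static long takesLong(long x) { return x; }
    public static int takesInt(int x) { return x; }
    public static int takesString(String s) { return s.length(); }

    private static MethodType mt(Class<?> ret, Class<?> param) {
        return MethodType.methodType(ret, param);
    }

    // Adapts the named static method to the requested type, invokes it, and
    // reports either the result or the simple name of the exception thrown.
    public static String outcome(String name, MethodType actual, MethodType requested, Object arg) {
        try {
            MethodHandle target = MethodHandles.lookup()
                    .findStatic(AdaptationFailureSketch.class, name, actual);
            MethodHandle adapted = target.asType(requested);
            return String.valueOf(adapted.invokeWithArguments(arg));
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    // int caller, long target: widening always succeeds.
    public static String intToLong() {
        return outcome("takesLong", mt(long.class, long.class), mt(long.class, int.class), 21);
    }

    // long caller, int target: asType has no narrowing conversion to offer.
    public static String longToInt() {
        return outcome("takesInt", mt(int.class, int.class), mt(int.class, long.class), 21L);
    }

    // Integer caller, int target: unboxing null fails.
    public static String nullIntegerToInt() {
        return outcome("takesInt", mt(int.class, int.class), mt(int.class, Integer.class), null);
    }

    // Object caller, String target: the cast fails for a non-String argument.
    public static String objectToString() {
        return outcome("takesString", mt(int.class, String.class), mt(int.class, Object.class), 42);
    }

    public static void main(String[] args) {
        System.out.println(intToLong());
        System.out.println(longToInt());
        System.out.println(nullIntegerToInt());
        System.out.println(objectToString());
    }
}
```

The `long`-to-`int` case shows why the built-in `asType` adaptations alone would not cover the reverse direction of an `m(int)` to `m(long)` migration without some additional adaptation logic.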
Our guidance is that adaptations should form a projection/embedding pair; this gives us the nice property that we can repeat adaptations with impunity (if the first adaptation doesn't fail, adapting back and back again is guaranteed to be an identity). Even within this, though, there are often multiple ways to implement the adaptation; an embedding can throw on an out-of-range value, or it could pick an in-range target and map to that. So, for example, if we migrated `Collection::size` to return `long`, for `int`-desiring clients, we could clamp values greater than `Integer.MAX_VALUE` to `Integer.MAX_VALUE`, rather than throwing -- and this would likely be a better outcome for most clients. The choice of adaptation should ultimately be left to metadata present at the declaration of the migrated method.

#### Type checking and corner cases

A forwarder should always forward to a non-forwarder method (concrete or abstract) _in the same class_. (Because they are in the same class, there is no chance that separate compilation can cause a forwarder to point to another forwarder.)

In general, we expect that forwarders are only ever overridden by non-forwarder methods (and then, only in out-of-date classfiles). (This means that invocations that resolve to the forwarder will generally select the forwarder.)

 - If a forwarder method is overridden by another forwarder method, this is probably the result of a migration happening in a subclass, with the same migration later happening in a superclass. We can let the override proceed.
 - If a forwarder is overridden by a legacy bridge, we have a few bad choices. We could accept the bridge (which would interfere with forwarding), or discard the bridge (which could cause other anomalies). If we leave existing bridge generation alone, this case is unlikely and accepting the bridge is probably a reasonable answer; if we migrate bridges to use forwarding, we'd probably want to err in the other direction.
 - If a forwarder has a forwardee descriptor that is exactly the same as the forwarder's own descriptor, the forwarder should be discarded. (These can arise from specialization situations.)

[jep181]: https://openjdk.java.net/jeps/181

From forax at univ-mlv.fr Sun Apr 7 21:30:43 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 7 Apr 2019 23:30:43 +0200 (CEST)
Subject: Updated VM-bridges document
In-Reply-To:
References:
Message-ID: <630462416.8392.1554672643213.JavaMail.zimbra@u-pem.fr>

Hi Brian,
with my JVM language implementor hat on, the attribute ForwardingBridge is important because it allows simulating things that are not co/contravariant for the VM but are from the language point of view, like between an int and a Long.

I don't see the point of having two attributes; given that ForwardingBridge supports a superset of what Forwarding supports, it seems to be a premature optimization to me.

The other thing is that the forwarding bridge should not use an adapter but a bootstrap method.
Historically, indy was using something similar to your adapter, but this design doesn't support sharing code because here the adapter code has to be present in the same class as the forwarding method.
If you want to share code, the shared code may want access to the class containing the forwarding method, which is why you need a Lookup object, and the shared code may want to have some specific arguments, aka the bootstrap arguments.

I think John has already proposed that we should support a bootstrap method that returns a MethodHandle instead of a CallSite; this is exactly what we need here.

regards,
Rémi

----- Original Message -----
> From: "Brian Goetz"
> To: "valhalla-spec-experts"
> Sent: Thursday, April 4, 2019 14:33:39
> Subject: Updated VM-bridges document

> At the BUR meeting, we discussed reshuffling the dependency graph to do
> forwarding+reversing bridges earlier, which has the effect of taking some
> pressure off of the descriptor language.
> [...]
>
> Given a forwarder in a class `A` with name `N` and descriptor `D` that
> forwards to descriptor `E`, suppose a subclass `B` overrides the
> forwarder with `N(D)`. Let `M` be the method handle that corresponds
> to the `Code` attribute of `B.N(D)`. We would like it to behave as if
> `B` had instead specified a method `N(E)`, whose `Code` attribute
> corresponded to `M.asType(E)`.
>
> #### Additional adaptations
>
> The uses we anticipate for L100 all can be done with `asType()`
> adaptations (in fact, with a subset of `asType()` adaptations).
> However, if we wish to support user-provided migrations (such as
> migrating libraries that use `Date` to `LocalDateTime`) or migrate
> complex JDK APIs such as `Stream`, we may need to provide additional
> adaptation logic in the `ForwardingBridge` attribute. Let's extend
> the `Forwarding` attribute:
>
> ```
> Forwarding {
>     u2 name;
>     u4 length;
>     u2 forwardeeDescriptor;
>     u2 adapter;
> }
> ```

Is it ForwardingBridge here?

>
> where `adapter` is the constant pool index of a method handle whose
> type is `(MethodHandle;MethodType;)MethodHandle;` (note that the
> method handle for `MethodHandle::asType` has this shape). If
> `adapter` is zero, we use the built-in adaptations; if it is nonzero,
> we use the referred-to method handle to adapt between the forwarder
> and forwardee descriptors (in both directions).
>
> #### Adaptation failures and limitations
>
> Whatever adaptations we are prepared to do between forwarder and
> forwardee, we need to be prepared to do them in both directions; if a
> method `m(int)` is migrated to `m(long)`, invocation arguments will be
> adapted `int` to `long`, but if overridden, we'll do the reverse
> adaptation on the (out of date) overrider `m(int)`. Given that most
> adaptations are not between isomorphic domains, there will be cases in
> one direction or the other that cannot be represented (`long` to
> `int` is lossy; `Integer` to `int` can NPE; `Object` to `String` can
> CCE.)
> [...]
If we leave existing bridge generation alone, this > case is unlikely and accepting the bridge is probably a reasonable > answer; if we migrate bridges to use forwarding, we'd probably > want to err in the other direction. > - If a forwarder has a forwardee descriptor that is exactly the > same as the forwarder, the forwarder should be discarded. (These > can arise from specialization situations.) > > > > > [jep181]: https://openjdk.java.net/jeps/181 From brian.goetz at oracle.com Mon Apr 8 01:18:43 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 7 Apr 2019 21:18:43 -0400 Subject: Updated VM-bridges document In-Reply-To: <630462416.8392.1554672643213.JavaMail.zimbra@u-pem.fr> References: <630462416.8392.1554672643213.JavaMail.zimbra@u-pem.fr> Message-ID: That?s just a typo. There?s just one attribute. > On Apr 7, 2019, at 5:30 PM, Remi Forax wrote: > > Hi Brian, > with an hat of JVM language implementor, > the attribute ForwardingBridge is important because it allows to simulate things that are not co/contravariant for the VM but are from the language point of view, like between an int and a Long. > > I don't see the point of having two attributes, given that ForwardingBridge supports a superset of what Forwarding support, it seems to be a premature optimization to me. > > The other thing is that Forwarding bridge should not use an adapter but a bootstrap method. > Historically, indy was using something similar to your adapter but this design doesn't support sharing code because here the adapter code has to be present in the same class as the forwarding method. > If you want to share code, the shared code may want to access to the class containing the forwarding method, that's why you need a Lookup object and the share code may want to have some specific arguments, aka the bootstrap arguments. 
>
> I think John has already proposed that we should support a bootstrap method that returns a MethodHandle instead of a CallSite, this is exactly what we need here.
>
> regards,
> Rémi

From brian.goetz at oracle.com  Mon Apr  8 14:24:21 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 8 Apr 2019 10:24:21 -0400
Subject: Updated VM-bridges document
In-Reply-To: <630462416.8392.1554672643213.JavaMail.zimbra@u-pem.fr>
References: <630462416.8392.1554672643213.JavaMail.zimbra@u-pem.fr>
Message-ID:

> The other thing is that Forwarding bridge should not use an adapter but a bootstrap method.
Can you explain exactly what you mean here?  Because in my mind, the adapter _is_ a bootstrap method -- it is code to which the VM upcalls at preparation / link time to help establish linkage.

Since you obviously have a more specific notion of "bootstrap method", can you explain exactly what you mean?

From brian.goetz at oracle.com  Mon Apr  8 16:50:28 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 8 Apr 2019 12:50:28 -0400
Subject: generic specialization design discussion
In-Reply-To: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com>
Message-ID: <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com>

The slide deck contains a list of terminology.  I'd like to posit that the most confusion-reducing thing we could do is come up with another word for value types/classes/instances, since the word "value" is already used to describe primitives and references themselves.  This is a good time to see if there are better names available.

So for this thread only, we're turning on the syntax light to discuss what might be a better name for the abstraction currently known as "value classes".

> On Mar 29, 2019, at 12:08 PM, John Rose wrote:
>
> This week I gave some presentations of my current thinking
> about specializations to people (from Oracle and IBM) gathered
> in Burlington.  Here it is FTR.  If you read it you will find lots
> of questions, as well as requirements and tentative answers.
>
> http://cr.openjdk.java.net/~jrose/pres/201903-TemplateDesign.pdf
>
> This is a checkpoint.  I have more tentative answers on the
> drawing board that didn't fit into the slide deck.  Stay tuned.
>
> -- John

From forax at univ-mlv.fr  Mon Apr  8 17:18:13 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 8 Apr 2019 19:18:13 +0200 (CEST)
Subject: Updated VM-bridges document
In-Reply-To:
References: <630462416.8392.1554672643213.JavaMail.zimbra@u-pem.fr>
Message-ID: <691369778.302607.1554743893793.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "valhalla-spec-experts"
> Sent: Monday, April 8, 2019 16:24:21
> Subject: Re: Updated VM-bridges document

>> The other thing is that Forwarding bridge should not use an adapter but a
>> bootstrap method.

> Can you explain exactly what you mean here? Because in my mind, the adapter _is_
> a bootstrap method -- it is code to which the VM upcalls at preparation / link
> time to help establish linkage.

> Since you obviously have a more specific notion of "bootstrap method", can you
> explain exactly what you mean?

It's a kind of bootstrap method, but without the same ceremony as the one used by indy or condy, i.e. no lookup and no constant bootstrap arguments (and no name and no descriptor). I propose to use the same protocol as with indy or condy.

Rémi

From kevinb at google.com  Mon Apr  8 18:25:24 2019
From: kevinb at google.com (Kevin Bourrillion)
Date: Mon, 8 Apr 2019 11:25:24 -0700
Subject: generic specialization design discussion
In-Reply-To: <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com>
Message-ID:

I'd suggest the name should in some way allude to the inline/compact/flat memory layout, because that is the distinguishing feature of *these new things* compared to anything else you can do in Java.  And it is what people should be thinking about as they decide whether a new class should use this.

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com  Mon Apr  8 18:39:03 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 8 Apr 2019 14:39:03 -0400
Subject: Updated VM-bridges document
In-Reply-To: <691369778.302607.1554743893793.JavaMail.zimbra@u-pem.fr>
References: <630462416.8392.1554672643213.JavaMail.zimbra@u-pem.fr> <691369778.302607.1554743893793.JavaMail.zimbra@u-pem.fr>
Message-ID: <4BAD335E-F891-49E4-9992-79C9F5003C94@oracle.com>

OK, I see what you're getting at now.  Yes, this is one of the implementation possibilities.  I was mostly looking to validate the concepts before diving into the representational details.  One key point is that the default case should be able to proceed with no bootstrap; a small set of adaptations handles the most important cases, and avoiding an upcall is probably pretty desirable if we can get away with it.
But, given that, an index to a BSM entry is probably fine; it moves the representation of "which arguments are adapted how" into a static argument list, which is probably the best place for it.

One non-obvious point here is that the adaptation must work _in both directions_.  If we are migrating Collection::size from returning int to returning long, not only do we want to widen the result implicitly when invoked by legacy callers, but we have to _narrow_ the result when overridden by legacy subclasses.

From brian.goetz at oracle.com  Mon Apr  8 18:58:29 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 8 Apr 2019 14:58:29 -0400
Subject: generic specialization design discussion
In-Reply-To:
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com>
Message-ID: <0468778F-F576-4DA3-A54B-C3DB89F2F6DB@oracle.com>

Yes, that's a promising direction.  And this is surely the motivation why the C# folks picked "struct"; they wanted to carry the connotation that this is a structure that is inlined.  Problem is, the word "struct" is already so heavily polluted by what it means in C.  So perhaps something like:

    inline class V { ... }

This says that a V can be inlined into things that contain a V -- other classes and arrays.  It also kind of suggests that this thing has no intrinsic identity.

A possible downside of this choice is that one might mistake it for meaning "its methods are inlined".  Which is actually a little true, in that the methods are implicitly static and therefore more amenable to dynamic inlining.  So that might actually be OK.

Others?

From brian.goetz at oracle.com  Mon Apr  8 19:06:17 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 8 Apr 2019 15:06:17 -0400
Subject: generic specialization design discussion
In-Reply-To: <0468778F-F576-4DA3-A54B-C3DB89F2F6DB@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <0468778F-F576-4DA3-A54B-C3DB89F2F6DB@oracle.com>
Message-ID: <803A16CB-66AA-4C48-AEAB-82A527468DD8@oracle.com>

A related issue is that we want a word for describing which types are routinely flattened in layouts, and specialized in generics (the criteria are the same).  Currently:

 - References and nullable projections of values (V?) are erased and not flattened
 - Values (zero-default and null-default) are specializable and flattenable

(This thread is for terminology; if you have questions about the above claims, make a new thread, or better, wait for the more detailed writeup explaining why this is.)

We need words for these two things too.

From brian.goetz at oracle.com  Mon Apr  8 19:58:38 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 8 Apr 2019 15:58:38 -0400
Subject: RefObject and ValObject
Message-ID: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com>

We never reached consensus on how to surface Ref/ValObject.
Here are some places we might want to use these type names:

 - Parameter types / variables: we might want to restrict the domain of a parameter or variable to only hold a reference, or a value:

     void m(RefObject ro) { ... }

 - Type bounds: we might want to restrict the instantiation of a generic class to only hold a reference (say, because we're going to lock on it):

     class Foo<T extends RefObject> { ... }

 - Dynamic tests: if locking on a value is to throw, there must be a reasonable idiom that users can use to detect lockability without just trying to lock:

     if (x instanceof RefObject) { synchronized(x) { ... } }

 - Ref- or Val-specific methods. This one is more vague, but it's conceivable we may want methods on ValObject that are members of all values.

Three ways have been proposed (so far) for reflecting these as top types:

 - RefObject and ValObject are (somewhat special) classes. We spell (at least in the class file) "value class" as "class X extends ValObject". We implicitly rewrite reference classes at runtime that extend Object to extend RefObject instead. This has obvious pedagogical value, but there are some (small) risks of anomalies.

 - RefObject and ValObject are interfaces. We ensure that no class can implement both. (Open question whether an interface could extend one or the other, acting as an implicit constraint that it only be implemented by value classes or reference classes.) Harder to do things like put final implementations of wait/notify in ValObject, though maybe this isn't of as much value as it would have been if we'd done this 25 years ago.

 - Split the difference; ValObject is a class, RefObject is an interface. Sounds weird at first, but acknowledges that we're grafting this on to refs after the fact, and eliminates most of the obvious anomalies.

No matter which way we go, we end up with an odd anomaly: "new Object()" should yield an instance of RefObject, but we don't want Object <: RefObject for obvious reasons. It's possible that "new Object()"
could result in an instance of a _species_ of Object that implements RefObject, but our theory of species doesn't quite go there, and it seems a little silly to add new requirements just for this.

From brian.goetz at oracle.com Mon Apr 8 20:20:40 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 8 Apr 2019 16:20:40 -0400
Subject: generic specialization design discussion
In-Reply-To: <803A16CB-66AA-4C48-AEAB-82A527468DD8@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <0468778F-F576-4DA3-A54B-C3DB89F2F6DB@oracle.com> <803A16CB-66AA-4C48-AEAB-82A527468DD8@oracle.com>
Message-ID: <4F57837B-5072-4345-BED1-9791AEF65F3E@oracle.com>

Refining this:

 - inline classes are inlinable (duh)
 - reference classes, interfaces, and nullable projections of zero-default inline classes are not inlinable

(We later use these to say: Instantiations of generics with non-inlinable types are erased.)

That seems not terrible, except for the "nullable projection of zero-default inline classes" part, which is both a mouthful and has "inline" in it. Perhaps calling these something like "null-adjoined types" (to reflect the fact that we're cramming a null into a type whose value set doesn't naturally contain null) makes that slightly better. So:

 - inline classes are inlinable
 - reference classes, interfaces, and null-adjoined types are not inlinable

> On Apr 8, 2019, at 3:06 PM, Brian Goetz wrote:
>
> A related issue is that we want a word for describing which types are routinely flattened in layouts, and specialized in generics (the criteria are the same). Currently:
>
> - References and nullable projections of values (V?) are erased and not flattened
> - Values (zero-default and null-default) are specializable and flattenable
>
> (This thread is for terminology; if you have questions about the above claims, make a new thread, or better, wait for the more detailed writeup explaining why this is.)
>>>
>>> > On Mar 29, 2019, at 12:08 PM, John Rose wrote:
>>> >
>>> > This week I gave some presentations of my current thinking
>>> > about specializations to people (from Oracle and IBM) gathered
>>> > in Burlington. Here it is FTR. If you read it you will find lots
>>> > of questions, as well as requirements and tentative answers.
>>> >
>>> > http://cr.openjdk.java.net/~jrose/pres/201903-TemplateDesign.pdf
>>> >
>>> > This is a checkpoint. I have more tentative answers on the
>>> > drawing board that didn't fit into the slide deck. Stay tuned.
>>> >
>>> > -- John
>>>
>>> --
>>> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com Mon Apr 8 20:24:38 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 8 Apr 2019 16:24:38 -0400
Subject: Fwd: RefObject and ValObject
References: <96540543-41e3-e13a-8b83-58e019ea95bc@oracle.com>
Message-ID: <06BEB9B3-42A9-4323-99C5-EFEDDDD549DF@oracle.com>

Sergey pointed out this additional benefit of RefObject: it helps code that really doesn't want to deal with values, or pay any of the taxes that arise from the fact that values are Object, such as acmp overhead.

> On 4/8/19 12:58 PM, Brian Goetz wrote:
>> We never reached consensus on how to surface Ref/ValObject.
>>
>> Here are some places we might want to use these type names:
>>
>> - Parameter types / variables: we might want to restrict the domain of a parameter or variable to only hold a reference, or a value:
>>
>>     void m(RefObject ro) { ... }
>>
>> - Type bounds: we might want to restrict the instantiation of a generic class to only hold a reference (say, because we're going to lock on it):
>>
>>     class Foo<T extends RefObject> { ... }
>>
>> - Dynamic tests: if locking on a value is to throw, there must be a reasonable idiom that users can use to detect lockability without just trying to lock:
>>
>>     if (x instanceof RefObject) {
>>         synchronized(x) { ... }
>>     }
>>
>> - Ref- or Val-specific methods.
>>   This one is more vague, but it's conceivable we may want methods on ValObject that are members of all values.
>>
>> Three ways have been proposed (so far) for reflecting these as top types:
>>
>> - RefObject and ValObject are (somewhat special) classes. We spell (at least in the class file) "value class" as "class X extends ValObject". We implicitly rewrite reference classes at runtime that extend Object to extend RefObject instead. This has obvious pedagogical value, but there are some (small) risks of anomalies.
>>
>> - RefObject and ValObject are interfaces. We ensure that no class can implement both. (Open question whether an interface could extend one or the other, acting as an implicit constraint that it only be implemented by value classes or reference classes.) Harder to do things like put final implementations of wait/notify in ValObject, though maybe this isn't of as much value as it would have been if we'd done this 25 years ago.
>>
>> - Split the difference; ValObject is a class, RefObject is an interface. Sounds weird at first, but acknowledges that we're grafting this on to refs after the fact, and eliminates most of the obvious anomalies.
>>
>> No matter which way we go, we end up with an odd anomaly: "new Object()" should yield an instance of RefObject, but we don't want Object <: RefObject for obvious reasons. It's possible that "new Object()" could result in an instance of a _species_ of Object that implements RefObject, but our theory of species doesn't quite go there, and it seems a little silly to add new requirements just for this.
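
The uses of RefObject listed in this message (parameter types, type bounds, dynamic lockability tests) can be sketched in today's Java. This is only an illustration under loud assumptions: `RefObject` and `ValObject` are not real JDK types, so the sketch declares them as plain marker interfaces (roughly the second of the three proposed options), and `Account`, `Point`, `Guarded`, and `RefValSketch` are hypothetical names invented for the example.

```java
// Hypothetical stand-ins: RefObject and ValObject are NOT real JDK types.
interface RefObject {}
interface ValObject {}

// An ordinary identity class, tagged as a reference class for this sketch.
final class Account implements RefObject {
    int balance;
}

// A value-like class, tagged as a value class for this sketch.
final class Point implements ValObject {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

// Type bound: only reference classes may instantiate Guarded,
// because it is going to lock on its target.
final class Guarded<T extends RefObject> {
    private final T target;
    Guarded(T target) { this.target = target; }
    void withLock(Runnable action) {
        synchronized (target) { action.run(); }
    }
}

final class RefValSketch {
    // Parameter type: the domain of the parameter is restricted to references.
    static String describe(RefObject ro) {
        return "ref:" + ro.getClass().getSimpleName();
    }

    // Dynamic test: detect lockability without just trying to lock.
    static boolean lockIfPossible(Object x, Runnable action) {
        if (x instanceof RefObject) {
            synchronized (x) { action.run(); }
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Runnable noop = () -> {};
        System.out.println(describe(new Account()));
        System.out.println(lockIfPossible(new Account(), noop));
        System.out.println(lockIfPossible(new Point(1, 2), noop));
        System.out.println(lockIfPossible(new Object(), noop));
        new Guarded<Account>(new Account()).withLock(noop);
    }
}
```

In this sketch a plain `new Object()` fails the `instanceof RefObject` test, which is exactly the anomaly the message describes: nothing here makes `Object` itself a RefObject.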
>>

From forax at univ-mlv.fr Mon Apr 8 20:39:45 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 8 Apr 2019 22:39:45 +0200 (CEST)
Subject: generic specialization design discussion
In-Reply-To:
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com>
Message-ID: <1278111608.332301.1554755985109.JavaMail.zimbra@u-pem.fr>

> From: "Kevin Bourrillion"
> To: "Brian Goetz"
> Cc: "valhalla-spec-experts"
> Sent: Monday, April 8, 2019 20:25:24
> Subject: Re: generic specialization design discussion

> I'd suggest the name should in some way allude to the inline/compact/flat memory
> layout, because that is the distinguishing feature of these new things compared
> to anything else you can do in Java. And it is what people should be thinking
> about as they decide whether a new class should use this.

immediate class?

Rémi

> On Mon, Apr 8, 2019 at 10:02 AM Brian Goetz <brian.goetz at oracle.com> wrote:
>> The slide deck contains a list of terminology. I'd like to posit that the most
>> confusion-reducing thing we could do is come up with another word for value
>> types/classes/instances, since the word "value" is already used to describe
>> primitives and references themselves. This is a good time to see if there are
>> better names available.
>> So for this thread only, we're turning on the syntax light to discuss what might
>> be a better name for the abstraction currently known as "value classes".
>>> On Mar 29, 2019, at 12:08 PM, John Rose <john.r.rose at oracle.com> wrote:
>> > This week I gave some presentations of my current thinking
>> > about specializations to people (from Oracle and IBM) gathered
>> > in Burlington. Here it is FTR. If you read it you will find lots
>> > of questions, as well as requirements and tentative answers.
>> > http://cr.openjdk.java.net/~jrose/pres/201903-TemplateDesign.pdf
>> > This is a checkpoint. I have more tentative answers on the
>> > drawing board that didn't fit into the slide deck. Stay tuned.
>> > -- John
> --
> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com Mon Apr 8 20:45:46 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 8 Apr 2019 16:45:46 -0400
Subject: generic specialization design discussion
In-Reply-To: <4F57837B-5072-4345-BED1-9791AEF65F3E@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <0468778F-F576-4DA3-A54B-C3DB89F2F6DB@oracle.com> <803A16CB-66AA-4C48-AEAB-82A527468DD8@oracle.com> <4F57837B-5072-4345-BED1-9791AEF65F3E@oracle.com>
Message-ID: <93B614A9-0075-490D-86D5-3CABED9BCE86@oracle.com>

And the opposite of "inline" is "indirect". Ref, interface, and null-adjoined types are _indirect_. Indirect classes are passed by pointer, do not get flattened, get erased, and are nullable.

> On Apr 8, 2019, at 4:20 PM, Brian Goetz wrote:
>
> Refining this:
>
> - inline classes are inlinable (duh)
> - reference classes, interfaces, and nullable-projections of zero-default inline classes are not inlinable
>
> (We later use these to say: Instantiation of generics with non-inlinable types are erased.)
>
> That seems not terrible, except for the "nullable projection of zero-default inline classes" part, which is both a mouthful and has "inline" in it. Perhaps calling these something like "null-adjoined types" (to reflect the fact that we're cramming a null into a type whose value set doesn't naturally contain null) makes that slightly better.
>>>> >>>> >>>> On Mon, Apr 8, 2019 at 10:02 AM Brian Goetz > wrote: >>>> The slide deck contains a list of terminology. I?d like to posit that the most confusion-reducing thing we could do is come up with another word for value types/classes/instances, since the word ?value? is already used to describe primitives and references themselves. This is a good time to see if there are better names available. >>>> >>>> So for this thread only, we?re turning on the syntax light to discuss what might be a better name for the abstraction currently known as ?value classes?. >>>> >>>> >>>> >>>> > On Mar 29, 2019, at 12:08 PM, John Rose > wrote: >>>> > >>>> > This week I gave some presentations of my current thinking >>>> > about specializations to people (from Oracle and IBM) gathered >>>> > in Burlington. Here it is FTR. If you read it you will find lots >>>> > of questions, as well as requirements and tentative answers. >>>> > >>>> > http://cr.openjdk.java.net/~jrose/pres/201903-TemplateDesign.pdf >>>> > >>>> > This is a checkpoint. I have more tentative answers on the >>>> > drawing board that didn't fit into the slide deck. Stay tuned. >>>> > >>>> > ? John >>>> >>>> >>>> >>>> -- >>>> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com >> > From john.r.rose at oracle.com Mon Apr 8 21:37:37 2019 From: john.r.rose at oracle.com (John Rose) Date: Mon, 8 Apr 2019 14:37:37 -0700 Subject: RefObject and ValObject In-Reply-To: <06BEB9B3-42A9-4323-99C5-EFEDDDD549DF@oracle.com> References: <96540543-41e3-e13a-8b83-58e019ea95bc@oracle.com> <06BEB9B3-42A9-4323-99C5-EFEDDDD549DF@oracle.com> Message-ID: <5DCDCD35-7192-43C7-A273-B8556246D602@oracle.com> Yes, that's a nice one. On Apr 8, 2019, at 1:24 PM, Brian Goetz wrote: > > Sergey pointed out this additional benefit of RefObject: code that really doesn?t want to deal with values, or pay any of the taxes that arise from the fact that values are Object, such as acmp overhead. 
From john.r.rose at oracle.com Mon Apr 8 21:55:54 2019
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 8 Apr 2019 14:55:54 -0700
Subject: generic specialization design discussion
In-Reply-To: <93B614A9-0075-490D-86D5-3CABED9BCE86@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <0468778F-F576-4DA3-A54B-C3DB89F2F6DB@oracle.com> <803A16CB-66AA-4C48-AEAB-82A527468DD8@oracle.com> <4F57837B-5072-4345-BED1-9791AEF65F3E@oracle.com> <93B614A9-0075-490D-86D5-3CABED9BCE86@oracle.com>
Message-ID: <58D4293D-FE3B-4A42-8100-675212706E0B@oracle.com>

I like this, FTR.

Note also that inline classes *may* be passed by indirection, while indirect classes *must* be. The inline guys have new capabilities, plus the old ones.

Why would you *want* to represent an inline-capable class with an indirection? Good question. Any of the following can be a reason to do so:

 - use via the Object type or an interface (they are indirect)
 - signature compatibility with existing APIs (Optional, LDT)
 - atomic updates (volatile variables)
 - forced nullability of null-hostile classes (ZDV?)
 - permission to load the class later (breaking bootstrap cycles)
 - option to translate recursive value types (requires an indirection)
 - fine control over object layout (when you prefer sharing to flattening)
 - fine control over object API (when you prefer heap buffering to scalarization)

These are all factors we take for granted with Java references today. They become optional with inline classes.

An "inline class" would more precisely be an "inlinable class", but doesn't it feel overly pedantic to use the precise term? So I like "inline class", with the caveat that the first five minutes of training requires somebody to say something like, "An inline class can be flattened in memory or scalarized across APIs. It can also be manipulated indirectly like any class."
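
A few of the reasons listed above can be seen, by loose analogy, in today's Java. This is an assumption made purely for illustration, not part of the proposal: `int` stands in for an inline (flat, zero-default) representation and `Integer` for an indirect one, and `IndirectionAnalogy` is a hypothetical class name. The thread itself later remarks that primitives could be thought of as inline types, which is the only basis for the comparison.

```java
final class IndirectionAnalogy {
    // "Inline" analog: ints are stored flat in the array and default to zero.
    static int[] flatDefaults(int n) {
        return new int[n];
    }

    // "Indirect" analog: each element is a pointer, so it can be (and defaults to) null.
    static Integer[] indirectDefaults(int n) {
        return new Integer[n];
    }

    // Use via the Object type forces an indirection (here, boxing).
    static Object viaObject(int v) {
        return v;
    }

    // Sharing instead of flattening: two slots can point at one and the same box.
    static boolean sharesOneBox(int v) {
        Integer box = Integer.valueOf(v);
        Integer[] slots = { box, box };
        return slots[0] == slots[1]; // reference comparison on purpose
    }

    public static void main(String[] args) {
        System.out.println(flatDefaults(2)[0]);
        System.out.println(indirectDefaults(2)[0]);
        System.out.println(viaObject(7).getClass().getName());
        System.out.println(sharesOneBox(1000));
    }
}
```

The analogy is imperfect: with `int` and `Integer` the programmer picks between two distinct types, whereas the message describes a single inline class that can be used either way.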
(Note that C++ does something vaguely similar: A function marked "inline" can be used indirectly, by taking its address. The "inline" modifier does not *forbid* indirect use.)

-- John

On Apr 8, 2019, at 1:45 PM, Brian Goetz wrote:
>
> And the opposite of "inline" is "indirect". ref, interface, and null-adjoined types are _indirect_. Indirect classes are passed by pointer, do not get flattened, get erased, are nullable.
>

From forax at univ-mlv.fr Mon Apr 8 21:55:59 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 8 Apr 2019 23:55:59 +0200 (CEST)
Subject: generic specialization design discussion
In-Reply-To: <93B614A9-0075-490D-86D5-3CABED9BCE86@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <0468778F-F576-4DA3-A54B-C3DB89F2F6DB@oracle.com> <803A16CB-66AA-4C48-AEAB-82A527468DD8@oracle.com> <4F57837B-5072-4345-BED1-9791AEF65F3E@oracle.com> <93B614A9-0075-490D-86D5-3CABED9BCE86@oracle.com>
Message-ID: <1655087098.346648.1554760559187.JavaMail.zimbra@u-pem.fr>

In that case, direct class?

Rémi

> From: "Brian Goetz"
> To: "Kevin Bourrillion"
> Cc: "valhalla-spec-experts"
> Sent: Monday, April 8, 2019 22:45:46
> Subject: Re: generic specialization design discussion

> And the opposite of "inline" is "indirect". ref, interface, and null-adjoined
> types are _indirect_. Indirect classes are passed by pointer, do not get
> flattened, get erased, are nullable.
>> On Apr 8, 2019, at 4:20 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>> Refining this:
>> - inline classes are inlinable (duh)
>> - reference classes, interfaces, and nullable-projections of zero-default inline
>> classes are not inlinable
>> (We later use these to say: Instantiation of generics with non-inlinable types
>> are erased.)
>> That seems not terrible, except for the "nullable projection of zero-default
>> inline classes" part, which is both a mouthful and has "inline" in it.

From john.r.rose at oracle.com Mon Apr 8 22:18:27 2019
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 8 Apr 2019 15:18:27 -0700
Subject: generic specialization design discussion
In-Reply-To: <58D4293D-FE3B-4A42-8100-675212706E0B@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <0468778F-F576-4DA3-A54B-C3DB89F2F6DB@oracle.com> <803A16CB-66AA-4C48-AEAB-82A527468DD8@oracle.com> <4F57837B-5072-4345-BED1-9791AEF65F3E@oracle.com> <93B614A9-0075-490D-86D5-3CABED9BCE86@oracle.com> <58D4293D-FE3B-4A42-8100-675212706E0B@oracle.com>
Message-ID:

> ...
> - use via the Object type or an interface (they are indirect)
> - signature compatibility with existing APIs (Optional, LDT)
> - atomic updates (volatile variables)
> - forced nullability of null-hostile classes (ZDV?)
> - permission to load the class later (breaking bootstrap cycles)
> - option to translate recursive value types (requires an indirection)
> - fine control over object layout (when you prefer sharing to flattening)
> - fine control over object API (when you prefer heap buffering to scalarization)

BTW, one reason I like "indirect" is that it is a relatively unused term in the JVMS, and so can be used to indicate the new distinction between new flat/scalar and old "just a pointer", a distinction which gives rise to so many subtle effects, such as those I just listed.

Using the old words "reference" and "value" for this runs up against existing uses of those words in the JVMS. Brian and I toyed briefly with the idea of promoting the word "pointer" (from its mysterious source in NPE) to document the new distinction, and also the term "address" or "machine address". "Indirection" won out, for me at least, because it is less concrete.

I also prefer the term "indirection" because of its fame as a problem solver.
It is proverbial that you can solve any problem by adding another indirection. (What kind of problem? See list above!)

https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering

So, in our world, the technical term "indirection" can mean "What we've always done in the JVM to solve various problems by adding abstraction to variables." In contrast to inlining, which is "What we now do in the JVM to solve other various problems by breaking abstraction in other variables."

For the JVM an "indirect" variable will be a pointer, a machine address. (But perhaps a compressed one!) The JVM will hide the details, but the abstraction will be available to solve those problems which require "another indirection".

From john.r.rose at oracle.com Mon Apr 8 22:44:21 2019
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 8 Apr 2019 15:44:21 -0700
Subject: generic specialization design discussion
In-Reply-To: <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com>
Message-ID: <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com>

On Apr 8, 2019, at 9:50 AM, Brian Goetz wrote:
>
> The slide deck contains a list of terminology.

FTR here's the relevant slide:

> "value object", "reference object" (also value or reference instance)
> "value class", "reference class" (also value or reference type)
> "interface class" (Object = honorary interface)
> object, class, reference: non-specific (non-primitives; ref. can be null)
> value: non-specific (includes all object references, null, all primitives)
> name (class, member), descriptor (field or method)
> resolution: a stable mapping from name to metadata (or error)
> "class template", "method template" (even "field template")
> "specialized class" (= "species" for short), "specialized method"
> generic parameter, hole (in template), variance (depends on hole)

If we move terms around so "value" gets replaced by "inline", and "reference" by "indirect", only the first two lines are affected:

"inline object", "indirect object" (also inline or indirect instance)
"inline class", "indirect class" (also inline or indirect type)

An inline object is inherently inline, even if it is (at the moment) being referenced by a physical indirection. An indirect object is *always* referenced by a physical indirection.

I think we also want to point out (at least in the JVMS) that the inline/indirect distinction applies to variables as well as to objects and classes.

"inline variable", "indirect variable" (parameter, return value, element, field, local)

There is a key bit of slippage here, where the terms don't always line up exactly, nor do they vary freely. You can have any of these three combinations, but not the fourth (an inline variable of an indirect class):

  an indirect variable of an indirect class
  an inline variable of an inline class
  an indirect variable of an inline class

At this point, we can also point out that interface classes always have indirect variables. Also, V? can be an indirect type where V is an inline type.

This slippage suggests that we might try for an extra term (or a distinct pair of terms parallel to indirect/inline) which can apply solely to variables, or to objects, or to classes. For example, using "heap" instead of "indirect" for classes:

"inline object", "heap object" (also inline or heap instance)
"inline class", "heap class"
"inline type", "indirect type" (heap classes always indirect types)
"inline variable", "indirect variable" (parameter, return value, element, field, local)

Choices for variables:

  an indirect variable of a heap class
  an indirect variable of an inline class
  an inline variable of an inline class

Heap classes are always indirect types. The natural type of an inline class is an inline type. But an inline class has (may have?) an associated indirect type.
An interface class is always an indirect type.

This exercise appears to show that "indirect" and "inline" are principally distinctions between variables and their types, and only secondarily distinctions between classes and their objects/instances. (Variables have more degrees of freedom than objects, because variables *view* objects, and a single object can have several views. We already saw this with volatile/final/regular and now also with inline/indirect.)

It also shows that, if we pick a third term like "heap", it only applies to regular classes, as an antonym for "inline", and to regular objects in the same way.

BTW, I think primitives could be thought of as inline types. But not inline classes, until we invent such things in LW200.

From john.r.rose at oracle.com Mon Apr 8 22:57:24 2019
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 8 Apr 2019 15:57:24 -0700
Subject: generic specialization design discussion
In-Reply-To: <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com>
Message-ID: <95FD7046-E818-468D-A947-983C75020553@oracle.com>

On Apr 8, 2019, at 3:44 PM, John Rose wrote:
>
> It also shows that, if we pick a third term like
> "heap", it only applies to regular classes, as an
> antonym for "inline", and to regular objects in
> the same way.

P.S. In this example, a heap object is not one which is stored in the heap, but rather one which is embodied in its own heap-allocated block, with identity. Inline objects can be inlined into the heap, but they are still inline, no matter where they end up. An inline object buffered in its own heap block is not a heap object, because its value is independent of that particular block; it can be moved anywhere without losing any part of its value.

There's nothing too special about the word "heap".
It's just doing the job of marking a class or object for which the placement in the heap, with its own identity, is a key part of the definition of the object's value. So, "identity object", "identity class" would be just as correct, and maybe less confusing. Or "top-level object"? Or "always-indirect object"? "Faraway object"?

What we had before was "reference object", where "reference" as a noun means one thing, and as an adjective means something related but subtly different.

From dl at cs.oswego.edu Mon Apr 8 23:50:24 2019
From: dl at cs.oswego.edu (Doug Lea)
Date: Mon, 8 Apr 2019 19:50:24 -0400
Subject: generic specialization design discussion
In-Reply-To: <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com>
Message-ID: <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu>

On 4/8/19 6:44 PM, John Rose wrote:

> If we move terms around so "value" gets replaced by "inline",
> and "reference" by "indirect", only the first two lines are
> affected:
>
> "inline object", "indirect object" (also inline or indirect instance)
> "inline class", "indirect class" (also inline or indirect type)
>

It would be nice not to use new terms for old concepts (even if new to Java). For alternative terminology originally stemming from similar struggles to make such distinctions, see UML "Composition" vs "aggregation" (also vs "association").
Wikipedia has some definitions:
https://en.wikipedia.org/wiki/Class_diagram#Aggregation

The UML specs say more but behind wall at https://www.omg.org/spec/UML

-Doug

From forax at univ-mlv.fr Tue Apr 9 06:38:48 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 9 Apr 2019 08:38:48 +0200 (CEST)
Subject: Updated VM-bridges document
In-Reply-To: <4BAD335E-F891-49E4-9992-79C9F5003C94@oracle.com>
References: <630462416.8392.1554672643213.JavaMail.zimbra@u-pem.fr> <691369778.302607.1554743893793.JavaMail.zimbra@u-pem.fr> <4BAD335E-F891-49E4-9992-79C9F5003C94@oracle.com>
Message-ID: <1154574832.378215.1554791928007.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "valhalla-spec-experts"
> Sent: Monday, April 8, 2019 20:39:03
> Subject: Re: Updated VM-bridges document

> OK, I see what you're getting at now. Yes, this is one of the implementation
> possibilities. I was mostly looking to validate the concepts before diving into
> the representational details. One key point is that the default case should be
> able to proceed with no bootstrap; a small set of adaptations handles the most
> important cases, and avoiding an upcall is probably pretty desirable if we can
> get away with it.

It's an optimization; I prefer that the VM recognize a specific BSM and not upcall it because its semantics are well known. It has the same effect, but it's an implementation detail and not something that needs to figure in the VM spec.

> But, given that, an index to a BSM entry is probably fine; it moves the
> representation of "which arguments are adapted how" into a static argument
> list, which is probably the best place for it.

> One non-obvious point here is that the adaptation must work _in both
> directions_. If we are migrating Collection::size from returning int to
> returning long, not only do we want to widen the result implicitly when invoked
> by legacy callers, but we have to _narrow_ the result when overridden by legacy
> subclasses.
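As a rough illustration of why the adaptation has to run in both directions, here is a minimal plain-Java sketch in which hand-written methods stand in for the bridges the VM would supply "as if" by linkage. Everything in it is an assumption made for illustration: the names `Coll`, `LegacyList`, `NewList` and `sizeLegacy` are hypothetical (Java source cannot declare two methods that differ only in return type, so the old `int` entry point gets a different name), and throwing on overflow via `Math.toIntExact` is just one possible narrowing policy, not something the thread specifies. The sketch only shows that one call path needs an int-to-long conversion and the other a long-to-int conversion.

```java
public class SizeBridgeSketch {

    /** The migrated API: the primary size() now returns long. */
    static abstract class Coll {
        /** New-style entry point, used by newly compiled code. */
        long size() {
            // Reaches an old int-returning override and widens its result;
            // int -> long never loses information.
            return sizeLegacy();
        }

        /** Old-style entry point (hypothetical name), kept for legacy code. */
        int sizeLegacy() {
            // Reaches a new long-returning override and narrows its result.
            // Throwing on overflow is an assumed policy for this sketch.
            return Math.toIntExact(size());
        }
        // A concrete subclass is expected to override exactly one of the
        // two methods; overriding neither would recurse forever.
    }

    /** A legacy subclass: it only knows the int-returning method. */
    static final class LegacyList extends Coll {
        @Override int sizeLegacy() { return 3; }
    }

    /** A migrated subclass: it only knows the long-returning method. */
    static final class NewList extends Coll {
        @Override long size() { return 5_000_000_000L; }
    }

    public static void main(String[] args) {
        // New-style call on a legacy override: result is widened.
        System.out.println(new LegacyList().size());
        // New-style call on a new override: no conversion needed.
        System.out.println(new NewList().size());
        // Old-style call on a new override: narrowing fails for this value.
        try {
            new NewList().sizeLegacy();
        } catch (ArithmeticException e) {
            System.out.println("narrowing failed: " + e.getMessage());
        }
    }
}
```

The design point the sketch makes is that neither conversion alone is enough: old and new code can sit on either side of the call, so each descriptor has to be reachable from the other.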
I agree, a particular BSM has to be able to do the migration both ways, from old to new and from new to old.

Rémi

>> On Apr 8, 2019, at 1:18 PM, forax at univ-mlv.fr wrote:

>>> From: "Brian Goetz" <brian.goetz at oracle.com>
>>> To: "Remi Forax" <forax at univ-mlv.fr>
>>> Cc: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
>>> Sent: Monday, April 8, 2019 16:24:21
>>> Subject: Re: Updated VM-bridges document

>>>> The other thing is that Forwarding bridge should not use an adapter but a
>>>> bootstrap method.

>>> Can you explain exactly what you mean here? Because in my mind, the adapter _is_
>>> a bootstrap method -- it is code to which the VM upcalls at preparation / link
>>> time to help establish linkage.

>>> Since you obviously have a more specific notion of "bootstrap method", can you
>>> explain exactly what you mean?

>> it's a kind of bootstrap method, but not with the same ceremony as the one
>> used by indy or condy, i.e. no lookup and no constant bootstrap arguments (and
>> no name and no descriptor);
>> I propose to use the same protocol as with indy or condy.

>> Rémi

From brian.goetz at oracle.com Tue Apr 9 13:39:19 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 9 Apr 2019 09:39:19 -0400
Subject: Updated VM-bridges document
In-Reply-To: <1154574832.378215.1554791928007.JavaMail.zimbra@u-pem.fr>
References: <630462416.8392.1554672643213.JavaMail.zimbra@u-pem.fr> <691369778.302607.1554743893793.JavaMail.zimbra@u-pem.fr> <4BAD335E-F891-49E4-9992-79C9F5003C94@oracle.com> <1154574832.378215.1554791928007.JavaMail.zimbra@u-pem.fr>
Message-ID:

> OK, I see what you're getting at now. Yes, this is one of the implementation possibilities. I was mostly looking to validate the concepts before diving into the representational details.
One key point is that the default case should be able to proceed with no bootstrap; a small set of adaptations handles the most important cases, and avoiding an upcall is probably pretty desirable if we can get away with it.
>
> It's an optimization; I prefer that the VM recognize a specific BSM and not upcall it because its semantics are well known. It has the same effect, but it's an implementation detail and not something that needs to figure in the VM spec.

It's an optimization, but not in the way you think. If the purity of the spec were the only concern, then the approach you lay out would make perfect sense. But, there are other engineering realities, and taking on the full cost of bootstrap upcalls at this particular place in the VM -- when we don't have to yet -- may well be a significant (and not yet necessary) engineering effort.

From forax at univ-mlv.fr Tue Apr 9 14:10:04 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 9 Apr 2019 16:10:04 +0200 (CEST)
Subject: Updated VM-bridges document
In-Reply-To:
References: <630462416.8392.1554672643213.JavaMail.zimbra@u-pem.fr> <691369778.302607.1554743893793.JavaMail.zimbra@u-pem.fr> <4BAD335E-F891-49E4-9992-79C9F5003C94@oracle.com> <1154574832.378215.1554791928007.JavaMail.zimbra@u-pem.fr>
Message-ID: <222777058.667937.1554819004059.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "valhalla-spec-experts"
> Sent: Tuesday, April 9, 2019 15:39:19
> Subject: Re: Updated VM-bridges document

>>> OK, I see what you're getting at now. Yes, this is one of the implementation
>>> possibilities. I was mostly looking to validate the concepts before diving into
>>> the representational details. One key point is that the default case should be
>>> able to proceed with no bootstrap; a small set of adaptations handles the most
>>> important cases, and avoiding an upcall is probably pretty desirable if we can
>>> get away with it.
>> It's an optimization; I prefer that the VM recognize a specific BSM and not
>> upcall it because its semantics are well known. It has the same effect, but it's
>> an implementation detail and not something that needs to figure in the VM spec.

> It's an optimization, but not in the way you think. If the purity of the spec
> were the only concern, then the approach you lay out would make perfect sense.
> But, there are other engineering realities, and taking on the full cost of
> bootstrap upcalls at this particular place in the VM -- when we don't have to
> yet -- may well be a significant (and not yet necessary) engineering effort.

It doesn't seem wise to call the BSM directly when you are filling the vtable, but you can always generate an adapter (the "from" and "to" descriptors are known statically) that will call the BSM and replace itself.

A strawman implementation is to generate, during the parsing of the classfile, for each VM bridge a Java method with bytecode that does an invokedynamic call, the invokedynamic BSM taking the VM bridge BSM as a parameter.

Rémi

From karen.kinnear at oracle.com Tue Apr 9 16:51:05 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Tue, 9 Apr 2019 12:51:05 -0400
Subject: Valhalla EG minutes March 13, 2019
Message-ID: <0F51BBDF-0792-4F07-A184-C8013D978B9E@oracle.com>

Attendees: Dan H, Tobi, Remi, Brian, Simms, Frederic, Karen

corrections welcome (apologies for delay)

I. Value Types user model

Brian: Value types user model: V, V? (V? value set including null) replace .val and .box
Don't need all degrees of freedom. This is a simpler story, will capture in a follow-on writeup.
Box model was tacky and the wrong mental model. Replace with a nullability modifier.
Remi: did not particularly like val/box model.
Brian: Feedback please: null-default (no zero in value set) vs.
zero-default (no null in value set)
Reference, null-default value, zero-default value
nullability, zero ability, flattenability - get to pick 2: declaration and use site
Remi: Like concept - pick your zero
Brian: Class author does the bulk of the work: declaring null-default vs. zero-default
no accidental zeros, not leak into APIs
Remi: if array: if zero, can call method, if null, cannot

II. Substitutability

Dan H: concern performance for existing applications with changes to acmp bytecode
with naive acmp extra checks - regressions for native code 2.5-4% (DayTrader7 - java EE app)
a few optimizations to skip additional checks
a bit more regression on substitutability than on returning false
Brian: migration compatibility constraints:
Karen: value types as subtypes of Object and Interfaces - have to handle those cases with acmp anyway
Dan: suggest API - explore identity, substitutability, equals - and then bind to operations
biggest concern: can't opt out
Brian: tension semantics vs. perf
== was intended to be fasttrack for older hardware, pre-JIT - prior to calling .equals
Remi: concern security encapsulation
Dan: concern tree or graph: == could be slow based on complexity
.equals implementations most start internally with == anyway
Brian: What if users were just to call .equals?
Dan: concerned performance pothole when not expected - not want to migrate to value types and lose performance
if erased generics: expect .equals - using LIFE model
Brian: == does not imply identity today, it works for primitives
Kevin requested == be illegal and redirect to .equals
maybe we picked the wrong default here
can't sell "false" as approach, == is important to users (not so much bytecode underneath)
concern: performance characteristics
Dan: concern: how opt-in to substitutability
Karen: substitutability vs. equals? i.e. user overridable?
Dan: old code assumes LIFE
Karen: erased generics - old searches - I don't remember precise numbers, but actually maybe 60% used LIFE
other concern: non generics - use of Object and Interface - which don't use LIFE model
Brian: With LWorld, existing code will be exposed to value types
Dan: relying on == : primitives or identical - fast checks, old code not expecting substitutability checks
performance concern
Brian: correctness concern as well as performance
Remi: 2 kinds of value types:
e.g. Complex, Point - do not want a false return from ==
e.g. Cursor, Optional - want encapsulation
Dan: Point, Complex - just want operator overloading?
Brian: don't want to change semantics if you cast something to a super type
Dan: values performance depends on complexity of object graph
Remi: if override equals and hashcode, want substitutability
Karen: propose LW2 EA binary - with a flag - and get feedback
Remi: already sent example: if lambda is a value type, == return false -> lose performance trick
Brian: Dan - is your concern performance of existing code or performance of new? Perhaps explore API points?
Dan: looking for a way to opt-in to faster approach
e.g. their .equals might be faster than substitutability
costs for non-value types and existing code are biggest concern
Remi: what if change existing bytecodes from == to .equals?
Brian: difference between most code will work and all code will work
Remi: LWorld not a good idea?
Brian: migration was impossible before LWorld changed to moderately practical
Remi: concerns: so slow performance and encapsulation - defer to user?
Dan: operator overload -> defer to user?
Brian: biggest concern == not reflexive: x != x
Remi: Value Type NaN?
Brian: NaN - we hate with floating point, ok because infrequent
All - think about null-default/zero-default and stuff built on top
Dan: will go through this

III.
Specialized Generics - translation strategy

Follow-on vm internal meeting to discuss class file format for prototyping for specialized generics, including background reading of template class proposal (November 2017 - John Rose).
http://cr.openjdk.java.net/~jrose/values/template-classes.html

Remi: concern: Templates are C++ like
had issues with size of generated code
like partial evaluation
Brian: different from C++ templates
lots of sharing, intelligent erasure, wildcards
Remi: challenge if represent a Tuple by a LinkedList - add a new Link and create a new Type - great if erase, if not, one type per link
Brian: migration path - erase reference types
Will have a longer discussion when we have a proposal

thanks,
Karen

From brian.goetz at oracle.com Tue Apr 9 17:04:36 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 9 Apr 2019 13:04:36 -0400
Subject: generic specialization design discussion
In-Reply-To: <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com> <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu>
Message-ID: <983278D5-7CF3-45AA-8FC1-51530C19641F@oracle.com>

OK, let's make this problem a little simpler. The question of terminology in the JVMS is harder, but we have a syntax decision to make at the source code level. So far it's been proposed we replace "value class" with

inline class Foo { }

In addition to liking the sound of it, I like that it is more "modifier-y" than "value", meaning that it could conceivably be applied to other entities:

inline record R(int a);

inline enum Foo { A, B };

> On Apr 8, 2019, at 7:50 PM, Doug Lea
wrote:
>
> On 4/8/19 6:44 PM, John Rose wrote:
>
>> If we move terms around so "value" gets replaced by "inline",
>> and "reference" by "indirect", only the first two lines are
>> affected:
>>
>> "inline object", "indirect object" (also inline or indirect instance)
>> "inline class", "indirect class" (also inline or indirect type)
>>
>
> It would be nice not to use new terms for old concepts (even if new to
> Java). For alternative terminology originally stemming from similar
> struggles to make such distinctions, see UML "Composition" vs
> "aggregation" (also vs "association"). Wikipedia has some definitions:
> https://en.wikipedia.org/wiki/Class_diagram#Aggregation
>
> The UML specs say more but behind wall at https://www.omg.org/spec/UML
>
> -Doug
>

From john.r.rose at oracle.com Tue Apr 9 18:55:10 2019
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 9 Apr 2019 11:55:10 -0700
Subject: generic specialization design discussion
In-Reply-To: <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com> <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu>
Message-ID: <34D24401-0DB1-4B23-B670-1E89B7EFA4A7@oracle.com>

On Apr 8, 2019, at 4:50 PM, Doug Lea
wrote:
>
> On 4/8/19 6:44 PM, John Rose wrote:
>
>> If we move terms around so "value" gets replaced by "inline",
>> and "reference" by "indirect", only the first two lines are
>> affected:
>>
>> "inline object", "indirect object" (also inline or indirect instance)
>> "inline class", "indirect class" (also inline or indirect type)
>>
>
> It would be nice not to use new terms for old concepts (even if new to
> Java).

FTR, I agree and that's why I like "inline" and "indirect".

> For alternative terminology originally stemming from similar
> struggles to make such distinctions, see UML "Composition" vs
> "aggregation" (also vs "association"). Wikipedia has some definitions:
> https://en.wikipedia.org/wiki/Class_diagram#Aggregation

Quote from there: "Furthermore, there is hardly a difference between aggregations and associations during implementation..."

That's the problem with UML: It's too abstract. It doesn't give any guidance about implementation (nor should it). But the JVM needs to set expectations about dynamics as well as semantics, which means there's some hinting about implementation. Hence "inline" not "aggregation" and "indirect" not "associated". UML-native distinctions like 1-1 vs. 1-N don't (AFAICS) help us describe this lower-level distinction.

The deep reason for this is hidden in the JVMS word "same". Only a single JVM heap block is the "same object" as itself, if the object is a heap/indirect/regular object, which possesses a unique identity tracked from def to use. An inline object can be the "same object" as another inline object even if both occurrences of the "same object" have independent defs.

UML has a provision for describing state changes, but gives less help in characterizing objects which have no state. It seems to me that UML allows any object to have state, potentially. The abstraction might not let you observe the state, but it might be there. If UML were more explicit about object equivalence (==) we could mine terms out of there.
I'm not enough of a UML expert to find if or where such terms occur. Googling takes me to https://www.uml-diagrams.org/object.html where I find this definition of Object from UML 1.4.2:

> An entity with a well defined boundary and identity that
> encapsulates state and behavior.

There's a more nuanced one in UML 2.5, but it's still not useful AFAICS:

> An object is an individual [thing] with a state and relationships
> to other objects.

Looking at UML this little bit does suggest the modifier "stateless" (entailing "don't use UML on me"?) as an alternative to "value" and "inline". I prefer a positive term to a privative one, when possible, so "inline" is better for me than "stateless" ("that which has no state").

It's not like the UML experts are looking at the last two decades of our thrashing out the value model, and saying "yep, we wondered when you would get here". They are as stuck in the Smalltalk model as we were.

-- John

> The UML specs say more but behind wall at https://www.omg.org/spec/UML
>
> -Doug
>

From dl at cs.oswego.edu Tue Apr 9 19:31:34 2019
From: dl at cs.oswego.edu (Doug Lea)
Date: Tue, 9 Apr 2019 15:31:34 -0400
Subject: generic specialization design discussion
In-Reply-To: <983278D5-7CF3-45AA-8FC1-51530C19641F@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com> <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu> <983278D5-7CF3-45AA-8FC1-51530C19641F@oracle.com>
Message-ID: <56521b96-1b66-4b56-02b2-383b7e300207@cs.oswego.edu>

On 4/9/19 1:04 PM, Brian Goetz wrote:
> OK, let's make this problem a little simpler. The question of terminology in the JVMS is harder, but we have a syntax decision to make at the source code level. So far it's been proposed we replace "value class" with
>
> inline class Foo { }
>
> In addition to liking the sound of it, I like that it is more "modifier-y"
than "value", meaning that it could conceivably be applied to other entities:
>
> inline record R(int a);
>
> inline enum Foo { A, B };

I had sworn not to have opinions about syntax, because my reactions are probably not typical, but "inline" seems to under-stress issues users should keep in mind. How about "internal"?

internal class Foo();
internal record R();

-Doug

From brian.goetz at oracle.com Tue Apr 9 19:34:44 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 9 Apr 2019 15:34:44 -0400
Subject: generic specialization design discussion
In-Reply-To: <56521b96-1b66-4b56-02b2-383b7e300207@cs.oswego.edu>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com> <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu> <983278D5-7CF3-45AA-8FC1-51530C19641F@oracle.com> <56521b96-1b66-4b56-02b2-383b7e300207@cs.oswego.edu>
Message-ID: <42B9D410-8F46-4523-A502-7840A55B75DC@oracle.com>

> I had sworn not to have opinions about syntax, because my reactions are
> probably not typical, but "inline" seems to under-stress issues users
> should keep in mind. How about "internal"?
>
> internal class Foo();
> internal record R();

I think most users think "internal" is associated with encapsulation, such as packages not exported by a module.

But, let's step back: what issues that users should keep in mind need additional stressing?
From forax at univ-mlv.fr Tue Apr 9 19:36:28 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 9 Apr 2019 21:36:28 +0200 (CEST)
Subject: generic specialization design discussion
In-Reply-To: <983278D5-7CF3-45AA-8FC1-51530C19641F@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com> <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu> <983278D5-7CF3-45AA-8FC1-51530C19641F@oracle.com>
Message-ID: <842019274.758023.1554838588572.JavaMail.zimbra@u-pem.fr>

----- Original message -----
> From: "Brian Goetz"
> To: "Doug Lea"
> Cc: "valhalla-spec-experts"
> Sent: Tuesday, April 9, 2019 19:04:36
> Subject: Re: generic specialization design discussion

> OK, let's make this problem a little simpler. The question of terminology in
> the JVMS is harder, but we have a syntax decision to make at the source code
> level. So far it's been proposed we replace "value class" with
>
> inline class Foo { }
>
> In addition to liking the sound of it, I like that it is more "modifier-y" than
> "value", meaning that it could conceivably be applied to other entities:
>
> inline record R(int a);
>
> inline enum Foo { A, B };

It's very Kotlinish.

I like it more because it's not 'value' than for its name per se. I have started to dislike 'value' because it has moved the cursor too much to the primitive side (Complex) and less to the abstraction side (Optional). I think that choosing 'inline' is a step in the right direction. The right direction obviously being the one where we all acknowledge that == being the substitutability test should be opt-in :)

Rémi

>
>
>
>> On Apr 8, 2019, at 7:50 PM, Doug Lea
wrote:
>>
>> On 4/8/19 6:44 PM, John Rose wrote:
>>
>>> If we move terms around so "value" gets replaced by "inline",
>>> and "reference" by "indirect", only the first two lines are
>>> affected:
>>>
>>> "inline object", "indirect object" (also inline or indirect instance)
>>> "inline class", "indirect class" (also inline or indirect type)
>>>
>>
>> It would be nice not to use new terms for old concepts (even if new to
>> Java). For alternative terminology originally stemming from similar
>> struggles to make such distinctions, see UML "Composition" vs
>> "aggregation" (also vs "association"). Wikipedia has some definitions:
>> https://en.wikipedia.org/wiki/Class_diagram#Aggregation
>>
>> The UML specs say more but behind wall at https://www.omg.org/spec/UML
>>
>> -Doug

From john.r.rose at oracle.com Tue Apr 9 19:38:12 2019
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 9 Apr 2019 12:38:12 -0700
Subject: generic specialization design discussion
In-Reply-To: <34D24401-0DB1-4B23-B670-1E89B7EFA4A7@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com> <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu> <34D24401-0DB1-4B23-B670-1E89B7EFA4A7@oracle.com>
Message-ID:

On Apr 9, 2019, at 11:55 AM, John Rose wrote:
>
> It's not like the UML experts are looking at the last
> two decades of our thrashing out the value model,
> and saying "yep, we wondered when you would get
> here". They are as stuck in the Smalltalk model as
> we were.

P.S. I did a quick scan for mentions of object identity in this available document:

https://www.omg.org/spec/UML/2.5.1/PDF

The term "identity" is assumed but not defined, which IMO is a hallmark of Smalltalk-era object modeling, the kind of model we are struggling to get at arm's length so we can wrap our arms around it.
Here's a quote from page 151:

> The effect property may be used to specify what happens to objects passed in or out of a Parameter. It does not apply to parameters typed by data types, because these do not have identity with which to detect changes.

(Data types are like Java primitives or value types.)

Page 459, on identity tests:

> If an object is classified solely as an instance of one or more Classes, then testing whether it is the "same object" as another object is based on the identity of the object, independent of the current values for its StructuralFeatures or any links in which it participates (see sub clause 11.4.2).

Page 460 on class changes (e.g. CLOS change-class):

> The identity of the input object is preserved, no behaviors are executed, and no default value expressions are evaluated. The newClassifiers replace existing classifiers in an atomic step, so that structural feature values and links are not lost during the reclassification when the oldClassifiers and newClassifiers have structural features and associations in common.

If this paragraph were guarded by language saying "the input object must be of the Foo kind", then we could look for an anti-Foo in the spec. But it's not.

One useful bit is the definition of "data type", on page 167. And this is where the UML folks have the best claim to asking us "what took you so long?".

> 10.2.3.1 DataTypes
>
> A DataType is a kind of Classifier. DataType differs from Class in that instances of a DataType are identified only by their value. All instances of a DataType with the same value are considered to be equal instances.
>
> If a DataType has attributes (i.e., Properties owned by it and in its namespace) it is called a structured DataType. Instances of a structured DataType contain attribute values matching its attributes. Instances of a structured DataType are considered to be equal if and only if the structure is the same and the values of the corresponding attributes are equal.
>
> A DataType may be parameterized, bound, and used as TemplateParameters.

As a bonus, we also have:

> 10.2.3.2 Primitive Types
>
> A PrimitiveType defines a predefined DataType, without any substructure. A PrimitiveType may have algebra and operations defined outside of UML, for example, mathematically. The run-time instances of a PrimitiveType are values that correspond to mathematical elements defined outside of UML (for example, the Integers).

So, if we want to follow UML, we could call value/inline classes something like "data type" classes or "structured data" classes. (Now I sympathize with C# structs.) We might have to give up on saying that "classes have instances which are objects" and similar things because UML makes a strong distinction between identity-free data types and identity-laden objects.

One of our basic principles in Valhalla is "codes like a class". UML says "classifier" instead of "class", and seems to allow data types to have "classifiers", so that's OK. We'd have to give up our use of the term "object" or bend away from UML usage there, because our inline classes define structured data, not objects, in UML terms.

Basically, UML policy is to first distinguish by-value structured data types from by-identity objects, but that's not our policy. We build everything from objects. Oddly, this doesn't contradict UML as a modeling facility, but where UML allows identity-sensitive operations on arbitrary objects, we have to say (a) such an operation is partial, and applies only to some objects, or (b) such an operation is interpreted (as == and hashcode) without reference to identity, for objects which lack identity.

Maybe we could smuggle classifiers for NoIdentity and HasIdentity into a future version of UML, with appropriate bounds for UML's identity-sensitive operations? Then it could describe the structure we are building.

--
John

From maurizio.cimadamore at oracle.com Tue Apr 9 20:10:02 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 9 Apr 2019 21:10:02 +0100
Subject: generic specialization design discussion
In-Reply-To: <983278D5-7CF3-45AA-8FC1-51530C19641F@oracle.com>
References: <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com> <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu> <983278D5-7CF3-45AA-8FC1-51530C19641F@oracle.com>
Message-ID: <213d368c-5948-2385-a1b4-982d189b5fc9@oracle.com>

On 09/04/2019 18:04, Brian Goetz wrote:
> In addition to liking the sound of it, I like that it is more "modifier-y" than "value", meaning that it could conceivably be applied to other entities:
>
> inline record R(int a);
>
> inline enum Foo { A, B };
>

I like it too - especially because in C/C++ "inline" doesn't actually _force_ the compiler to do anything. So, I like the hint-y nature of this keyword and I think it brings front & center what this feature is about in a way that 'value' never really did (users asking about the difference between records and values is, I think, a proof of that particular failure).

Maurizio

From john.r.rose at oracle.com Tue Apr 9 21:06:19 2019
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 9 Apr 2019 14:06:19 -0700
Subject: generic specialization design discussion
In-Reply-To:
References: <213d368c-5948-2385-a1b4-982d189b5fc9@oracle.com> <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com> <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu> <983278D5-7CF3-45AA-8FC1-51530C19641F@oracle.com>
Message-ID: <1874D686-4E92-4888-9A54-3AF8CF37C916@oracle.com>

On Apr 9, 2019, at 1:41 PM, Daniel Heidinga wrote:
>
> Riffing on the "inline" term and tying things back to the flattenable discussions - what about using "flat" as the keyword?
> Not bad. More riffing: https://www.thesaurus.com/browse/flat also has "spread out" and "level". Etymologically, a single-layer structure is "simplex", as opposed to a folded "complex" structure. What's a class without identity? A "mere", "simple" class. https://www.thesaurus.com/browse/simple has a bunch of synonyms. https://www.thesaurus.com/browse/plain has "open" and "lucid" and "transparent". https://www.thesaurus.com/browse/featherweight has "lightweight", "agile", "nimble", "loose", "sheer" Regarding the ability to make the same object in many places, "uncopyrighted". https://www.thesaurus.com/browse/reproduce suggests "repeatable", "replicant". value class Foo { } inline class Foo { } flat class Foo { } simple class Foo { } simplex class Foo { } spread class Foo { } mere class Foo { } lightweight class Foo { } transparent class Foo { } nimble class Foo { } replicated class Foo { } From john.r.rose at oracle.com Tue Apr 9 22:03:45 2019 From: john.r.rose at oracle.com (John Rose) Date: Tue, 9 Apr 2019 15:03:45 -0700 Subject: generic specialization design discussion In-Reply-To: <1874D686-4E92-4888-9A54-3AF8CF37C916@oracle.com> References: <213d368c-5948-2385-a1b4-982d189b5fc9@oracle.com> <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com> <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu> <983278D5-7CF3-45AA-8FC1-51530C19641F@oracle.com> <1874D686-4E92-4888-9A54-3AF8CF37C916@oracle.com> Message-ID: On Apr 9, 2019, at 2:06 PM, John Rose wrote: > > Not bad. More riffing: Another one: "immediate" instead of "inline". Connotation from assembly code is "stuck in the middle of something else, not a variable". Etymology is "nothing between the user and the object, no mediator". Regular objects have *object identity* as the mediating factor between the object and every user. 
https://www.thesaurus.com/browse/immediate and https://www.thesaurus.com/browse/near-at-hand also have "direct", "adjacent", "close", "near", "contiguous", "available", and many more.

Vladimir I. suggests that an ideal keyword will suggest or imply immutability. "Immediate" does this, as well as suggesting that the thing is available (inline) close at hand.

From dl at cs.oswego.edu Wed Apr 10 12:11:42 2019
From: dl at cs.oswego.edu (Doug Lea)
Date: Wed, 10 Apr 2019 08:11:42 -0400
Subject: generic specialization design discussion
In-Reply-To:
References: <213d368c-5948-2385-a1b4-982d189b5fc9@oracle.com> <25220157-E4FF-45A7-B8F3-A4A963AB286E@oracle.com> <0D86D3E0-A8B9-45FF-8A69-84BD037007B9@oracle.com> <69C2B250-8E17-4279-A661-C61285C1A230@oracle.com> <8234fa64-bd23-872d-b465-6ef838faf72e@cs.oswego.edu> <983278D5-7CF3-45AA-8FC1-51530C19641F@oracle.com> <1874D686-4E92-4888-9A54-3AF8CF37C916@oracle.com>
Message-ID: <401126eb-614e-7584-fe8f-ac239421df4d@cs.oswego.edu>

On 4/9/19 6:03 PM, John Rose wrote:
>
> Another one: "immediate" instead of "inline".

Plausible. Following your thoughts in another post, the main property that ought to be clear to users is: Instances of X classes do not have independent identity.

If the audience were GOF Design Patterns readers, X would be "flyweight" (https://en.wikipedia.org/wiki/Flyweight_pattern).

If the audience were those familiar with run-time mechanics, X might be "interior" (such an instance is always at an offset of something else.)

But maybe Brian is right and "inline" is good enough.

-Doug

From john.r.rose at oracle.com Wed Apr 10 18:24:49 2019
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 10 Apr 2019 11:24:49 -0700
Subject: multi-def values vs. security, elucidated and solved
Message-ID: <4DA5D2DE-2DA7-46C2-B9FF-ABE57A8778B2@oracle.com>

One recurrent question about inlined value types is whether they are less secure than regular object types.

The question revolves around a scenario where an inlined value instance X functions as a security token, and the value of a private field of X (X.p) must be secured. In this scenario, the attacker creates a series of guesses G1, G2, ... which attempt to replicate the value X, substituting various guessed values for X.p (G1.p, G2.p, etc.). If the attacker finds a guess Gi where Gi==X, then the attacker has "unlocked" X by exposing the value of X.p, since it must be the same as Gi.p which the attacker has already guessed and now has confirmed.

This attack scenario is relatively narrow because it requires that the possible values of X.p can be enumerated in the time the attacker has to perform the attack. The time order for this attack is thus O(N) where N is the number of possible values of X.p.

(If X implements Comparable and X.p is a key in the comparison, then the attack can be performed in O(log N). This is often feasible where the O(N) attack is not.)

Why is this not a problem with classic indirect objects (those which have identity)? Because the tool for comparing Gi with X, the == operator, immediately returns false for any of the Gi, since those were created by the attacker.

(If X is a classic object which implements Comparable, then the attack is more feasible, even with classic objects, since the attacker can use the compareTo operation to bracket the X.p value between positive and negative results. This problem applies equally to classic indirect objects and inline value objects.)

So classic indirect objects are highly resistant to equality tests against attacker-created indirect objects, because the equality test will fail unless the attacker compares X with X itself, which gives the attacker no new information.

Meanwhile, inline value objects are not resistant to equality tests, so the guessing can eventually (in O(N) time) produce a match against X.

In short, an exact copy of the inline value object X can be forged (as a lucky Gi) by an independent party.

Pulling back from the attack per se, we can observe that a classic indirect object possesses an identity that is created at that object's defining site (a "new X" expression or bytecode). No other defining site in space or time will ever create the same identity.

An inline value object V possesses no such identity, and, therefore, several defining sites (a "new V" expression or invokestatic bytecode) can end up creating the *same* value, over and over again.

All occurrences of the same indirect object have the same defining site; they are all connected by a chain of data-flow from definition to use. Multiple occurrences of the same value may have *distinct* defining sites, *not* connected by chains of data flow. The first time the two copies of the same value come together might be when they are first compared. They will compare equal (if they are the same value), even though they came from different data-flow chains of definition to use (from two different definitions). This never happens for classic indirect objects.

This difference between classic indirect and new inline types suggests a defense against the attack scenario proposed above. What if we could ask a value type to emulate the special property that a definition-to-use data-flow chain is the only way for one value (of a given type X) to be a copy of itself? Forging a series of guesses G1, G2, ... would then be impossible.

In fact, this is readily done, and without damaging the other desirable properties of inline value types. Simply endow the type "X" with an extra private field "X.q" which is initialized (in the constructor of X) by the expression, "new Object()". This augmented version of X will (drum roll, please) possess a bona fide *object identity* which cannot be forged by an attacker.
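
The guessing attack and the identity-field defense described above can be sketched in ordinary Java. This is only an illustration under stated assumptions: plain Java has no inline classes, so the hypothetical `Token` class stands in for one by overriding `equals` to compare fields only (playing the role that `==` has for inline values in the discussion), and `GuardedToken` adds the extra field `q` initialized with `new Object()`. The class and method names are invented for this sketch.

```java
public class TokenForgeryDemo {
    // Stand-in for an inline value: equality looks only at field contents.
    // (Assumption of this sketch; it emulates == on inline values via equals.)
    static final class Token {
        private final int p;
        Token(int p) { this.p = p; }
        @Override public boolean equals(Object o) {
            return o instanceof Token && ((Token) o).p == p;
        }
        @Override public int hashCode() { return Integer.hashCode(p); }
    }

    // Same token, plus an identity-carrying field q as suggested in the post.
    static final class GuardedToken {
        private final int p;
        private final Object q = new Object(); // fresh identity per definition
        GuardedToken(int p) { this.p = p; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof GuardedToken)) return false;
            GuardedToken t = (GuardedToken) o;
            return t.p == p && t.q == q; // q compared by reference identity
        }
        @Override public int hashCode() {
            return 31 * p + System.identityHashCode(q);
        }
    }

    // O(N) attack: forge guesses until one compares equal to the secret.
    // Returns the recovered p, or -1 if no guess matched.
    static int guessPlain(Token secret, int n) {
        for (int g = 0; g < n; g++) {
            if (new Token(g).equals(secret)) return g;
        }
        return -1;
    }

    // The same attack against the guarded token; every forged guess carries
    // a different q, so no guess can compare equal.
    static int guessGuarded(GuardedToken secret, int n) {
        for (int g = 0; g < n; g++) {
            if (new GuardedToken(g).equals(secret)) return g;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("plain:   " + guessPlain(new Token(42), 100));
        System.out.println("guarded: " + guessGuarded(new GuardedToken(42), 100));
    }
}
```

In this sketch the plain token leaks its field after at most N comparisons, while the guarded token only ever equals values that flowed from its own definition site.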
If you think about this, the status of the JVM's invisible object header takes on a new aspect, that of a *field* which carries the *object identity*, and is *inherited* from the type of all classic indirect objects. We have sometimes called this hypothetical type "RefObject". The idea here is that every classic indirect object inherited, from RefObject, an object identity, notionally stored in the object header. (Actually, it's the address of the object header which is used, but the point remains that if you have a header, you can derive an object identity from it, by taking its address.) Meanwhile, every inline value object does *not* have such a header. (Some of its many copies *may* have headers, but these headers are prevented from being significant.) So an instance of C <: RefObject *inherits* an object identity from RefObject. Meanwhile, an inline value instance X is not an instance of RefObject, and does *not* inherit the header nor the object identity. *But*, if the instance X wishes to acquire an object identity, it can do so by *aggregation* instead of *inheritance*. Et voila; the upgraded version of X has no header, but its object identity lives on, in the field X.q. Problem solved. Therefore, if an inline value object is going to be used as an unforgeable security token, and the author is worried about an object-forging attack, the attack can be headed off by adding an object identity *field*. There will be a cost in footprint, but the object will continue to possess all the other properties of inline values, including flattenability. Perhaps the author of the class is already including a classic indirect object reference X.a in the class definition. If that is the case, a quick "clone" operation in the constructor before setting X.a can smuggle in an object identity without an increase in footprint. I think these observations adequately answer the persistent security concern about forging inline value objects. 
And they also help us understand more deeply "what's in a value".

-- John

From forax at univ-mlv.fr Wed Apr 10 19:00:55 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 10 Apr 2019 21:00:55 +0200 (CEST)
Subject: multi-def values vs. security, elucidated and solved
In-Reply-To: <4DA5D2DE-2DA7-46C2-B9FF-ABE57A8778B2@oracle.com>
References: <4DA5D2DE-2DA7-46C2-B9FF-ABE57A8778B2@oracle.com>
Message-ID: <253106973.1144120.1554922855410.JavaMail.zimbra@u-pem.fr>

There is another style of attack: inline classes are subject to tearing, so you may be able to forge an invalid value (with respect to the constructor checks) of an inline class from two other valid values.

For example,

    import java.util.stream.IntStream;

    public @__value__ class SecurityToken {
      private final long id1;
      private final long id2;

      public SecurityToken(long id1, long id2) {
        // invariant: both halves must match
        if (id1 != id2) {
          throw new IllegalArgumentException();
        }
        this.id1 = id1;
        this.id2 = id2;
      }

      public void check() {
        // fails if a torn (half-written) value is observed
        if (id1 != id2) {
          throw new IllegalStateException();
        }
      }

      public static void main(String[] args) {
        var tokens = new SecurityToken[1];
        // two threads race to read and write the same array element
        IntStream.range(0, 2).forEach(id -> {
          new Thread(() -> {
            for(;;) {
              tokens[0].check();
              tokens[0] = new SecurityToken(id, id);
            }
          }).start();
        });
      }
    }

You can already do that with long and double on 32-bit hardware, but here the values are immutable.

Rémi

From brian.goetz at oracle.com Wed Apr 10 19:17:37 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 10 Apr 2019 15:17:37 -0400
Subject: multi-def values vs. security, elucidated and solved
In-Reply-To: <4DA5D2DE-2DA7-46C2-B9FF-ABE57A8778B2@oracle.com>
References: <4DA5D2DE-2DA7-46C2-B9FF-ABE57A8778B2@oracle.com>
Message-ID: <44244bf8-5459-b09f-ca11-1260bc2e425a@oracle.com>

This is a fine technique for defending against such an attack (as is: don't publish constructors that would let callers create the Gi objects.) And I'm fine saying "if you want protection, add it."

I think Remi's concern is not that there is no defense, but that authors might not realize that defense is needed, and might forget to defend themselves. (Or, a too-clever maintainer might put `inline` on a class that they don't realize is being used as a capability token.)

From karen.kinnear at oracle.com Wed Apr 10 20:27:30 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 10 Apr 2019 16:27:30 -0400
Subject: Updated VM-bridges document
In-Reply-To:
References:
Message-ID: <3C1BFFE8-C200-4942-BBC0-4D95652C751E@oracle.com>

Brian,

Here is an example of when I believe we need to create a reverser - let me know if my assumptions don't match yours.

Migration step 1: author: Date -> LocalDateTime
old class D { m(Date); }

Migration step 2: change method declarer:
new class D {m(LDT);}
and javac creates a forwarder m(Date); -> Date->LDT/m(LDT);

class E extends D { m(Date); } which now overrides the forwarder.
We do not change class E. We do not recompile it (I don't know what recompilation would do here?)
old class ClientD invokevirtual D.m(Date) receiver:E

Migration step 3:
new class ClientD invokevirtual D.m(LDT) receiver:E
  resolution: finds D.m(LDT)
  selection: starts with E, there is no E.m(LDT) so call D.m(LDT)

It is my belief that the expected behavior is that we want to invoke E.m(Date) with asType signature matching.

To do that, I propose that if the vm detects overriding of a forwarder, that we need to generate a reverser:

E.m(Date) overrides D.m(Date) // forwarder: Date->LDT/invoke D.m(LDT)/return conversion

The reverser that we want would be
E.m(LDT) overrides D.m(LDT) // reverser: LDT->Date/invoke E.m(Date)/return reverse conversion

Corrections welcome,
thanks,
Karen

> On Apr 4, 2019, at 8:33 AM, Brian Goetz wrote:
>
> At the BUR meeting, we discussed reshuffling the dependency graph to do forwarding+reversing bridges earlier, which has the effect of taking some pressure off of the descriptor language. Here's an updated doc on forwarding-reversing bridges in the VM.
>
> I've dropped, for the time being, any discussion of replacing existing generic bridges with this mechanism; we can revisit that later if it makes sense. Instead, I've focused solely on the migration aspects. I've also dropped any mention of implementation strategy, and instead appealed to "as if" behavior.
>
>
> ## From Bridges to Forwarders
>
> In the Java 1.0 days, `javac` was little more than an "assembler" for the classfile format, translating source code to bytecode in a mostly 1:1 manner. And, we liked it that way; the more predictable the translation scheme, the more effective the runtime optimizations. Even the major upgrade of Java 5 didn't significantly affect the transparency of the resulting classfiles.
>
> Over time, we've seen small divergences between the language model and the classfile model, and each of these is a source of sharp edges.
> In Java 1.1 the addition of inner classes, and the mismatch between the accessibility model in the language and the JVM (the language treated a nest as a single entity; the JVM treated nest members as separate classes) required _access bridges_ (`access$000` methods), which have been the source of various issues over the years. Twenty years later, these methods were obviated by [_Nest-based Access Control_][jep181] -- which represents the choice to align the VM model to the language model, so these adaptation artifacts are no longer required.
>
> In Java 5, while we were able to keep the translation largely stable and transparent through the use of erasure, there was one point of misalignment; several situations (covariant overrides, instantiated generic supertypes) could give rise to the situation where two or more method descriptors -- which the JVM treats as distinct methods -- are treated by the language as if they correspond to the same method. To fool the VM, the compiler emits _bridge methods_ which forward invocations from one signature to another. And, as often happens when we try to fool the VM, it ultimately has its revenge.
>
> #### Example: covariant overrides
>
> Java 5 introduced the ability to override a method but to provide a more specific return type. (Java 8 later extended this to bridges in interfaces as well.) For example:
>
> ```{.java}
> class Parent {
>     Object m() { ... }
> }
>
> class Child extends Parent {
>     @Override
>     String m() { ... }
> }
> ```
>
> `Parent` declares a method whose descriptor is `()Object`, and `Child` declares a method with the same name whose descriptor is `()String`. If we compiled this class in the obvious way, the method in `Child` would not override the method in `Parent`, and anyone calling `Parent.m()` would find themselves executing the wrong implementation.
>
> The compiler addresses this by providing an additional implementation of `m()`, whose descriptor is `()Object` (an actual override), marked with `ACC_SYNTHETIC` and `ACC_BRIDGE`, whose body invokes `m()String` (with `invokevirtual`), redirecting calls to the right implementation.
>
> #### Example: generic substitution
>
> A similar situation arises when we have a generic substitution with a supertype. For example:
>
> ```{.java}
> interface Parent<T> {
>     void m(T x);
> }
>
> class Child implements Parent<String> {
>     @Override
>     public void m(String x) { ... }
> }
> ```
>
> At the language level, it is clear that `Child::m` intends to override `Parent::m`. But the descriptor of `Parent::m` is `(Object)V`, and the descriptor of `Child::m` is `(String)V`, so again a bridge is needed.
>
> Because the two signatures -- `m(Object)V` and `m(String)V` -- have been "merged" in this manner, the compiler will prevent subclasses from overriding the bridge signature, in order to maintain the integrity of the bridging scheme. (The first time you encounter an error message informing you of an illegal override in this situation, it can be extremely confusing!)
>
> #### Anatomy of a bridge method
>
> The bridge methods that are generated by the compiler today operate by _forwarding_. That is, a bridge method `m(X)` is always defined relative to some other method `m(Y)`, and the body of a bridge method pushes its arguments on the stack, adapting them (widening, casting, boxing, etc.) from X to Y, invokes `m(Y)` with `invokevirtual`, adapts the return value from Y to X, and returns it. Because the bridge uses `invokevirtual`, it need only be generated once, and invocations of the bridge may select a method in a subclass. (The bridge is generated at the "highest" place in the inheritance hierarchy where the need for a bridge is identified, which may be a class or an interface.)
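
The compiler-generated bridge for the covariant-override example can be observed with core reflection. The following is a minimal sketch; the class name `BridgeDemo` and the helper `bridgeCount` are invented here, and the method bodies are placeholders for the elided `{ ... }` in the example above.

```java
import java.lang.reflect.Method;

public class BridgeDemo {
    static class Parent {
        Object m() { return "parent"; }
    }

    static class Child extends Parent {
        @Override
        String m() { return "child"; } // covariant return type
    }

    // Counts declared methods named "m" that the compiler marked ACC_BRIDGE.
    static int bridgeCount(Class<?> c) {
        int n = 0;
        for (Method method : c.getDeclaredMethods()) {
            if (method.getName().equals("m") && method.isBridge()) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Child declares both String m() and a synthetic bridge Object m().
        for (Method method : Child.class.getDeclaredMethods()) {
            System.out.println(method.getReturnType().getSimpleName()
                    + " m()  bridge=" + method.isBridge()
                    + " synthetic=" + method.isSynthetic());
        }
        // A call through the Parent descriptor lands in Child's implementation
        // because the bridge forwards to m()String with invokevirtual.
        Parent p = new Child();
        System.out.println(p.m());
    }
}
```

`Child` reports two declared methods named `m`, one of which is the bridge, while `Parent` has none.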
>
> #### Bridges are brittle
>
> Bridges can be brittle under separate compilation (and there was a nontrivial bug tail initially). Separate compilation can move bridges from where already-compiled code expects them to be to places it does not expect them. This can cause the wrong method body to be invoked, or can cause "bridge loops" (resulting in `StackOverflowError`). (These anomalies disappear if the entire hierarchy is consistently recompiled; they are solely an artifact of inconsistent separate compilation.)
>
> The basic problem with bridge methods is that the language views the two method descriptors as two faces of the same actual method, whereas the JVM sees them as distinct methods. (And, reflection also has to participate in the charade.)
>
> #### Limits of bridge methods
>
> Bridge methods have worked well enough for the uses to which we've put them, but there are a number of desirable scenarios where bridge methods ultimately run out of gas. These scenarios stem from various forms of _migration_, and the desire to make these migrations binary-compatible.
>
> The problem of migration arises both from language evolution (Valhalla aims to enable compatible migration from value-based classes to value types, and from erased generics to specialized), as well as from the ordinary evolution of libraries.
>
> An example of the "ordinary migration" problem is the replacement of the old `Date` classes with `LocalDateTime` and friends. We can easily add the new classes to the JDK, along with conversions to and from the old types, but there are existing APIs that still deal in `Date` -- and if we ever want to be able to deprecate the old versions, we have to find a way to compatibly migrate APIs that deal in `Date` to the new types.
> (The extreme form of this is the "Collections 2.0" problem; we could surely write a new Collections library, but when nearly every API deals in `List`, unless we can migrate these away, what would be the point?)
>
> Migration scenarios like these pose two problems that bridge methods cannot solve:
>
> - **Fields.** While we can often reroute method invocations with bridges, we have no similar mechanism for fields. If a field signature changes (whether due to changes in the translation strategy, or changes in the API), there is no way to make this binary-compatible.
> - **Overrides.** Bridges allow us to reroute _invocations_ of methods, but not _overrides_ of methods. If a method descriptor in a non-final class changes, but has subclasses in a separate maintenance domain that continue to use the old descriptor, what is intended to be an override may accidentally become an overload, or might override the bridge instead of the actual method.
>
> #### Wildcards and polymorphic fields
>
> A non-migration application for bridges that comes out of Valhalla is _wildcards_. For a class `C<T>` with a method `m(T)`, the wildcard `C<?>` (the class type) has an abstract method `m(Object)`, which needs to be implemented by each species type. This is, effectively, a bridge; the method `m(Object)` generated for the species adapts the arguments and forwards to the "real" (`m(T)`) method. While this could be implemented using straightforward code generation in the static compiler, it may be preferable to treat this as a bridge as well.
>
> More importantly, the same is true for fields; if `C<T>` has a field of type `T`, then the wildcard `C<?>` will expose this field as if it were of type `Object`. This cannot be implemented using straightforward code generation in the static compiler (without undermining the promise of migration compatibility.)
>
> ## Forwarding
>
> In this document, we attempt to learn from the history of bridges, and
> create a new mechanism -- _forwarders_ -- that works with the JVM
> instead of against it. This raises the level of expressivity of
> classfiles and opens the possibility of greater laziness. It is
> possible that traditional bridging scenarios can eventually be handled
> by forwarders too, but for purposes of this document, we will focus
> exclusively on the migration scenarios.
>
> A _forwarder_ is a non-abstract method that, instead of a `Code`
> attribute, has a `Forwarding` attribute:
>
> ```
> Forwarding {
>     u2 name;
>     u4 length;
>     u2 forwardeeDescriptor;
> }
> ```
>
> Let's assume that forwarders have the `ACC_FORWARDER` and
> `ACC_SYNTHETIC` bits (in reality we will likely overload
> `ACC_BRIDGE`).
>
> When compiling a method (concrete or abstract) that has been migrated
> from an old descriptor to a new descriptor (such as migrating
> `m(Object)V` to `m(String)V`), the compiler would generate an ordinary
> method with the new descriptor, and a forwarder with the old
> descriptor which forwards to the new descriptor. This captures the
> statement that there used to be a method called `m` with the old
> descriptor, but it migrated to the new descriptor -- so that the JVM
> can transparently adjust the behavior of clients and overriders that
> were not aware of the migration.
>
> #### Invocation of forwarders
>
> Given a forwarder in a class with name `N` and descriptor `D` that
> forwards to descriptor `E`, define `M` by:
>
>     MethodHandle M = MethodHandles.lookup()
>                                   .findVirtual(thisClass, N, E);
>
> If the forwarder is _selected_ as the target of an `invokevirtual`,
> the behavior should be _as if_ the caller invoked `M.asType(D)`, where
> the arguments of `D` are adapted to their counterparts in `E`, and the
> return type in `E` is adapted back to the return type in `D`.
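The "as if" rule above can be approximated today with the existing `java.lang.invoke` API. The following is a minimal sketch, not the proposed VM mechanism: the class `Migrated` and its method `m(String)` are hypothetical stand-ins for a method that migrated from `m(Object)Object` to `m(String)String`, and the VM itself would not need to call `asType` literally.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class ForwarderSketch {
    // Hypothetical class: m used to have descriptor (Object)Object (D)
    // and has been migrated to (String)String (E).
    public static class Migrated {
        public String m(String x) { return "new:" + x; }
    }

    // Build M for the new descriptor E, then view it through the old descriptor D.
    public static MethodHandle forwarder() throws ReflectiveOperationException {
        MethodType e = MethodType.methodType(String.class, String.class);
        MethodHandle m = MethodHandles.lookup().findVirtual(Migrated.class, "m", e);
        // Method handle types include the receiver as the leading parameter.
        MethodType d = MethodType.methodType(Object.class, Migrated.class, Object.class);
        return m.asType(d);
    }

    // Simulates an old client that still invokes m with the old descriptor.
    public static Object callOld(Object arg) {
        try {
            return forwarder().invoke(new Migrated(), arg);
        } catch (RuntimeException | Error ex) {
            throw ex; // e.g. ClassCastException when arg is not a String
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callOld("x"));
    }
}
```

Passing a non-`String` argument through the adapted handle fails with `ClassCastException`, which is the `Object`-to-`String` failure mode this kind of adaptation has to account for.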
> (We may
> wish to reduce the set of built-in adaptations to a smaller set than
> those implemented by `MethodHandle::asType`, for simplicity, based on
> requirements.)
>
> Because forwarders exist for migration, we hope that over time,
> callers will migrate from the old descriptor to the new, rendering
> forwarders vestigial. As a result, we may wish to defer as much of
> the bridge generation logic as possible to first-selection time.
>
> #### Forwarders for fields
>
> The forwarding strategy can be applied to fields as well. In this
> case, the forwardee descriptor is that of a field descriptor, and the
> behavior has the same semantics as adapting a target field accessor
> method handle to the type of the bridge descriptor. (If the forwarder
> field is static, then the field should be static too.)
>
> #### Overriding of forwarders
>
> Capturing forwarding information declaratively enables us to detect
> when a class overrides a forwarder descriptor with a non-forwarder
> (which indicates that the subclass is out of date with its supertypes)
> and redirect the override to the actual method (with arguments and
> return values adapted).
>
> Given a forwarder in a class `A` with name `N` and descriptor `D` that
> forwards to descriptor `E`, suppose a subclass `B` overrides the
> forwarder with `N(D)`. Let `M` be the method handle that corresponds
> to the `Code` attribute of `B.N(D)`. We would like it to behave as if
> `B` had instead specified a method `N(E)`, whose `Code` attribute
> corresponded to `M.asType(E)`.
>
> #### Additional adaptations
>
> The uses we anticipate for L100 can all be done with `asType()`
> adaptations (in fact, with a subset of `asType()` adaptations).
> However, if we wish to support user-provided migrations (such as
> migrating libraries that use `Date` to `LocalDateTime`) or migrate
> complex JDK APIs such as `Stream`, we may need to provide additional
> adaptation logic in the `Forwarding` attribute.
> Let's extend
> the `Forwarding` attribute:
>
> ```
> Forwarding {
>     u2 name;
>     u4 length;
>     u2 forwardeeDescriptor;
>     u2 adapter;
> }
> ```
>
> where `adapter` is the constant pool index of a method handle whose
> type is `(MethodHandle;MethodType;)MethodHandle;` (note that the
> method handle for `MethodHandle::asType` has this shape). If
> `adapter` is zero, we use the built-in adaptations; if it is nonzero,
> we use the referred-to method handle to adapt between the forwarder
> and forwardee descriptors (in both directions).
>
> #### Adaptation failures and limitations
>
> Whatever adaptations we are prepared to do between forwarder and
> forwardee, we need to be prepared to do them in both directions; if a
> method `m(int)` is migrated to `m(long)`, invocation arguments will be
> adapted `int` to `long`, but if overridden, we'll do the reverse
> adaptation on the (out of date) overrider `m(int)`. Given that most
> adaptations are not between isomorphic domains, there will be cases in
> one direction or the other that cannot be represented (`long` to
> `int` is lossy; `Integer` to `int` can NPE; `Object` to `String` can
> CCE).
>
> Our guidance is that adaptations should form a projection/embedding
> pair; this gives us the nice property that we can repeat adaptations
> with impunity (if the first adaptation doesn't fail, adapting back and
> back again is guaranteed to be an identity). Even within this,
> though, there are often multiple ways to implement the adaptation; an
> embedding can throw on an out-of-range value, or it could pick an
> in-range target and map to that. So, for example, if we migrated
> `Collection::size` to return `long`, for `int`-desiring clients, we
> could clamp values greater than `MAX_VALUE` to `MAX_VALUE`, rather
> than throwing -- and this would likely be a better outcome for most
> clients. The choice of adaptation should ultimately be left to
> metadata present at the declaration of the migrated method.
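To make the `int`/`long` case concrete, here is a small sketch of the two choices described above for the lossy direction. The helper names (`widen`, `narrowStrict`, `narrowClamped`) are hypothetical and not part of any proposed API; they only illustrate throwing versus clamping, and the round-trip property once a value has been adapted successfully.

```java
public class SizeAdaptation {
    // int -> long: always representable, never fails.
    public static long widen(int value) {
        return value;
    }

    // long -> int, failing on out-of-range values (throws ArithmeticException).
    public static int narrowStrict(long value) {
        return Math.toIntExact(value);
    }

    // long -> int, mapping out-of-range values to the nearest in-range value.
    public static int narrowClamped(long value) {
        if (value > Integer.MAX_VALUE) return Integer.MAX_VALUE;
        if (value < Integer.MIN_VALUE) return Integer.MIN_VALUE;
        return (int) value;
    }

    public static void main(String[] args) {
        long hugeSize = 5_000_000_000L; // e.g. a migrated size() returning long
        int seenByOldClient = narrowClamped(hugeSize);
        System.out.println(seenByOldClient == Integer.MAX_VALUE);
        // After one successful adaptation, adapting back and forth is stable.
        System.out.println(narrowClamped(widen(seenByOldClient)) == seenByOldClient);
    }
}
```

Either narrowing variant composes with `widen` to the identity on values that are already in range; they differ only in how they treat values that an old `int`-based client cannot represent.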
>
> #### Type checking and corner cases
>
> A forwarder should always forward to a non-forwarder method (concrete
> or abstract) _in the same class_. (Because they are in the same
> class, there is no chance that separate compilation can cause a
> forwarder to point to another forwarder.)
>
> In general, we expect that forwarders are only ever overridden by
> non-forwarder methods (and then, only in out-of-date classfiles).
> (This means that invocations that resolve to the forwarder will
> generally select the forwarder.)
>
>  - If a forwarder method is overridden by another forwarder method,
>    this is probably a result of a migration happening in a subclass
>    and then later the same migration happens in a superclass. We can
>    let the override proceed.
>  - If a forwarder is overridden by a legacy bridge, we have a few bad
>    choices. We could accept the bridge (which would interfere with
>    forwarding), or discard the bridge (which could cause other
>    anomalies). If we leave existing bridge generation alone, this
>    case is unlikely and accepting the bridge is probably a reasonable
>    answer; if we migrate bridges to use forwarding, we'd probably
>    want to err in the other direction.
>  - If a forwarder has a forwardee descriptor that is exactly the
>    same as the forwarder, the forwarder should be discarded. (These
>    can arise from specialization situations.)
>
> [jep181]: https://openjdk.java.net/jeps/181
>

From brian.goetz at oracle.com Wed Apr 10 21:22:46 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 10 Apr 2019 17:22:46 -0400
Subject: Updated VM-bridges document
In-Reply-To: <3C1BFFE8-C200-4942-BBC0-4D95652C751E@oracle.com>
References: <3C1BFFE8-C200-4942-BBC0-4D95652C751E@oracle.com>
Message-ID:

OK, so in the old world, D has m(Date).
> Migration step 1: author: Date -> LocalDateTime
>
> old class D { m(Date); }
> Migration step 2: change method declarer: new class D {m(LDT);}
> and javac creates a forwarder m(Date); -> Date->LDT/m(LDT);

Now, D has m(LDT), with a forwarder from m(Date) -> m(LDT), with some sort of metadata stapled somewhere to effect the Date <--> LDT conversions.

> class E extends D { m(Date); } which now overrides the forwarder.
> We do not change class E. We do not recompile it (I don't know what recompilation would do here?)

On recompilation, we could do one of three things:

1. Error: you're overriding a bridge, fix your program!
2. Warning: you're overriding a bridge, I'll fix it for you (compiler adapts m(Date) to m(LDT)).
3. Warning: you're overriding a bridge, I'll believe you, and the VM will fix it for you (bringing us back to where you started: "we do not change E.")

Which we choose at compilation time doesn't really affect what the VM has to do (you still have to deal with the unrecompiled E), so we can make this decision later.

> old class ClientD invokevirtual D.m(Date) receiver:E
> Migration step 3: new class ClientD invokevirtual D.m(LDT) receiver:E
> resolution: finds D.m(LDT)
> selection: starts with E, there is no E.m(LDT) so call D.m(LDT)

OK, so at this point, the classfiles that have been loaded look like:

    class D {
        void m(LDT) { real method }
        @Forwarding(m(LDT)) abstract void m(Date);
    }

    class E extends D {
        @Override
        m(Date) { impl }
    }

So D has members m(LTD) and m(Date), the latter is a forwarder. Therefore E has the same members (instance methods are inherited).

Here's how I would imagine this turns into in the VM:

    class D {
        void m(LTD) { real method }
        void m(Date d) { m(adapt(d)); }  // generated forwarder
    }

    class E extends D {
        private void m$synthetic(Date d) { real method, body as present in classfile }
        void m(LTD ltd) { m$synthetic(adapt(ltd)); }  // generated reverser
    }

                                resolves      selects
    invokevirtual D::m(LTD)     D::m(LTD)     E::m(LTD)
    invokevirtual D::m(Date)    D::m(Date)    D::m(Date), forwards to invvir D::m(LTD)
                                              In turn, selects E::m(LTD)
    invokevirtual E::m(LTD)     E::m(LTD)     E::m(LTD)
    invokevirtual E::m(Date)    D::m(Date)    D::m(Date), forwards to invvir D::m(LTD)
                                              In turn, selects E::m(LTD)

In other words, we arrange that once the vtable is laid out, it is as if no one ever overrides the forwarders -- they only override the real method. Hence the reverser is needed only where a class (like E) actually overrides a descriptor that corresponds to a forwarder.

> It is my belief that the expected behavior is that we want to invoke E.m(Date) with asType signature matching.
> To do that, I propose that if the vm detects overriding of a forwarder, that we need to generate a reverser:
>
> E.m(Date) overrides D.m(Date) // forwarder: Date->LDT/invoke D.m(LDT)/return conversion
>
> The reverser that we want would be
> E.m(LDT) overrides D.m(LDT) // reverser: LDT->Date/invoke E.m(Date)/return reverse conversion

I think we want: a reverser for E::m(LTD), but not for E::m(Date). Are we saying the same thing?

From john.r.rose at oracle.com Wed Apr 10 21:55:16 2019
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 10 Apr 2019 14:55:16 -0700
Subject: multi-def values vs. security, elucidated and solved
In-Reply-To: <44244bf8-5459-b09f-ca11-1260bc2e425a@oracle.com>
References: <4DA5D2DE-2DA7-46C2-B9FF-ABE57A8778B2@oracle.com> <44244bf8-5459-b09f-ca11-1260bc2e425a@oracle.com>
Message-ID: <598BEDFF-3F86-4FB6-A870-BB338572B9F2@oracle.com>

On Apr 10, 2019, at 12:17 PM, Brian Goetz wrote:
>
> This is a fine technique for defending against such an attack (as is, don't publish constructors that would let callers create the Gi objects.) And I'm fine saying "if you want protection, add it."

Good.

> I think Remi's concern is not that there is no defense,

(Well, there is no defense until we implement the def-site atomicity declaration.)
> but that authors might not realize that defense is needed, and might forget to defend themselves.

We need to add tearing to the various documents that security folks read.

> (Or, a too-clever maintainer might put `inline` on a class that they don't realize is being used as a capability token.)

This suggests that we should add the "non-inline" keyword, so that an author who has a good reason *not* to make a class inline, can advertise the decision in a checkable manner.

-- John

P.S. I have *often* wished for a "non-public" keyword; you can find my string "/*non-public*/" over 100 times in the code for java.lang.invoke classes, to prevent a too-clever maintainer from accidentally adding to a public API. (Likewise, "non-final" on variables which might otherwise be final. That would be a nice problem to have.)

From karen.kinnear at oracle.com Thu Apr 11 19:20:40 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Thu, 11 Apr 2019 15:20:40 -0400
Subject: Valhalla EG notes April 10, 2019
Message-ID: <34DC8D74-2976-425F-BF50-7258B633091D@oracle.com>

Attendees: Remi, Tobi, Dan H, John, Brian, Simms, Frederic, Karen

AIs:
1. Remi - list of P3 bugs for condy rfes
2. Remi - Amber combinator suggestion
3. Karen - forwarder example for which we will need a reverser
4. editor note - I asked Frederic to forward an example of random flattening/performance impact if the vm chooses where to flatten for cycles

I. Condy - requests for java support for several smaller features
  - condy lambda
  - condy enum switch
  - condy constant arrays
  - future: lazy static final

II. Lazy static final
Remi exploring in lworld
2 part init: getstatic, or if in same class - ldc locally
experiments e.g. empty main - 100 static constants initialized in JDK and not yet read - will post (TIA!)

III. Valhalla offsite follow-up
Updated phasing email: http://mail.openjdk.java.net/pipermail/valhalla-dev/2019-April/005555.html

1. moved null-default and value-based class migration L10 -> L20

2. circularity handling for value type fields - proposed experiment with vm detection
Remi: if VM determines where to "stop" flattening the results will be random locations - which will change performance
Karen: Frederic prototyping in progress
  - choice of field to flatten is random: based on dynamic loading order
John: give tools responsibility, so vm doesn't make the decision, potential user model issue
(ed. note: more discussion to come - including options such as not flattening any field involved in circularity/performance cost, tool choice, user model choice)

3. Meaning of "L" descriptor
John: "L" descriptor is: "by pointer", nullable, not flattenable, not pre-load
Karen: still under discussion how to indicate things like flattenable, null-free, ...

4. Migration support: "forwarder"
http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2019-April/000912.html
Karen: To enable value-based-class migration to null-default value types
  current proposal is to require source file changes with incremental benefits
  and to support migration via improvements in migration approaches
  - believe similar approach could be used to support allowing wildcards to work with species receivers
Brian: new word "forwarder" - not the same as existing bridges
  - classical bridges are not too horrible today (mostly shaken out)
  - new forwarder is actually bidirectional
Remi: Dynamically typed language call from java: must match signature
  - could use forwarder, does not work today with bridges
Brian: constraints on forwarders
  1. same access controls
  2. same name
  3. local definition (vs. inherited)
  will discuss explicit trade-offs
Remi: local definition not a big restriction - not too tight, not a big deal
Brian: could lose later, need for LW20
Karen: propose: all access flags and attributes for field and method forwarders match forwardee (except ACC_BRIDGE/ACC_SYNTHETIC or whatever)
Brian: must Abstract match? Will go through all bits and see which are important
Remi: today generate method with indy in it in dynamically typed languages
Brian: forwarding for client, may need to override forwarder
Remi: overriding - no way (ed. note - ??)
Brian: override original signatures - resolve to
Karen:
  1. AI: email example where overriding (which is only relevant on selection) means reverser needed
  2. rather than forwarder refer to forwardeedescriptor - need to refer to either method_info or field_info
     (ed. note - this is NOT a methodref/fieldref - JVMS 4.6 method_info, JVMS 4.5 field_info, note that the forwardee has a code attribute as well as a descriptor_index)
  3. need a pair of adaptors - so we can generate the reverser
Brian: use asType in both directions
Karen: performance concern - the vm is not calling MH.asType
  I get that for LW20, L->Q: don't need the full set, we are just performing checkcasts
  we will want vm internal subset identified
Remi: composition - may want several forwarders
Brian: all in the same class file?
  yes, in future e.g. Date -> LocalDateTime (LDT) - yes will need multiple forwarders
  initial milestone limited set
Karen: helpful to design with longer term goals
  concern user adaptors - tight restrictions
Brian: restrict tightly to start
Karen: could use a use case for multiple forwarders
Brian: e.g. A -> B -> C
  javac may be able to unroll to the real forwardee - so may be able to avoid
Remi: adapt different parameters
Brian: yes, potential cross-product
Remi: JIT - could inline adaptor
Ok: with a 2 phase forwarder
  1. L/Q for L20
  2. for others: L100

5. Terminology bikeshed:
Brian: name for "value class"? Consensus from email on "inline class"
Remi: ok with "inline" or "immediate", "immediate" too long. Kotlin uses "inline" already
Dan H: concern: lose discussion history if rename value type to inline
Brian: treat as GC's term value type
Dan H: must teach from scratch
Brian: those who know will get it, 99% need to be taught anyway
Karen: if we are revisiting terminology - can we have reference cover all classes (etc) and have a split?
John: e.g. JVMS uses reference generally - e.g. all a* bytecodes
  What about "indirect"?
Brian: carrier thing - indirect carrier for identity vs. inline class

6. RefObject vs. ValObject (new names coming)
Brian: coming around to interfaces
  then what is Object?
  if you say new Object - is it likely you want to be able to lock?
Dan H: any runtime benefit to making interfaces?
Brian: Not have to change superclass hierarchy
Dan H: verifier benefit if superclass rather than superinterface
  - verifier doesn't do interface subtype checking - left until runtime
Brian: what if RefObject erases to Object?
  e.g. acmp issue - what if old code only wants classical Object
  - way to represent only RefObject without changing signature?
  - source RefObject -> binary Object ?
Dan H: future compatibility bigger concern
  - old code only want RefObject - could enforce in signatures
  - JDK needs bridges - perhaps rest could get away without bridges?
  - prefer class rather than interface: interface runtime check is a performance cost
John: even with erasure? vm has permission to perform strong speculation
Karen: concern 1: "magic" split with erasure proposal - same source generates two different binaries?
  - how do you know which the user meant? All Object or just RefObject?
  concern 2: if I understand the proposal - this would mean the authors of old code would be the ones that need to make a change
  better if new code needs to make a change
John: better with World, shunt off newcomers, especially arrays
Karen: old code - work to ensure same performance
John: array loop performance challenges
Frederic: flattened arrays in the interpreter are worst case, need copies

7. MH combinators
Remi: AI: email another potential combinator for Amber
  indy with initial arg which is constant: can get 10-20x performance (e.g. formatter)
  - need GuardwithTest: if constant do this, else do that
John: asType - may not be quite correct - today it checks interfaces (does not honor the verifier convention)
  - also no null checks today
  (ed. note - I'm confused - runtime has to make up for verifier lack today - isn't asType - runtime?)
Karen: will forwarders include null checks?
John: yes: primitive box/unbox
  maybe not checkcast if Interface<->Object // since verifier will let this through
  (ed. note checkcast would be there for the runtime)
  may want asBridgeType with null checks

corrections welcome,
thanks,
Karen

From brian.goetz at oracle.com Thu Apr 11 19:52:23 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 11 Apr 2019 15:52:23 -0400
Subject: Updated VM-bridges document
In-Reply-To:
References:
Message-ID: <7dcb8de2-0437-d72b-38c1-f5fecba489b0@oracle.com>

This was received through a side channel:

> From: sebastian.sickelmann at gmx.de
> Subject: Re: Updated VM-bridges document
>
> Hi,
> I have a question regarding the discussed forwarding scheme for fields.
>
> Should it be possible to forward field access to methods, so that
> we can safely remove public fields from the JDK in a binary-compatible
> way? Some time ago I experimented[1] with such a feature but unfortunately
> I haven't had the time to continue on this.
>
> There are some other things to solve when we map field access to method calls.
> One example is that we need a bootstrapping result for the put and the get, I think.
> It would be nice if "forwarders for fields" could solve the issue of public fields
> in the JDK API in future.
>
> --
> Sebastian
>
> [0] http://mail.openjdk.java.net/pipermail/discuss/2015-December/003863.html (last link to multiple short threads)

While the mechanism could surely be used for this (and more), I would prefer not to.
The two main challenges for which we need field forwarders _at all_ are for access to fields through the wildcard type, and for migrating fields from value-based classes to value types. An example for wildcards:

    class Foo<T> {
        T t;
    }

The specialized type `Foo<int>` will have a `t : int` field, but the wildcard type `Foo<?>` must be seen to have a `t : Object` field. But in reality, there needs to be one field. This is the main challenge for which we need field forwarder support.

Migrating from fields to methods is an attractive target, but with it comes a great risk for abuse: "lookie, here's properties." And the reason I don't want to support this is that field access `t.f` has a known cost model (field access is fast) and known error model (throws few exceptions). The more that arbitrary code could be run at field access time, the more we undermine that. So I want to restrict field bridges to:

 - Bridges for an _actual field only_
 - Limited set of adaptations: cast/null-check only

This is enough to support the wildcard case and the L->Q migration case. I would strongly prefer to stop there.

> #### Forwarders for fields
>
> The forwarding strategy can be applied to fields as well. In this
> case, the forwardee descriptor is that of a field descriptor, and the
> behavior has the same semantics as adapting a target field accessor
> method handle to the type of the bridge descriptor. (If the forwarder
> field is static, then the field should be static too.)
>

From frederic.parain at oracle.com Thu Apr 11 20:22:43 2019
From: frederic.parain at oracle.com (Frederic Parain)
Date: Thu, 11 Apr 2019 16:22:43 -0400
Subject: Valhalla EG notes April 10, 2019
In-Reply-To: <34DC8D74-2976-425F-BF50-7258B633091D@oracle.com>
References: <34DC8D74-2976-425F-BF50-7258B633091D@oracle.com>
Message-ID: <1A970A7B-AD56-4827-AE2E-B19221B39CBB@oracle.com>

> On Apr 11, 2019, at 15:20, Karen Kinnear wrote:
>
> 2. circularity handling for value type fields - proposed experiment with vm detection
> Remi: if VM determines where to "stop" flattening the results will be random locations - which will change performance
> Karen: Frederic prototyping in progress -
> - choice of field to flatten is random: based on dynamic loading order
> John: give tools responsibility, so vm doesn't make the decision, potential user model issue
> (ed. note: more discussion to come - including options such as not flattening any field involved in circularity/performance cost, tool choice, user model choice)

Here are the results of the exploration:

1 - The JVM could deal with cycles by stopping field flattening.
The class loading and field layout computation have been updated to
support cycles without major issues. CI was not fixed; it currently enters
an infinite recursive loop, but after a discussion with Tobias, it seems that
we should be able to handle that properly.

2 - The next question is what to do when a cycle is detected. The solution
implemented in the prototype was to try to flatten as much as possible,
and to refuse to flatten the last field closing the cycle. The problem with this
strategy is that the layout of data structures depends on which class of the cycle
is loaded first. For the end user, this means that performance will depend
on class loading order, something that the user doesn't necessarily control.

Example with the test program attached to this mail. The argument controls
execution of different branches which trigger class loading in different order.
Then, whatever argument has been passed, the same loop is executed
(runs are using the interpreter because of the CI issue; with a JIT, the
differences should be less significant):

    fparain-mac:valhalla fparain$ ./build/macosx-x64-debug/jdk/bin/java -XX:+EnableValhalla -Xint CycleTest A
    Average: 647.0 ops/ms
    fparain-mac:valhalla fparain$ ./build/macosx-x64-debug/jdk/bin/java -XX:+EnableValhalla -Xint CycleTest B
    Average: 890.0 ops/ms
    fparain-mac:valhalla fparain$ ./build/macosx-x64-debug/jdk/bin/java -XX:+EnableValhalla -Xint CycleTest C
    Average: 642.0 ops/ms

And the explanation of the difference in throughput comes directly from
the difference in layouts:

With argument A:

    Class CycleTest$A [@app]:
    @ 16 "i" I
    @ 24 "b" QCycleTest$B; // flattenable and flattened
    @ 24 "j" I
    @ 32 "c" QCycleTest$C; // flattenable and flattened
    @ 32 "k" I
    @ 36 "a" QCycleTest$A; // flattenable not flattened

With argument B:

    Class CycleTest$A [@app]:
    @ 16 "i" I
    @ 20 "b" QCycleTest$B; // flattenable not flattened

With argument C:

    Class CycleTest$A [@app]:
    @ 16 "i" I
    @ 24 "b" QCycleTest$B; // flattenable and flattened
    @ 24 "j" I
    @ 28 "c" QCycleTest$C; // flattenable not flattened

Dan suggested another strategy to ensure consistent layouts and performance: whenever a cycle
is detected, none of the fields involved in this cycle is flattened. We can implement this solution too.

The other solutions that have been proposed rely on the user or javac to prevent the creation of
cycles. These solutions don't require modification of the JVM; it would keep its current behavior,
which is to throw a ClassCircularityError when it detects a cycle.
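As an illustration of the "flatten nothing on a cycle" strategy, here is a rough sketch of a load-order-independent decision. It is not the prototype's code: the class `FlatteningCycles`, its methods, and the representation of classes as a map from class name to the types of its flattenable fields are all assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FlatteningCycles {
    // True if 'target' can be reached from 'start' by following flattenable fields.
    static boolean reaches(Map<String, List<String>> fields, String start, String target) {
        Deque<String> pending = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        pending.push(start);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (current.equals(target)) return true;
            if (!seen.add(current)) continue; // already explored
            for (String next : fields.getOrDefault(current, List.of())) {
                pending.push(next);
            }
        }
        return false;
    }

    // A field of type 'type' declared in 'owner' is flattened only if it is not
    // part of any cycle, i.e. 'type' cannot lead back to 'owner'.
    public static boolean mayFlatten(Map<String, List<String>> fields, String owner, String type) {
        return !reaches(fields, type, owner);
    }

    public static void main(String[] args) {
        // Same shape as the test program: A holds a B, B holds a C, C holds an A.
        Map<String, List<String>> fields = Map.of(
                "A", List.of("B"),
                "B", List.of("C"),
                "C", List.of("A"));
        System.out.println(mayFlatten(fields, "A", "B"));
        System.out.println(mayFlatten(fields, "B", "C"));
        System.out.println(mayFlatten(fields, "C", "A"));
    }
}
```

Because the answer depends only on the shape of the field graph and not on which class is loaded first, all three fields in the example get the same treatment regardless of the program argument.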
Fred

-------------- next part --------------

From brian.goetz at oracle.com Thu Apr 11 20:44:22 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 11 Apr 2019 16:44:22 -0400
Subject: Valhalla EG notes April 10, 2019
In-Reply-To: <1A970A7B-AD56-4827-AE2E-B19221B39CBB@oracle.com>
References: <34DC8D74-2976-425F-BF50-7258B633091D@oracle.com> <1A970A7B-AD56-4827-AE2E-B19221B39CBB@oracle.com>
Message-ID: <979091f0-b85c-8840-ffff-8e60ea1ae21c@oracle.com>

To me, getting fancy here sounds like borrowing trouble; it seems much simpler -- and perfectly reasonable -- to reject cycles at both compile time and run time, and let users use `V?` in the place they want to break their cycles. (Assuming we're comfortable with `V?` meaning not flattened, which is the choice we're also making for specialization.)

On 4/11/2019 4:22 PM, Frederic Parain wrote:
>> On Apr 11, 2019, at 15:20, Karen Kinnear wrote:
>>
>> 2. circularity handling for value type fields - proposed experiment with vm detection
>> Remi: if VM determines where to "stop" flattening the results will be random locations - which will change performance
>> Karen: Frederic prototyping in progress -
>> - choice of field to flatten is random: based on dynamic loading order
>> John: give tools responsibility, so vm doesn't make the decision, potential user model issue
>> (ed. note: more discussion to come - including options such as not flattening any field involved in circularity/performance cost, tool choice, user model choice)
> Here are the results of the exploration:
> 1 - The JVM could deal with cycles by stopping field flattening.
> The class loading and field layout computation have been updated to
> support cycles without major issues. CI was not fixed; it currently enters
> an infinite recursive loop, but after a discussion with Tobias, it seems that
> we should be able to handle that properly.
>
> 2 - The next question is what to do when a cycle is detected. The solution
> implemented in the prototype was to try to flatten as much as possible,
> and to refuse to flatten the last field closing the cycle. The problem with this
> strategy is that the layout of data structures depends on which class of the cycle
> is loaded first. For the end user, this means that performance will depend
> on class loading order, something that the user doesn't necessarily control.
>
> Example with the test program attached to this mail. The argument controls
> execution of different branches which trigger class loading in different order.
> Then, whatever argument has been passed, the same loop is executed
> (runs are using the interpreter because of the CI issue; with a JIT, the
> differences should be less significant):
>
>     fparain-mac:valhalla fparain$ ./build/macosx-x64-debug/jdk/bin/java -XX:+EnableValhalla -Xint CycleTest A
>     Average: 647.0 ops/ms
>     fparain-mac:valhalla fparain$ ./build/macosx-x64-debug/jdk/bin/java -XX:+EnableValhalla -Xint CycleTest B
>     Average: 890.0 ops/ms
>     fparain-mac:valhalla fparain$ ./build/macosx-x64-debug/jdk/bin/java -XX:+EnableValhalla -Xint CycleTest C
>     Average: 642.0 ops/ms
>
> And the explanation of the difference in throughput comes directly from
> the difference in layouts:
>
> With argument A:
>
>     Class CycleTest$A [@app]:
>     @ 16 "i" I
>     @ 24 "b" QCycleTest$B; // flattenable and flattened
>     @ 24 "j" I
>     @ 32 "c" QCycleTest$C; // flattenable and flattened
>     @ 32 "k" I
>     @ 36 "a" QCycleTest$A; // flattenable not flattened
>
> With argument B:
>
>     Class CycleTest$A [@app]:
>     @ 16 "i" I
>     @ 20 "b" QCycleTest$B; // flattenable not flattened
>
> With argument C:
>
>     Class CycleTest$A [@app]:
>     @ 16 "i" I
>     @ 24 "b" QCycleTest$B; // flattenable and flattened
>     @ 24 "j" I
>     @ 28 "c" QCycleTest$C; // flattenable not flattened
>
> Dan suggested another strategy to ensure consistent layouts and performance: whenever a cycle
> is detected, none of the fields involved in this cycle is flattened. We can implement this solution too.
>
> The other solutions that have been proposed rely on the user or javac to prevent the creation of
> cycles. These solutions don't require modification of the JVM; it would keep its current behavior,
> which is to throw a ClassCircularityError when it detects a cycle.
>
> Fred
>

From forax at univ-mlv.fr Thu Apr 11 20:55:40 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 11 Apr 2019 22:55:40 +0200 (CEST)
Subject: Updated VM-bridges document
In-Reply-To: <7dcb8de2-0437-d72b-38c1-f5fecba489b0@oracle.com>
References: <7dcb8de2-0437-d72b-38c1-f5fecba489b0@oracle.com>
Message-ID: <1134367707.1445387.1555016140897.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "valhalla-spec-experts"
> Sent: Thursday, April 11, 2019 21:52:23
> Subject: Re: Updated VM-bridges document

> This was received through a side channel:

>> From: sebastian.sickelmann at gmx.de
>> Subject: Re: Updated VM-bridges document

>> Hi,
>> I have a question regarding the discussed forwarding scheme for fields.

>> Should it be possible to forward field access to methods, so that
>> we can safely remove public fields from the JDK in a binary-compatible
>> way? Some time ago I experimented[1] with such a feature but unfortunately
>> I haven't had the time to continue on this.

>> There are some other things to solve when we map field access to method calls.
>> One example is that we need a bootstrapping result for the put and the get, I
>> think.
>> It would be nice if "forwarders for fields" could solve the issue of public
>> fields in the JDK API in future.

>> --
>> Sebastian

>> [0] http://mail.openjdk.java.net/pipermail/discuss/2015-December/003863.html
>> (last link to multiple short threads)

> While the mechanism could surely be used for this (and more), I would prefer not
> to.
The two main challenges for which we need field forwarders _at all_ are for
> access to fields through the wildcard type, and for migrating fields from
> value-based classes to value types. An example for wildcards:

> class Foo<T> {
>     T t;
> }

> The specialized type `Foo<int>` will have a `t : int` field, but the wildcard
> type `Foo<?>` must be seen to have a `t : Object` field. But in reality, there
> needs to be one field. This is the main challenge for which we need field
> forwarder support.

> Migrating from fields to methods is an attractive target, but with it comes a
> great risk for abuse: "lookie, here's properties." And the reason I don't want
> to support this is that field access `t.f` has a known cost model (field access
> is fast) and known error model (throws few exceptions.) The more that arbitrary
> code could be run at field access time, the more we undermine that. So I want
> to restrict field bridges to:

> - Bridges for an _actual field only_
> - Limited set of adaptations: cast/null-check only

> This is enough to support the wildcard case and the L->Q migration case. I would
> strongly prefer to stop there.

Also, I don't see how you can support the VarHandle API if a field is actually forwarded to methods. Or does forwarding to methods mean that you have to re-implement the whole VarHandle API (at least get_volatile, set_volatile and CAS) when you want to forward a field?

Rémi

>> #### Forwarders for fields

>> The forwarding strategy can be applied to fields as well. In this
>> case, the forwardee descriptor is that of a field descriptor, and the
>> behavior has the same semantics as adapting a target field accessor
>> method handle to the type of the bridge descriptor. (If the forwarder
>> field is static, then the field should be static too.)
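
The "adapting a target field accessor method handle" wording quoted above can be approximated today with `java.lang.invoke`. The following is only a sketch under stated assumptions: `Box`, its field `t`, and the choice of `String` as the narrower view are hypothetical names picked for illustration, and nothing here is the proposed VM mechanism. It shows one real field of erased type being read through a getter whose type has been adapted with a cast only, which is the kind of limited adaptation discussed in the thread.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class FieldForwarderSketch {
    // Hypothetical holder: there is exactly one real field, of the erased type.
    static class Box {
        Object t;
    }

    // Getter for the real field, adapted "as if" the field were declared as String.
    private static final MethodHandle NARROWED_GETTER;

    static {
        try {
            MethodHandle getter =
                MethodHandles.lookup().findGetter(Box.class, "t", Object.class);
            // asType only inserts a reference cast here: (Box)Object -> (Box)String.
            NARROWED_GETTER =
                getter.asType(MethodType.methodType(String.class, Box.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Reads Box.t through the adapted accessor.
    static String readAsString(Box box) {
        try {
            return (String) NARROWED_GETTER.invokeExact(box);
        } catch (RuntimeException | Error e) {
            throw e; // includes the ClassCastException raised by the inserted cast
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Box box = new Box();
        box.t = "hello";
        System.out.println(readAsString(box));
        box.t = 42; // not a String: the cast in the adapted getter fails
        try {
            readAsString(box);
        } catch (ClassCastException e) {
            System.out.println("cast failed, as expected");
        }
    }
}
```

The only way the adapted read can fail is the cast itself, which keeps the error model of a field access narrow; forwarding to arbitrary methods would not have that property.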
From karen.kinnear at oracle.com Thu Apr 11 21:18:54 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Thu, 11 Apr 2019 17:18:54 -0400
Subject: Updated VM-bridges document
In-Reply-To:
References: <3C1BFFE8-C200-4942-BBC0-4D95652C751E@oracle.com>
Message-ID: <9DBFB771-66F2-41C2-9D50-A55B6194DF10@oracle.com>

> On Apr 10, 2019, at 5:22 PM, Brian Goetz wrote:
>
> OK, so in the old world, D has m(Date).
>
> Now, D has m(LDT), with a forwarder from m(Date) -> m(LDT), with some sort of metadata stapled somewhere to effect the Date <--> LDT conversions.
>
>> class E extends D { m(Date); } which now overrides the forwarder.
>> We do not change class E. We do not recompile it
>
>> old class ClientD invokevirtual D.m(Date) receiver:E
>> Migration step 3: new class ClientD invokevirtual D.m(LDT) receiver:E
>> resolution: finds D.m(LDT)
>> selection: starts with E, there is no E.m(LDT) so call D.m(LDT) (LDT is LocalDateTime)
>
> OK, so at this point, the classfiles that have been loaded look like:
>
>     class D {
>         void m(LDT) { real method }
>         @Forwarding(m(LDT)) abstract void m(Date);
>     }
>
>     class E extends D {
>         @Override
>         m(Date) { impl }
>     }
>
> So D has members m(LTD) and m(Date), the latter is a forwarder. Therefore E has the same members (instance methods are inherited).

From a source perspective, E has the same names of members, although it has overridden the contents of m(Date).

>
> Here's how I would imagine this turns into in the VM:

not important, but this was m(LDT) not m(LTD)

>
>     class D {
>         void m(LTD) { real method }
>         void m(Date d) { m(adapt(d)); } // generated forwarder
>     }
>
>     class E extends D {
>         private void m$synthetic(Date d) { real method, body as present in classfile }

I would expect that the existing m(Date) with the real method would stay unchanged - including the name and the access controls - since there may be clients of subclass E still trying to invoke it.
>         void m(LTD ltd) { m$synthetic(adapt(ltd)); } // generated reverser

I think we are in agreement that there is a reverser:

    void m(LDT) generated receiver:
      1) adapt LDT -> Date
      2) invoke local m(Date)
      3) if return had changed, adapt back
    // adaptations for reverser are the inverse as for the forwarder

>     }
>
>                                resolves      selects
>     invokevirtual D::m(LTD)    D::m(LTD)     E::m(LTD)
>     invokevirtual D::m(Date)   D::m(Date)    D::m(Date), forwards to invvir D::m(LTD)
>                                              In turn, selects E::m(LTD)
>     invokevirtual E::m(LTD)    E::m(LTD)     E::m(LTD)
>     invokevirtual E::m(Date)   D::m(Date)    D::m(Date), forwards to invvir D::m(LTD)
>                                              In turn, selects E::m(LTD)
>
> In other words, we arrange that once the vtable is laid out, it is as if no one ever overrides the forwarders -- they only override the real method. Hence the reverser is needed only where a class (like E) actually overrides a descriptor that corresponds to a forwarder.

A VM perspective:

    invocation                dynamic   resolution           selection:
                              receiver  (NOT invoked)        actual execution
    invokevirtual D::m(LDT)   D         D.m(LDT)             D.m(LDT)
    invokevirtual D::m(LDT)   E         D.m(LDT)             E.m(LDT)
                                                               reverser: adapt LDT->Date
                                                               invoke local E.m(Date)
                                                               if return had changed, adapt return back
    invokevirtual D::m(Date)  D         D.m(Date)            D.m(Date)
                                                               forwarder: adapt Date->LDT
                                                               invoke local m(LDT)
                                                               if return had changed, adapt
    invokevirtual D.m(Date)   E         D.m(Date)            E.m(Date)
    invokevirtual E.m(LDT)    E         E.m(LDT) (reverser)  E.m(LDT):
                                                               reverser: adapt LDT->Date
                                                               invoke local E.m(Date)
                                                               if return had changed, adapt return back
    invokevirtual E.m(Date)   E         E.m(Date)            E.m(Date) // original - unchanged behavior

Point 1: The resolved method is NOT invoked; it is only the selected method that is invoked. We do NOT follow forwarding for the resolved method. If the resolved method happens to also be the selected method, we will now execute it and will follow the forwarding.

Note, the same applies to fields - we will not get/set the resolved field.
We will get/set the selected field, and follow the forwarding at that point.

Point 2: Hotspot's vtable implementation is set up so that for class E - a vtable (or itable) is a selection cache. It allows for fast virtual dispatch. For Hotspot, for class E, the vtable starts with the inherited vtable from superclass D. Any entries in the table are replaced when a method overrides an inherited method. Additional methods are appended.

So resolution gives you the offset in the vtable. Selection tells you which vtable owner to index based on that offset.

We KEEP the existing methods in the subclass so that they are executed exactly the same with no change in behavior (no exceptions due to narrowing etc.)

Agree that so far the only reverser need I have identified is when a class overrides a forwarder.

>
>> It is my belief that the expected behavior is that we want to invoke E.m(Date) with asType signature matching.
>> To do that, I propose that if the vm detects overriding of a forwarder, that we need to generate a reverser:
>>
>> E.m(Date) overrides D.m(Date) // forwarder: Date->LDT/invoke D.m(LDT)/return conversion
>>
>> The reverser that we want would be
>> E.m(LDT) overrides D.m(LDT) // reverser: LDT->Date/invoke E.m(Date)/return reverse conversion
>
> I think we want: a reverser for E::m(LTD), but not for E::m(Date). Are we saying the same thing?

I think so on this sentence - we already have E::m(Date) overriding forwarder D::m(Date) so we need a reverser E::m(LDT) to override the forwardee and reverse to call E::m(Date) with the reverse adaptations.
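
The forwarder and reverser pair can be written out by hand to see how dispatch plays out. This is a minimal sketch of the "as if" rewriting quoted earlier in this message (real body shunted to `m$synthetic`, generated reverser overriding the real method), not of the variant in which `E.m(Date)` is kept under its own name. The `adapt` helpers, the UTC conversion, and the `String` return values are assumptions added purely so the result is observable; the thread's methods are `void` and the actual conversion metadata is unspecified.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Date;

public class ForwarderSketch {
    // Hypothetical adaptations; the thread leaves the real Date <--> LDT conversion unspecified.
    static LocalDateTime adapt(Date d) {
        return LocalDateTime.ofInstant(d.toInstant(), ZoneOffset.UTC);
    }

    static Date adapt(LocalDateTime ldt) {
        return Date.from(ldt.toInstant(ZoneOffset.UTC));
    }

    static class D {
        // Real method after migration.
        String m(LocalDateTime ldt) {
            return "D.m(LDT)";
        }

        // Hand-written stand-in for the generated forwarder.
        // 'final' models "as if no one ever overrides the forwarders".
        final String m(Date d) {
            return m(adapt(d));
        }
    }

    static class E extends D {
        // The body that E.class declared as m(Date), shunted to a synthetic name.
        private String m$synthetic(Date d) {
            return "E.m(Date) body";
        }

        // Hand-written stand-in for the generated reverser: overrides the real method
        // and delegates to the shunted body with the inverse adaptation.
        @Override
        String m(LocalDateTime ldt) {
            return m$synthetic(adapt(ldt));
        }
    }

    public static void main(String[] args) {
        D receiver = new E();
        // Old descriptor: forwarder in D, then virtual dispatch to E's reverser.
        System.out.println(receiver.m(new Date(0L)));
        // New descriptor: E's reverser is selected directly.
        System.out.println(receiver.m(LocalDateTime.of(2019, 4, 11, 0, 0)));
        // Receiver of class D: forwarder reaches the real method.
        System.out.println(new D().m(new Date(0L)));
    }
}
```

With an `E` receiver, both call shapes end up in the body that `E` originally declared for `m(Date)`; the call through the old descriptor gets there after two adaptations.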
thanks,
Karen

>

From brian.goetz at oracle.com Thu Apr 11 23:04:15 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 11 Apr 2019 19:04:15 -0400
Subject: Updated VM-bridges document
In-Reply-To: <9DBFB771-66F2-41C2-9D50-A55B6194DF10@oracle.com>
References: <3C1BFFE8-C200-4942-BBC0-4D95652C751E@oracle.com> <9DBFB771-66F2-41C2-9D50-A55B6194DF10@oracle.com>
Message-ID:

On 4/11/2019 5:18 PM, Karen Kinnear wrote:
>>
>> OK, so at this point, the classfiles that have been loaded look like:
>>
>>     class D {
>>         void m(LDT) { real method }
>>         @Forwarding(m(LDT)) abstract void m(Date);
>>     }
>>
>>     class E extends D {
>>         @Override
>>         m(Date) { impl }
>>     }
>>
>> So D has members m(LTD) and m(Date), the latter is a forwarder.
>> Therefore E has the same members (instance methods are inherited).
> From a source perspective, E has the same names of members, although
> it has overridden the contents of m(Date).
>
>>
>> Here's how I would imagine this turns into in the VM:
> not important, but this was m(LDT) not m(LTD)
>>
>>     class D {
>>         void m(LTD) { real method }
>>         void m(Date d) { m(adapt(d)); } // generated forwarder
>>     }
>>
>>     class E extends D {
>>         private void m$synthetic(Date d) { real method, body as
>> present in classfile }
> I would expect that the existing m(Date) with the real method would
> stay unchanged - including
> the name and the access controls - since there may be clients of
> subclass E still trying to invoke it.

I think this is our point of disconnect.

The subclass has overridden a forwarder. What we want to do is "heal the rift" by rewriting the subclass as if it had _only_ overridden the real method. Hence, the "shunt it off to a synthetic" and create an overriding reverser that overrides the real method, adapting args/return, which delegates to the shuntee.
If we left m(Date) in E, then this would be overriding the forwarder, effectively un-doing the effect of forwarding.

Note that this is all "as if"; there are a hundred ways to _actually_ do it.

From forax at univ-mlv.fr Fri Apr 12 07:02:23 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 12 Apr 2019 09:02:23 +0200 (CEST)
Subject: Updated VM-bridges document
In-Reply-To:
References: <3C1BFFE8-C200-4942-BBC0-4D95652C751E@oracle.com> <9DBFB771-66F2-41C2-9D50-A55B6194DF10@oracle.com>
Message-ID: <1015395595.1489546.1555052543852.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Karen Kinnear"
> Cc: "valhalla-spec-experts"
> Sent: Friday, 12 April 2019 01:04:15
> Subject: Re: Updated VM-bridges document

> On 4/11/2019 5:18 PM, Karen Kinnear wrote:
>>>
>>> OK, so at this point, the classfiles that have been loaded look like:
>>>
>>>     class D {
>>>         void m(LDT) { real method }
>>>         @Forwarding(m(LDT)) abstract void m(Date);
>>>     }
>>>
>>>     class E extends D {
>>>         @Override
>>>         m(Date) { impl }
>>>     }
>>>
>>> So D has members m(LTD) and m(Date), the latter is a forwarder.
>>> Therefore E has the same members (instance methods are inherited).
>> From a source perspective, E has the same names of members, although
>> it has overridden the contents of m(Date).
>>
>>>
>>> Here's how I would imagine this turns into in the VM:
>> not important, but this was m(LDT) not m(LTD)
>>>
>>>     class D {
>>>         void m(LTD) { real method }
>>>         void m(Date d) { m(adapt(d)); } // generated forwarder
>>>     }
>>>
>>>     class E extends D {
>>>         private void m$synthetic(Date d) { real method, body as
>>> present in classfile }
>> I would expect that the existing m(Date) with the real method would
>> stay unchanged - including
>> the name and the access controls - since there may be clients of
>> subclass E still trying to invoke it.
>
> I think this is our point of disconnect.
>
> The subclass has overridden a forwarder. What we want to do is "heal
> the rift" by rewriting the subclass as if it had _only_ overridden the
> real method. Hence, the "shunt it off to a synthetic" and create an
> overriding reverser that overrides the real method, adapting
> args/return, which delegates to the shuntee.
>
> If we left m(Date) in E, then this would be overriding the forwarder,
> effectively un-doing the effect of forwarding.
>
> Note that this is all "as if"; there are a hundred ways to _actually_ do
> it.

Another way to see this effect is to say that it actually overrides the forwarder, but only locally, just for that class; in subclasses, the forwarder is still present.

This leads us to the next question: given that you can only override a forwarder "locally", what if a forwarder overrides a forwarder? Do you throw a LinkageError?

Rémi

From brian.goetz at oracle.com Fri Apr 12 14:33:44 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 12 Apr 2019 10:33:44 -0400
Subject: Updated VM-bridges document
In-Reply-To: <1015395595.1489546.1555052543852.JavaMail.zimbra@u-pem.fr>
References: <3C1BFFE8-C200-4942-BBC0-4D95652C751E@oracle.com> <9DBFB771-66F2-41C2-9D50-A55B6194DF10@oracle.com> <1015395595.1489546.1555052543852.JavaMail.zimbra@u-pem.fr>
Message-ID: <6f4ccab1-1511-3d37-5287-2e189bc64908@oracle.com>

> This leads us to the next question, given that you can only override "locally" a forwarder, what if a forwarder overrides a forwarder ? You throw a LinkageError ?

Yes, this could arise from inconsistent separate compilation (I thought I covered this in my doc?) Best choice is probably to let the override proceed, establishing a new forwarder in that slot. (A lot of the time when this happens, it will be forwarding to the same place anyway.) This is the same thing we do with bridges overriding bridges.

A good mental model (for my brain) here is that forwarders act a little like final methods.
When a method overrides a final method, we throw a hard error. But here, when we see a method overriding a "final-ish" method, if it is a regular method, we shunt it out of the way, and if it is a new final-ish method, we let it take over the slot.

From Daniel_Heidinga at ca.ibm.com Fri Apr 12 15:16:52 2019
From: Daniel_Heidinga at ca.ibm.com (Daniel Heidinga)
Date: Fri, 12 Apr 2019 15:16:52 +0000
Subject: RefObject and ValObject
In-Reply-To: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com>
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com>
Message-ID:

During the last EG call, I suggested there are benefits to having both RefObject and ValObject be classes rather than interfaces.

Old code should be able to work with both values and references (that's the promise of L-World after all!). New code should be able to opt into whether it wants to handle only references or values as there are APIs that may only make sense for one or the other. A good example of this is java.lang.Reference-subtypes which can't reasonably deal with values. Having RefObject in their method signatures would ensure that they're never passed a ValObject. (ie: the ctor becomes WeakReference(RefObject o) {...})

For good or ill, interfaces are not checked by the verifier. They're passed as though they are Object and the interface check is delayed until invokeinterface, etc. Using interfaces for Ref/Val Object doesn't provide verifier guarantees that the methods will never be passed the wrong type. Javac may not generate the code but the VM can't count on that being the case due to bytecode instrumentation, other compilers, etc.

Using classes does provide a strong guarantee to the VM which will help to alleviate any costs (acmp, array access) for methods that are declared in terms of RefObject and ensures that the user is getting exactly what they asked for when they declared their method to take RefObject.
It does leave some oddities as you mention:
* new Object() -> returns a new RefObject
* getSuperclass() for old code may return a new superclass (though this may be the case already when using instrumentation in the classfile load hook)
* others?

though adding interfaces does as well:
* getInterfaces() would return an interface not declared in the source
* Object would need to implement RefObject for the `new Object()` case which would mean all values implemented RefObject (yuck!)

Letting users say what they mean and have it strongly enforced by the verifier is preferable in my view, especially as the getSuperclass() issue will only apply to old code, as newly compiled code will have the correct superclass in its classfile.

--Dan

-----"valhalla-spec-experts" wrote: -----

>To: valhalla-spec-experts
>From: Brian Goetz
>Sent by: "valhalla-spec-experts"
>Date: 04/08/2019 04:00PM
>Subject: RefObject and ValObject
>
>We never reached consensus on how to surface Ref/ValObject.
>
>Here are some places we might want to use these type names:
>
> - Parameter types / variables: we might want to restrict the domain
>of a parameter or variable to only hold a reference, or a value:
>
>    void m(RefObject ro) { ... }
>
> - Type bounds: we might want to restrict the instantiation of a
>generic class to only hold a reference (say, because we're going to
>lock on it):
>
>    class Foo<T extends RefObject> { ... }
>
> - Dynamic tests: if locking on a value is to throw, there must be a
>reasonable idiom that users can use to detect lockability without
>just trying to lock:
>
>    if (x instanceof RefObject) {
>        synchronized(x) { ... }
>    }
>
> - Ref- or Val-specific methods. This one is more vague, but it's
>conceivable we may want methods on ValObject that are members of all
>values.
>
>
>There have been three ways proposed (so far) that we might reflect these
>as top types:
>
> - RefObject and ValObject are (somewhat special) classes. We spell
>(at least in the class file) "value class" as "class X extends
>ValObject".
We implicitly rewrite reference classes at runtime that
>extend Object to extend RefObject instead. This has obvious
>pedagogical value, but there are some (small) risks of anomalies.
>
> - RefObject and ValObject are interfaces. We ensure that no class
>can implement both. (Open question whether an interface could extend
>one or the other, acting as an implicit constraint that it only be
>implemented by value classes or reference classes.). Harder to do
>things like put final implementations of wait/notify in ValObject,
>though maybe this isn't of as much value as it would have been if
>we'd done this 25 years ago.
>
> - Split the difference; ValObject is a class, RefObject is an
>interface. Sounds weird at first, but acknowledges that we're
>grafting this on to refs after the fact, and eliminates most of the
>obvious anomalies.
>
>No matter which way we go, we end up with an odd anomaly: "new
>Object()" should yield an instance of RefObject, but we don't want
>Object <: RefObject for obvious reasons. It's possible that "new
>Object()" could result in an instance of a _species_ of Object that
>implement RefObject... but our theory of species doesn't quite go there
>and it seems a little silly to add new requirements just for this.
>
>
>
>

From brian.goetz at oracle.com Fri Apr 12 15:44:41 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 12 Apr 2019 11:44:41 -0400
Subject: Updated VM-bridges document
In-Reply-To: <9DBFB771-66F2-41C2-9D50-A55B6194DF10@oracle.com>
References: <3C1BFFE8-C200-4942-BBC0-4D95652C751E@oracle.com> <9DBFB771-66F2-41C2-9D50-A55B6194DF10@oracle.com>
Message-ID:

> A VM perspective:
>
>     invocation                dynamic   resolution           selection:
>                               receiver  (NOT invoked)        actual execution
>     invokevirtual D::m(LDT)   D         D.m(LDT)             D.m(LDT)
>     invokevirtual D::m(LDT)   E         D.m(LDT)             E.m(LDT)
>                                                                reverser: adapt LDT->Date
>                                                                invoke local E.m(Date)
>                                                                if return had changed, adapt return back
>     invokevirtual D::m(Date)  D         D.m(Date)            D.m(Date)
>                                                                forwarder: adapt Date->LDT
>                                                                invoke local m(LDT)
>                                                                if return had changed, adapt
>     invokevirtual D.m(Date)   E         D.m(Date)            E.m(Date)
>     invokevirtual E.m(LDT)    E         E.m(LDT) (reverser)  E.m(LDT):
>                                                                reverser: adapt LDT->Date
>                                                                invoke local E.m(Date)
>                                                                if return had changed, adapt return back
>     invokevirtual E.m(Date)   E         E.m(Date)            E.m(Date) // original - unchanged behavior
>

Let me try from the other direction, using the JVMS terminology rather than appealing to as-if. Where I think we're saying slightly different things is in the interpretation of the lines I colored in blue above (hope the formatting came through.) You are talking about the E.m(Date) that appears in E.class (good so far). But I'm talking about the _members_ of E. And the E.m(Date) that appears in E.class should _not_ be considered a (new-to-E) member of E. Instead, that E.m(Date) gives rise to a synthetic member E.m(LDT). I have colored two cases in red because I think this is where our assumptions really parted ways; will come back to this at the bottom.

Here's why I'm harping on this distinction; we mark methods as "forwarders" and do something special when we see something override a forwarder. Taking the same hierarchy:

    // before
    class D {
        void m(Date) { }
    }

    class E extends D {
        void m(Date) { }
    }

    // middle -- D migrates, but E not yet
    class D {
        void m(LDT) { }
        @Forwarding( m(LDT) ) void m(Date);
    }

    class E extends D {
        void m(Date) { }
    }

    // after -- E finally gets the memo
    class D {
        void m(LDT) { }
        @Forwarding( m(LDT) ) void m(Date);
    }

    class E extends D {
        void m(LDT) { }
    }

Now, let's draw inheritance diagrams (these are not vtables, they are member tables).
I'll use your notation, where I think D.m(X) means "the Code attribute declared in D for m(X)".

Before

            m(Date)
    D       D.m(Date)
    E       E.m(Date)

This part is easy; D has m(Date), and E overrides it.

Middle

            m(LDT)                             m(Date)
    D       D.m(LDT)                           forwarder -> m(LDT)
    E       reverser adapted from E.m(Date)    inherits forwarder

Now, both D and E have both m(Date) and m(LDT). D has a real method for m(LDT), and a forwarder for m(Date). E has an m(Date), which we see overrides a forwarder. So we adapt it to be an m(LDT), but we consider E to have inherited the forwarder from D. I'll come back to this in a minute.

After

            m(LDT)       m(Date)
    D       D.m(LDT)     forwarder -> m(LDT)
    E       E.m(LDT)     inherits forwarder

In this nirvana, there is a forwarder still, but it doesn't affect E, because E has already gotten the memo. It sits around purely in the case that someone calls m(Date).

OK, so why am I saying that membership has to be tilted this way? Let's go back to the middle case, and add

    class F extends E {
        void m(Date) { } // still didn't get the memo
    }

Middle

            m(LDT)                             m(Date)
    D       D.m(LDT)                           forwarder -> m(LDT)
    E       reverser adapted from E.m(Date)    inherits forwarder
    F       reverser adapted from F.m(Date)    inherits forwarder

When we go to compute members, I want to see that _F.m(Date) overrides a forwarder too_. If we merely put E.m(Date) in the (E, m(Date)) box, then it looks like F is overriding an ordinary member, and no reverser is generated. (Or, we have to keep walking up the chain to see if E.m(Date) in turn overrides a forwarder -- yuck, plus, that makes forwarder-overrides-forwarder even messier.)

Now, back to your table.
The above interpretation of what is going on comes to the same answer for all of the rows of your table, except these:

>
>     invocation                dynamic   resolution       selection:
>                               receiver  (NOT invoked)    actual execution
>     invokevirtual D.m(Date)   E         D.m(Date)        E.m(Date)
>     invokevirtual E.m(Date)   E         E.m(Date)        E.m(Date) // original - unchanged behavior
>

You are thinking "E has a perfectly good m(Date), let's just select that". Makes sense, but the cost of that is that it complicates calculation of membership and overriding. I think I am content to let invocations of m(Date) on receivers of type E go through both rounds of adaptation: forward the call (with adaptation) to m(LDT), which, in the case of E, does the reverse adaptations and ends up at the original Code attribute of E.m(Date). This sounds ugly (and we'd need to justify some potential failures) but leads us to a simpler interpretation of migration.

In your model, we basically have to split the box in two:

After

            m(LDT)       m(Date)
    D       D.m(LDT)     forwarder -> m(LDT)
    E       E.m(LDT)     E.m(Date), but also is viewed as a forwarder by subclasses

I think it's a good goal, but I was trying to eliminate that complexity by accepting the round-trip adaptation -- which goes away when E gets the memo.

From brian.goetz at oracle.com Fri Apr 12 15:51:37 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 12 Apr 2019 11:51:37 -0400
Subject: RefObject and ValObject
In-Reply-To:
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com>
Message-ID: <4b4e643a-447e-8089-1382-7d45f2aa1428@oracle.com>

High-order tradeoffs:

 - Having R/VObject be classes helps from the pedagogical perspective (it paints an accurate map of the object model.)
 - There were some anomalies raised that were the result of rewriting an Object supertype to RefObject, and some concerns about "all our tables got one level deeper." I don't really have a strong opinion on these.
 - Using interfaces is less intrusive, but less powerful.
 - None of the approaches give an obvious solution for the "make me a lock Object" problem.

I think the useful new observation in this line of discussion is this:

 - The premise of L-World is that legacy Object-consuming code can keep working with values.
 - We think that's a good thing.
 - But .... we also think there will be some cases where that's not a good thing, and that code will wish it had said `m(RefObject)` instead of `m(Object)`. [ this is the new thing ]

Combining this with the migration stuff going on in a separate thread, I think what you're saying is you want to be able to take a method:

    m(Object o) { }

and _migrate_ it to be

    m(RefObject o) { }

with a forwarder

    @ForwardTo( m(RefObject) )
    m(Object o);

So that code could, eventually, be migrated to RefObject-consuming code, and all is good again. And the JIT can see that o is a RefObject and credibly fall back to a legacy interpretation of ACMP and locking.

On 4/12/2019 11:16 AM, Daniel Heidinga wrote:
> During the last EG call, I suggested there are benefits to having both RefObject and ValObject be classes rather than interfaces.
>
> Old code should be able work with both values and references (that's the promise of L-World after all!). New code should be able to opt into whether it wants to handle only references or values as there are APIs that may only make sense for one or the other. A good example of this is java.lang.Reference-subtypes which can't reasonably deal with values. Having RefObject in their method signatures would ensure that they're never passed a ValObject. (ie: the ctor becomes WeakReference(RefObject o) {...})
>
> For good or ill, interfaces are not checked by the verifier. They're passed as though they are object and the interface check is delayed until invokeinterface, etc. Using interfaces for Ref/Val Object doesn't provide verifier guarantees that the methods will never be passed the wrong type.
>>

From karen.kinnear at oracle.com Fri Apr 12 22:21:37 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Fri, 12 Apr 2019 18:21:37 -0400
Subject: Updated VM-bridges document
In-Reply-To:
References: <3C1BFFE8-C200-4942-BBC0-4D95652C751E@oracle.com> <9DBFB771-66F2-41C2-9D50-A55B6194DF10@oracle.com>
Message-ID: <7805E2E4-041E-4886-8123-AA2070741FBF@oracle.com>

I need to work through many more examples offline.

I appreciate your trying to make overriding of forwarders simpler for the JVM.

I would like to continue to explore the option of having the JVM do the calculation of overriding both direct and indirect forwarders until we've worked more examples.

If we can find a way to do it, it helps with backward compatibility
 - old clients with old receivers don't go through adaptors
   - so they skip adaptations that could either throw an exception or potentially lose data through narrowing.
 - same issue for reflection - Class.getDeclaredMethods(), which just returns local methods
   - would be nice if we could not lose existing method names here

I am also exploring invoke local with explicit local name of method
 - so we can try to reduce loops
 - I believe there will be steps at which we will need to identify loops and throw exceptions or not create a reverser.

That said, it is getting more complex, so glad you are exploring alternatives.
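
The "lose data through narrowing" concern can be made concrete with the Date/LDT pair used throughout this thread. The sketch below is an illustration only: the thread does not say how the Date <--> LDT adaptation would be defined, so `toDate`, `toLdt`, and the choice of UTC are assumptions. It relies on the fact that `java.util.Date` has millisecond precision while `LocalDateTime` carries nanoseconds, so a call that goes through a reverser and back can come out different from what went in.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Date;

public class NarrowingSketch {
    // Hypothetical reverser-side adaptation: LDT -> Date (drops sub-millisecond digits).
    static Date toDate(LocalDateTime ldt) {
        return Date.from(ldt.toInstant(ZoneOffset.UTC));
    }

    // Hypothetical forwarder-side adaptation: Date -> LDT.
    static LocalDateTime toLdt(Date date) {
        return LocalDateTime.ofInstant(date.toInstant(), ZoneOffset.UTC);
    }

    // What a value experiences when it passes through both adaptations.
    static LocalDateTime roundTrip(LocalDateTime ldt) {
        return toLdt(toDate(ldt));
    }

    public static void main(String[] args) {
        LocalDateTime original = LocalDateTime.of(2019, 4, 12, 18, 21, 37, 123_456_789);
        LocalDateTime adapted = roundTrip(original);
        System.out.println("original nanos: " + original.getNano());
        System.out.println("adapted nanos:  " + adapted.getNano());
        System.out.println("equal: " + original.equals(adapted));
    }
}
```

Old clients that call an unmigrated receiver directly never pass through `roundTrip`, which is the compatibility benefit being weighed here against the simpler membership model.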
Link below spells out a bit more the rule I am exploring for creating reversers, with both the example below and another example (Example II) which has three migration steps:

F <: E <: D, all start with m(Date, Time)
  step 1: D.m(Date, Time) -> D.m(LDT, Time)
  step 2: E.m(Date, Time) -> E.m(Date, LDT)
  step 3: D.m(LDT, Time) -> D.m(LDT, LDT)

http://cr.openjdk.java.net/~acorn/Forwarders.pdf

thanks,
Karen

> On Apr 12, 2019, at 11:44 AM, Brian Goetz wrote:
>
>> A VM perspective:
>>
>>     invocation                dynamic   resolution           selection:
>>                               receiver  (NOT invoked)        actual execution
>>     invokevirtual D::m(LDT)   D         D.m(LDT)             D.m(LDT)
>>     invokevirtual D::m(LDT)   E         D.m(LDT)             E.m(LDT)
>>                                                                reverser: adapt LDT->Date
>>                                                                invoke local E.m(Date)
>>                                                                if return had changed, adapt return back
>>     invokevirtual D::m(Date)  D         D.m(Date)            D.m(Date)
>>                                                                forwarder: adapt Date->LDT
>>                                                                invoke local m(LDT)
>>                                                                if return had changed, adapt
>>     invokevirtual D.m(Date)   E         D.m(Date)            E.m(Date)
>>     invokevirtual E.m(LDT)    E         E.m(LDT) (reverser)  E.m(LDT):
>>                                                                reverser: adapt LDT->Date
>>                                                                invoke local E.m(Date)
>>                                                                if return had changed, adapt return back
>>     invokevirtual E.m(Date)   E         E.m(Date)            E.m(Date) // original - unchanged behavior
>>
>
> Let me try from the other direction, using the JVMS terminology rather than appealing to as-if. Where I think we're saying slightly different things is in the interpretation of the lines I colored in blue above (hope the formatting came through.) You are talking about the E.m(Date) that appears in E.class (good so far). But I'm talking about the _members_ of E. And the E.m(Date) that appears in E.class should _not_ be considered a (new-to-E) member of E. Instead, that E.m(Date) gives rise to a synthetic member E.m(LDT). I have colored two cases in red because I think this is where our assumptions really parted ways; will come back to this at the bottom.
>
> Here's why I'm harping on this distinction; we mark methods as "forwarders" and do something special when we see something override a forwarder. Taking the same hierarchy:
>
>     // before
>     class D {
>         void m(Date) { }
>     }
>
>     class E extends D {
>         void m(Date) { }
>     }
>
>     // middle -- D migrates, but E not yet
>     class D {
>         void m(LDT) { }
>         @Forwarding( m(LDT) ) void m(Date);
>     }
>
>     class E extends D {
>         void m(Date) { }
>     }
>
>     // after -- E finally gets the memo
>     class D {
>         void m(LDT) { }
>         @Forwarding( m(LDT) ) void m(Date);
>     }
>
>     class E extends D {
>         void m(LDT) { }
>     }
>
> Now, let's draw inheritance diagrams (these are not vtables, they are member tables). I'll use your notation, where I think D.m(X) means "the Code attribute declared in D for m(X)".
>
> Before
>
>          m(Date)
>     D    D.m(Date)
>     E    E.m(Date)
>
> This part is easy; D has m(Date), and E overrides it.
>
> Middle
>
>          m(LDT)                             m(Date)
>     D    D.m(LDT)                           forwarder -> m(LDT)
>     E    reverser adapted from E.m(Date)    inherits forwarder
>
> Now, both D and E have both m(Date) and m(LDT). D has a real method for m(LDT), and a forwarder for m(Date). E has an m(Date), which we see overrides a forwarder. So we adapt it to be an m(LDT), but we consider E to have inherited the forwarder from D. I'll come back to this in a minute.
>
> After
>
>          m(LDT)      m(Date)
>     D    D.m(LDT)    forwarder -> m(LDT)
>     E    E.m(LDT)    inherits forwarder
>
> In this nirvana, there is a forwarder still, but it doesn't affect E, because E has already gotten the memo. It sits around purely in the case that someone calls m(Date).
>
> OK, so why am I saying that membership has to be tilted this way?
> Let's go back to the middle case, and add
>
>     class F extends E {
>         void m(Date) { } // still didn't get the memo
>     }
>
> Middle
>
>          m(LDT)                             m(Date)
>     D    D.m(LDT)                           forwarder -> m(LDT)
>     E    reverser adapted from E.m(Date)    inherits forwarder
>     F    reverser adapted from F.m(Date)    inherits forwarder
>
> When we go to compute members, I want to see that _F.m(Date) overrides a forwarder too_. If we merely put E.m(Date) in the (E, m(Date)) box, then it looks like F is overriding an ordinary member, and no reverser is generated. (Or, we have to keep walking up the chain to see if E.m(Date) in turn overrides a forwarder -- yuck, plus, that makes forwarder-overrides-forwarder even messier.)
>
> Now, back to your table. The above interpretation of what is going on comes to the same answer for all of the rows of your table, except these:
>
>> invocation                 dynamic    resolution      selection:
>>                            receiver   (NOT invoked)   actual execution
>> -------------------------  ---------  --------------  ------------------------------------------
>> invokevirtual D.m(Date)    E          D.m(Date)       E.m(Date)
>> invokevirtual E.m(Date)    E          E.m(Date)       E.m(Date) // original - unchanged behavior
>>
>
> You are thinking "E has a perfectly good m(Date), let's just select that". Makes sense, but the cost of that is that it complicates calculation of membership and overriding. I think I am content to let invocations of m(Date) on receivers of type E go through both rounds of adaptation: forward the call (with adaptation) to m(LDT), which, in the case of E, does the reverse adaptations and ends up at the original Code attribute of E.m(Date). This sounds ugly (and we'd need to justify some potential failures) but leads us to a simpler interpretation of migration.
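
[Illustrative sketch: the round-trip adaptation described above can be imitated in plain Java. This is a hand-written approximation under stated assumptions, not the VM mechanism: the forwarder and reverser are written out as ordinary methods, E's unmigrated body is renamed to the hypothetical `originalM` so that javac does not treat it as an override of the forwarder, LDT is taken to be `java.time.LocalDateTime`, and the UTC conversion is an arbitrary choice of adaptation.]

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// "Middle" state: D has migrated to m(LDT); E has not gotten the memo yet.
class D {
    final List<String> trace = new ArrayList<>();

    // The real, migrated member.
    void m(LocalDateTime t) {
        trace.add("D.m(LDT)");
    }

    // Stand-in for the forwarder: adapt Date -> LDT, then select m(LDT) virtually.
    final void m(Date d) {
        trace.add("forwarder Date->LDT");
        m(LocalDateTime.ofInstant(d.toInstant(), ZoneOffset.UTC));
    }
}

class E extends D {
    // E's original m(Date) body, renamed (hypothetical name) for this simulation.
    void originalM(Date d) {
        trace.add("E.m(Date) original code");
    }

    // Stand-in for the reverser adapted from E.m(Date): adapt LDT -> Date, then run the old body.
    @Override
    void m(LocalDateTime t) {
        trace.add("reverser LDT->Date");
        originalM(Date.from(t.toInstant(ZoneOffset.UTC)));
    }
}

class ForwarderDemo {
    // Invokes m(Date) on the receiver and returns the steps that ran.
    static List<String> callWithDate(D receiver) {
        receiver.trace.clear();
        receiver.m(new Date(0L));
        return receiver.trace;
    }

    public static void main(String[] args) {
        System.out.println(callWithDate(new D()));
        System.out.println(callWithDate(new E()));
    }
}
```

[With a receiver of type E, the m(Date) call passes through the forwarder, then the reverser, and only then reaches E's original code, which is the double adaptation the message accepts in exchange for a simpler membership calculation.]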
>
> In your model, we basically have to split the box in two:
>
> After
>
>          m(LDT)      m(Date)
>     D    D.m(LDT)    forwarder -> m(LDT)
>     E    E.m(LDT)    E.m(Date), but also is viewed as a forwarder by subclasses
>
> I think it's a good goal, but I was trying to eliminate that complexity by accepting the round-trip adaptation -- which goes away when E gets the memo.
>

From maurizio.cimadamore at oracle.com  Mon Apr 15 11:02:18 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 15 Apr 2019 12:02:18 +0100
Subject: RefObject and ValObject
In-Reply-To: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com>
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com>
Message-ID: <8003c917-528d-5444-26a7-dc32f180078b@oracle.com>

The Gordian knot here is that whenever you see Object as part of a *type* you'd like to interpret it as a top type (the root of the hierarchy). OTOH, when you see Object as part of an *expression* you'd like to interpret it as something else (e.g. RefObject). And there are also things in between - e.g. Object.class, where it's not entirely clear whether you want one or the other, but let's not get too distracted by these for the moment.

From a type-system perspective, the basic property called _preservation_ mandates (Gavin, correct me if I'm wrong :-)) that, given an expression of *static type* E, the type associated with the value V obtained by executing the expression must be a subtype of E, which we can write:

typeof(V) <: E

(of course this is only true for type systems which feature subtyping, otherwise typeof(V) == E).

So, in our case, the theorem demands something like this:

typeof(new Object()) <: Object

But it seems like we have already ruled this out - since, if typeof(new Object()) is 'RefObject', you don't want RefObject <: Object.

So, from a type-system perspective, we're on unsound territory, at least assuming we only use classes w/ single inheritance.
Interfaces (or, more generally, multiple inheritance) add a bit of flexibility because (as Brian said) we could say:

typeof(new Object()) = XYZ, where XYZ <: RefObject && XYZ <: Object

So this would satisfy the type theory; whether it can be made into something that looks compelling for a Java user, that's another story.

Note that the dual nature of *type* vs. *expression* mentioned at the beginning will bite you as soon as you start doing things like this:

void m(RefObject ro) { ... }

m(new Object()) // ok, as per above
m((Object)new Object()) // not ok?
m(Object.class.newInstance()) // WAT!?

Maurizio

On 08/04/2019 20:58, Brian Goetz wrote:
> No matter which way we go, we end up with an odd anomaly: "new Object()" should yield an instance of RefObject, but we don't want Object <: RefObject for obvious reasons. It's possible that "new Object()" could result in an instance of a _species_ of Object that implements RefObject, but our theory of species doesn't quite go there and it seems a little silly to add new requirements just for this.

From brian.goetz at oracle.com  Mon Apr 15 12:06:25 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 15 Apr 2019 08:06:25 -0400
Subject: RefObject and ValObject
In-Reply-To: <8003c917-528d-5444-26a7-dc32f180078b@oracle.com>
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com>
Message-ID: <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com>

> But it seems like we have already ruled this out - since, if
> typeof(new Object()) is 'RefObject', you don't want RefObject <: Object.

I think you misunderstood the "want" here (though it still may not be possible.)

The desired model is:
 - Object is the top type. Everything is an Object.
 - Some objects have identity (RefObject), others do not (ValObject). But they are all Object.

This means we want {Ref,Val}Object <: Object. (Whether they are interface or class or something else.)
One of the main reasons for wanting this setup is that it reflects the desired reality: everything is an object, but some are special objects (those with identity.) The addition of value types is a big perturbation to the type system; reflecting it this way makes the object hierarchy reflect the reality and the desired intuition, and makes the distinction between ref/val slightly less magic.

(There are other reasons too; for example, wouldn't it be nice if ValObject.{wait,notify,notifyAll} were _ordinary final methods_ that threw in ValObject? Again, slightly less magic.)

From maurizio.cimadamore at oracle.com  Mon Apr 15 12:20:35 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 15 Apr 2019 13:20:35 +0100
Subject: RefObject and ValObject
In-Reply-To: <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com>
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com>
Message-ID: <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com>

On 15/04/2019 13:06, Brian Goetz wrote:
>> But it seems like we have already ruled this out - since, if
>> typeof(new Object()) is 'RefObject', you don't want RefObject <: Object.
>
> I think you misunderstood the "want" here (though it still may not be
> possible.)
>
> The desired model is:
>  - Object is the top type. Everything is an Object.
>  - Some objects have identity (RefObject), others do not (ValObject).
>    But they are all Object.
>
> This means we want {Ref,Val}Object <: Object. (Whether they are
> interface or class or something else.)

Seems like I've read this requirement:

"but we don't want Object <: RefObject for obvious reasons"

backwards. So, what this means is that it would be type-sound regardless of interface vs. class choice. But the other concerns remain, e.g.
as to the fact that the boundary between reinterpreted types (Object as RefObject) and non-reinterpreted types (Object as top type) seems very fuzzy.

Maurizio

From forax at univ-mlv.fr  Mon Apr 15 13:15:29 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 15 Apr 2019 15:15:29 +0200 (CEST)
Subject: RefObject and ValObject
In-Reply-To: <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com>
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com>
Message-ID: <415011631.238521.1555334129830.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Maurizio Cimadamore"
> To: "Brian Goetz", "valhalla-spec-experts"
> Sent: Monday, 15 April 2019 14:20:35
> Subject: Re: RefObject and ValObject

> On 15/04/2019 13:06, Brian Goetz wrote:
>>> But it seems like we have already ruled this out - since, if
>>> typeof(new Object()) is 'RefObject', you don't want RefObject <: Object.
>>
>> I think you misunderstood the "want" here (though it still may not be
>> possible.)
>>
>> The desired model is:
>>  - Object is the top type. Everything is an Object.
>>  - Some objects have identity (RefObject), others do not (ValObject).
>>    But they are all Object.
>>
>> This means we want {Ref,Val}Object <: Object.
>> (Whether they are interface or class or something else.)
>
> Seems like I've read this requirement:
>
> "but we don't want Object <: RefObject for obvious reasons"
>
> backwards. So, what this means is that it would be type-sound regardless
> of interface vs. class choice. But the other concerns remain, e.g. as to
> the fact that the boundary between reinterpreted types (Object as
> RefObject) and non-reinterpreted types (Object as top type) seems very
> fuzzy.

Yes, divorcing the runtime class from the type is something we can do, but just because we can do it doesn't mean we should. As a teacher, you usually don't want to talk about the difference between a class and a type until you have reached the subtyping chapter.

The initial goal is to make the concepts of ref type and value type easier to grasp by providing a simple hierarchy, but new Object() cannot be retconned to a RefObject easily, so it's not that simple.

And Ruby tried to do something similar by introducing BasicObject in 1.9 (for another reason, because scopes are linked to classes in Ruby); in the end few people care. So we have to be careful not to introduce something that will be a compatibility hurdle for, in the end, no benefit.

So we need to be able to have types representing any Object, ref Object and value Object, but I think that RefObject and ValObject are not the only solution for that.

> Maurizio

Rémi

>> One of the main reasons for wanting this setup is that it reflects the
>> desired reality: everything is an object, but some are special objects
>> (those with identity.) The addition of value types is a big
>> perturbation to the type system; reflecting it this way makes the
>> object hierarchy reflect the reality and the desired intuition, and
>> makes the distinction between ref/val slightly less magic.
>>
>> (There are other reasons too; for example, wouldn't it be nice if
>> ValObject.{wait,notify,notifyAll} were _ordinary final methods_ that
>> threw in ValObject?
>> Again, slightly less magic.)

From brian.goetz at oracle.com  Mon Apr 15 13:26:44 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 15 Apr 2019 09:26:44 -0400
Subject: RefObject and ValObject
In-Reply-To: <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com>
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com>
Message-ID: <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com>

> But the other concerns remain, e.g. as to the fact that the boundary between reinterpreted types (Object as RefObject) and non-reinterpreted types (Object as top type) seems very fuzzy.

Right, which is why we're still searching for an answer :)

We really, really want to be able to represent ref/val-ness in the type system. Why? Ignoring pedagogical concerns (which are significant):

 - If certain operations (e.g., locking) are partial, we want to be able to provide a way to ask if the operation could succeed. Such as:

     if (x instanceof RefObject) { ... lock on x ... }

   Saying "lock, and hope it doesn't throw" is not a very good answer. We already have a tool for querying the dynamic type of an object -- instanceof.

 - Saying that a method should only accept reference objects should be something expressible in the method signature, as in

     m(RefObject o) { ... }

   Types are how we do that.

 - Similarly, we might want to express the above constraint generically; again, types are the way we do that:

     class Foo<T extends RefObject> { }

And, Q-world already taught us that we wanted to retain Object as the top type. This means, necessarily, that Object gets a little weirder; it takes on some partly-class, partly-interface behavior.

Here's an idea: What if we migrated `Object` to be an abstract class? The casualty would be the code that says `new Object()`.
While there's certainly a lot of it out there, perhaps this is something amenable to migration:

 - At the source level, for N versions, `new Object()` gets a warning that says "I'll pretend you said `Object.newLockInstance()`" or something.
 - At the bytecode level, for M versions, we do something similar, likely for M > N.

We can start this now, before Valhalla even previews.

From maurizio.cimadamore at oracle.com  Mon Apr 15 14:30:24 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 15 Apr 2019 15:30:24 +0100
Subject: RefObject and ValObject
In-Reply-To: <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com>
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com>
Message-ID:

My suggestion here is to leave subtyping on the side, at least for now, and use some other way to describe what's going on.

Let's model the value vs. reference constraint explicitly - e.g. with 'val' and 'ref' type kinds (and let's not open the can of worms as to what the syntax should be, whether it should be a type anno, etc.)

So:

val Object ---> accepts all values
ref Object ---> accepts all references
any Object ---> accepts all values and references

Now that everything is explicit, for types we have two possible moves:

1) reinterpret `Object` as `ref Object`

This will keep semantics as is - that is, upon recompilation, if source code doesn't change, a program that expected references cannot start receiving value parameters. If code wants to work on both references and values, it will have to opt in (by using `any`).

2) reinterpret `Object` as `any Object`

That is, the semantics of `Object` is being redefined here - code which assumed it would work with references might need to opt in to additional constraints (e.g. add `ref`) in order to make sure it still works as intended.
I think we are leaning towards (2) - that is, we want the meaning of Object to be upgraded, and we want users that are not happy with that to opt out in some form.

Ok, now let's think about expressions; given an expression E, I have to figure out (i) its type and (ii) its kind, since the type system I'm describing here takes both factors into account.

Here I'm expecting rules of the kind:

a) if E is an identifier pointing to a variable decl, then type and kind are derived from the declaration
b) if E is a method call, where the declared method is M, type and kind are derived from M's return type declaration
...
z) if E is a new expression, of the kind `new T()`, the type is T and the kind can be either `ref` or `val` depending on whether T is a reference class or not. If T can be both, then kind `any` is inferred.

So, we can use (z) e.g. to say that `new String()` has kind `ref`. But, if we want Object to be the top type for both values and references, I believe one consequence is that `new Object` is interpreted as `any`, which means you cannot pass it to `ref Object`.

I don't see another way out of this conundrum - other than adding a special rule (z2) which says that `new Object()` is treated specially and always has kind `ref`. But doing so will run afoul in almost every possible way - as soon as you manipulate the result of the `new` in any way (cast, assignment to variable of type `Object`, ...) you go back to `any` and you are back to a place that is incompatible with `ref Object`.

Your idea of treating Object as abstract is, I believe, a sound one (which doesn't need any extra rule) - but we might have to figure out some story for anonymous inner classes of the kind `new Object() { ... }`.

Maurizio

On 15/04/2019 14:26, Brian Goetz wrote:
>> But the other concerns remain, e.g. as to the fact that the boundary
>> between reinterpreted types (Object as RefObject) and
>> non-reinterpreted types (Object as top type) seems very fuzzy.
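
[Illustrative sketch: the kind rules in the message above can be written down as a toy checker, assuming option (2), where plain `Object` means `any Object`. The names `Kind`, `ClassSort` and `KindRules` are hypothetical and exist only for this example; nothing here is proposed syntax or a real compiler API.]

```java
// Kinds from the message: val, ref, any.
enum Kind { REF, VAL, ANY }

// Whether the class named in `new T()` is a reference class, a value class,
// or could be either (the situation for Object as a top type).
enum ClassSort { REFERENCE_CLASS, VALUE_CLASS, EITHER }

final class KindRules {
    private KindRules() { }

    // A slot of kind `any` accepts everything; `ref` and `val` slots accept only their own kind.
    static boolean accepts(Kind slot, Kind value) {
        return slot == Kind.ANY || slot == value;
    }

    // Rule (z): the kind of a `new T()` expression, derived from what T is.
    static Kind kindOfNew(ClassSort sort) {
        switch (sort) {
            case REFERENCE_CLASS: return Kind.REF;
            case VALUE_CLASS:     return Kind.VAL;
            default:              return Kind.ANY;
        }
    }

    public static void main(String[] args) {
        Kind newString = kindOfNew(ClassSort.REFERENCE_CLASS);
        Kind newObject = kindOfNew(ClassSort.EITHER);
        System.out.println("new String() -> ref Object: " + accepts(Kind.REF, newString));
        System.out.println("new Object() -> ref Object: " + accepts(Kind.REF, newObject));
        System.out.println("new Object() -> any Object: " + accepts(Kind.ANY, newObject));
    }
}
```

[Without a special rule (z2), `new Object()` gets kind `any` and is rejected by a `ref Object` slot, which is the conundrum the message describes.]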
From brian.goetz at oracle.com  Mon Apr 15 14:38:58 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 15 Apr 2019 10:38:58 -0400
Subject: RefObject and ValObject
In-Reply-To:
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com>
Message-ID:

> Let's model the value vs. reference constraint explicitly - e.g. with 'val' and 'ref' type kinds (and let's not open the can of worms as to what the syntax should be, whether it should be a type anno, etc.)
>
> So:
>
> val Object ---> accepts all values
> ref Object ---> accepts all references
> any Object ---> accepts all values and references

We explored this sort of thing in Q world. One place where this really was painful was what it did to generic tvar constraints; we had an entire new algebra of constraints:

    void m(T t) { ... }

which was ad hoc and composed terribly. This hell is _exactly_ the thing that pushed us to Ref/ValObject as _types_ in the first place. (More of the same: what is "ref Object".class, and how does it differ from "any Object".class?).

> 2) reinterpret `Object` as `any Object`
>
> That is, the semantics of `Object` is being redefined here - code which assumed to work with references might need to opt-in to additional constraints (e.g. add `ref`) in order to make sure it still work as intended.

Right. Q-world tried it the other way, and we were in utter migration hell. There are migration cases in this direction too, but we are convinced they are orders of magnitude fewer.

> I don't see another way out of this conundrum - other than adding a special rule (z2) which says that `new Object()` is treated specially and always has kind `ref`.
> But doing so will run afoul in almost every possible way - as soon as you manipulate the result of the `new` in any way (cast, assignment to variable of type `Object`, ...) you go back to `any` and you are back to a place that is incompatible with `ref Object`.

Yes, this is the cost. I have to think that given a choice between some weirdness around "new Object", and dramatic, awful new kinds of types that complicate type uses, type descriptors, reflection, etc, etc, etc, that making the former work is going to be less painful, both for us and for users.

> Your idea of treating Object as abstract is, I believe, a sound one (which doesn't need any extra rule) - but we might have to figure out some story for anonymous inner classes of the kind `new Object() { ... }`.

Right. And, again, this can be treated as a migration issue, and we can start warning users to migrate their source now.

From maurizio.cimadamore at oracle.com  Mon Apr 15 14:46:56 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 15 Apr 2019 15:46:56 +0100
Subject: RefObject and ValObject
In-Reply-To:
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com>
Message-ID: <819181bc-cbf7-f831-5fca-5ad16c0d709a@oracle.com>

On 15/04/2019 15:38, Brian Goetz wrote:
>> Let's model the value vs. reference constraint explicitly - e.g. with
>> 'val' and 'ref' type kinds (and let's not open the can of worms as to
>> what the syntax should be, whether it should be a type anno, etc.)
>>
>> So:
>>
>> val Object ---> accepts all values
>> ref Object ---> accepts all references
>> any Object ---> accepts all values and references
>
> We explored this sort of thing in Q world.
> One place where this
> really was painful was what it did to generic tvar constraints; we had
> an entire new algebra of constraints:
>
>     void m(T t) { ... }
>
> which was ad hoc and composed terribly. This hell is _exactly_ the
> thing that pushed us to Ref/ValObject as _types_ in the first place.
> (More of the same: what is "ref Object".class, and how does it differ
> from "any Object".class?).

This is not a language proposal (as stated in the email). Just a way to be clearer about semantics, w/o appealing to subclasses and type hierarchies (which are, at this point, IMHO confusing).

>> 2) reinterpret `Object` as `any Object`
>>
>> That is, the semantics of `Object` is being redefined here - code
>> which assumed to work with references might need to opt-in to
>> additional constraints (e.g. add `ref`) in order to make sure it
>> still work as intended.
>
> Right. Q-world tried it the other way, and we were in utter migration
> hell. There are migration cases in this direction too, but we are
> convinced they are orders of magnitude fewer.
>
>> I don't see another way out of this conundrum - other than adding a
>> special rule (z2) which says that `new Object()` is treated specially
>> and always has kind `ref`. But doing so will run afoul in almost
>> every possible way - as soon as you manipulate the result of the
>> `new` in any way (cast, assignment to variable of type `Object`, ...)
>> you go back to `any` and you are back to a place that is incompatible
>> with `ref Object`.
>
> Yes, this is the cost. I have to think that given a choice between
> some weirdness around "new Object", and dramatic, awful new kinds of
> types that complicate type uses, type descriptors, reflection, etc,
> etc, etc, that making the former work is going to be less painful,
> both for us and for users.

Well, not quite. The choice is between treating 'new Object' specially or not. If it's not treated specially there is no new scary type.
It's just that `new Object` might not have the properties one might hope for (but again, those same properties would be lost as soon as you touch the result of the expression).

I think you took my type modifiers too literally (as if I were proposing to add them, which I'm not) :-)

Maurizio

From brian.goetz at oracle.com  Mon Apr 15 14:57:42 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 15 Apr 2019 10:57:42 -0400
Subject: RefObject and ValObject
In-Reply-To: <819181bc-cbf7-f831-5fca-5ad16c0d709a@oracle.com>
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <819181bc-cbf7-f831-5fca-5ad16c0d709a@oracle.com>
Message-ID:

> Well, not quite. The choice is between treating 'new Object' specially or not. If it's not treated specially there is no new scary type. It's just that `new Object` might not have the properties one might hope for (but again, those same properties would be lost as soon as you touch the result of the expression).

OK, but let's not lose sight of the fact that "new Object()" is not so much a language _feature_, as much as an accidental convention that people have settled on for accomplishing a very specific thing. In a way, it's like the so-called-but-not-really-a-feature double-brace idiom; it's an accidental consequence of how the language works, that people have come to use because it's convenient.
I would much rather spend our complexity budget migrating uses of this idiom away and get a simpler type system, than the other way around.

From forax at univ-mlv.fr  Mon Apr 15 15:03:57 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 15 Apr 2019 17:03:57 +0200 (CEST)
Subject: RefObject and ValObject
In-Reply-To:
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com>
Message-ID: <237852347.288358.1555340637176.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Maurizio Cimadamore"
> Cc: "valhalla-spec-experts"
> Sent: Monday, 15 April 2019 16:38:58
> Subject: Re: RefObject and ValObject

>> Let's model the value vs. reference constraint explicitly - e.g. with 'val' and
>> 'ref' type kinds (and let's not open the can of worms as to what the syntax should
>> be, whether it should be a type anno, etc.)

>> So:

>> val Object ---> accepts all values
>> ref Object ---> accepts all references
>> any Object ---> accepts all values and references

> We explored this sort of thing in Q world. One place where this really was
> painful was what it did to generic tvar constraints; we had an entire new
> algebra of constraints:

> void m(T t) { ... }

Brian,
val, ref and any only apply to Object.

If you prefer, in terms of notation, let's rename ref Object to Object?, val Object to Object! and any Object to Object*:

 - you can apply ? on any value type, Object included,
 - you can apply ! and * only on Object.

Object? is erased to Ljava/lang/Object; but the generic signature is Ljava/lang/Object/* (so it's Object for the VM and Object! for the compiler),
Object! is erased to Qjava/lang/Object;
Object* is erased to Ljava/lang/Object;

As a bound of a type variable, Object is equivalent to Object? by backward compatibility,
and
void m(T t) { ... }
means that T is not nullable, so it's a value type.
And at runtime, instead of
(o instanceof Object!)
one will write
o.getClass().isValue()

Rémi
From brian.goetz at oracle.com  Mon Apr 15 15:23:36 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 15 Apr 2019 11:23:36 -0400
Subject: RefObject and ValObject
In-Reply-To:
References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com>
Message-ID:

> Your idea of treating Object as abstract is, I believe, a sound one (which doesn't need any extra rule) - but we might have to figure out some story for anonymous inner classes of the kind `new Object() { ... }`.

After thinking about it for all of five minutes, I think this may have broken the logjam (or, at least, the current logjam). We've been asking ourselves whether RO/VO are classes or interfaces, when we didn't really consider abstract classes. Which we didn't consider because we had assumed that the concrete-ness of Object was nailed down. Let's assume it's not. Then we have:

    abstract class Object { }
    abstract class RefObject <: Object { }
    abstract class ValObject <: Object { }

Existing classes that extend Object are silently reparented to RefObject, both at compile time and runtime. This may have some small .getSuperclass() anomalies but this seems pretty minor. Same with anon classes of Object. Inline classes implicitly extend ValObject.

We add a method `Object::newInstance` (name to be bikeshod later). We start warning in the compiler on `new Object`, to motivate migration to `Object::newInstance`. Runtime rewrites these too.

There are some minor behavioral compatibility issues here, but they seem pretty minor, and in the end, we end up with a hierarchy that describes the way we want users to see the type system.
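
[Illustrative sketch: the hierarchy proposed above can be approximated in today's Java, assuming a stand-in root named `TopObject` because `java.lang.Object` itself cannot be redeclared. `TopObject`, `Account`, `Point` and `HierarchyDemo` are hypothetical names; `Point` only imitates an inline class and still has identity on a current JVM.]

```java
// Stand-in for the proposed abstract Object.
abstract class TopObject {
    // Plays the role of the proposed Object::newInstance: a fresh identity object, e.g. for use as a lock.
    static RefObject newInstance() {
        return new RefObject() { };
    }
}

abstract class RefObject extends TopObject { }

abstract class ValObject extends TopObject { }

// An existing identity class, "reparented" to RefObject.
final class Account extends RefObject { }

// Imitation of an inline class, which would implicitly extend ValObject.
final class Point extends ValObject {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class HierarchyDemo {
    private HierarchyDemo() { }

    // Partial operations such as locking are guarded by an ordinary instanceof test.
    static String describe(TopObject o) {
        if (o instanceof RefObject) {
            synchronized (o) {
                return "ref: has identity, safe to lock";
            }
        }
        return "val: no identity, do not lock";
    }

    // A signature that admits reference objects only; passing a Point would not compile.
    static void onlyRefs(RefObject o) { }

    public static void main(String[] args) {
        System.out.println(describe(TopObject.newInstance()));
        System.out.println(describe(new Account()));
        System.out.println(describe(new Point(1, 2)));
        onlyRefs(new Account());
    }
}
```

[The `instanceof RefObject` test and the `onlyRefs(RefObject)` signature correspond to the first two motivations listed earlier in the thread for expressing ref/val-ness as types.]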
From brian.goetz at oracle.com Mon Apr 15 15:27:26 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 15 Apr 2019 11:27:26 -0400 Subject: RefObject and ValObject In-Reply-To: <237852347.288358.1555340637176.JavaMail.zimbra@u-pem.fr> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <237852347.288358.1555340637176.JavaMail.zimbra@u-pem.fr> Message-ID: <8A65C066-A577-405E-A752-2F5ADC59329B@oracle.com> Please, no. > On Apr 15, 2019, at 11:03 AM, Remi Forax wrote: > > > > From: "Brian Goetz" > To: "Maurizio Cimadamore" > Cc: "valhalla-spec-experts" > Sent: Monday, April 15, 2019 16:38:58 > Subject: Re: RefObject and ValObject > Let's model the value vs. reference constraint explicitly - e.g. with 'val' and 'ref' type kinds (and let's not open the can of worms as to what the syntax should be, whether it should be a type anno, etc.) > > So: > > val Object ---> accepts all values > ref Object ---> accepts all references > any Object ---> accepts all references > > > We explored this sort of thing in Q world. One place where this really was painful was what it did to generic tvar constraints; we had an entire new algebra of constraints: > > void m(T t) { ... } > > Brian, > val, ref and any only applies to Object. > > if you prefer in terms of notation, let's rename ref Object to Object?, val Object to Object! and any Object to Object*, > > you can apply ? on any value type, Object included, > you can apply ! and * only on Object. > > Object? is erased to Ljava/lang/Object; but the generic signature is Ljava/lang/Object/* (so it's Object for the VM and Object! for the compiler), > Object! is erased to Qjava/lang/Object; > Object* is erased to Ljava/lang/Object; > > As a bound of a type variable, Object is equivalent to Object? by backward compatibility.
> and > void m(T t) { ... } > means that T is not nullable, so it's a value type. > > And at runtime, instead of > (o instanceof Object!) > one will write > o.getClass().isValue() > > Rémi > > > which was ad hoc and composed terribly. This hell is _exactly_ the thing that pushed us to Ref/ValObject as _types_ in the first place. (More of the same: what is "ref Object".class, and how does it differ from "any Object".class?). > > 2) reinterpret `Object` as `any Object` > > That is, the semantics of `Object` is being redefined here - code which assumed to work with references might need to opt-in to additional constraints (e.g. add `ref`) in order to make sure it still work as intended. > > Right. Q-world tried it the other way, and we were in utter migration hell. There are migration cases in this direction too, but we are convinced they are orders of magnitude fewer. > I don't see another way out of this conundrum - other than adding a special rule (z2) which says that `new Object()` is treated specially and always has kind `ref`. But doing so will run afoul in almost every possible way - as soon as you manipulate the result of the `new` in any way (cast, assignment to variable of type `Object`, ...) you go back to `any` and you are back to a place that is incompatible with `ref Object`. > > Yes, this is the cost. I have to think that given a choice between some weirdness around "new Object", and dramatic, awful new kinds of types that complicate type uses, type descriptors, reflection, etc, etc, etc, that making the former work is going to be less painful, both for us and for users. > > Your idea of treating Object as abstract is, I believe, a sound one (which doesn't need any extra rule) - but we might have to figure out some story for anonymous inner classes of the kind `new Object() { ... }`. > > > Right. And, again, this can be treated as a migration issue, and we can start warning users to migrate their source now.
From maurizio.cimadamore at oracle.com Mon Apr 15 15:29:32 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Mon, 15 Apr 2019 16:29:32 +0100 Subject: RefObject and ValObject In-Reply-To: <237852347.288358.1555340637176.JavaMail.zimbra@u-pem.fr> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <237852347.288358.1555340637176.JavaMail.zimbra@u-pem.fr> Message-ID: Again, I was not proposing we went down there (I like RefObject, ValObject). But... using subclassing to have a semantics discussion has issues on its own (e.g. should we use interfaces or classes), which I think are distracting at this point. Hence my use of an alternate notation in an attempt to make problems pop out more clearly. But, since the notation was already used in past proposals, it seems like I opened a can of worms :-). To me the important questions are: 1) TYPES - utterances of `Object` in a type position; what do they mean? Do they mean VALUE or REFERENCE, or TOP-TYPE? 2) EXPRESSIONS - utterances of `Object` inside expressions such as `new Object()`, what do they mean, again VALUE, REFERENCE or TOP-TYPE? For (1), I think we are all in agreement that utterances of Object-as-a-type means TOP-TYPE. For (2) the proposal I saw earlier said something like, we'd like for `new Object` to mean REFERENCE. I think that is a siren song, and this issue has been addressed by Brian's proposal to deal with `new Object` as a migration problem. Rather than suggesting to use some static factory instead (which works for plain instance creation but not for inner class creation) I think perhaps the user code should be migrated to use `new RefObject` and `new RefObject() { }` instead (at least if people want the reference semantics). But these are minor details.
The more important fact is that, even if we make Object abstract, you could still make new Object() { ... } So there's still the issue of how that expression is interpreted. I think the answer has gotta be, again, TOP-TYPE (as for question (1)). So: RefObject ro = new Object() { ... } //error Maurizio On 15/04/2019 16:03, Remi Forax wrote: > > > ------------------------------------------------------------------------ > > *From: *"Brian Goetz" > *To: *"Maurizio Cimadamore" > *Cc: *"valhalla-spec-experts" > *Sent: *Monday, April 15, 2019 16:38:58 > *Subject: *Re: RefObject and ValObject > > Let's model the value vs. reference constraint explicitly - > e.g. with 'val' and 'ref' type kinds (and let's not open the > can of worms as to what the syntax should be, whether it should be > a type anno, etc.) > > So: > > val Object ---> accepts all values > ref Object ---> accepts all references > any Object ---> accepts all references > > > We explored this sort of thing in Q world. One place where this > really was painful was what it did to generic tvar constraints; we > had an entire new algebra of constraints: > > void m(T t) { ... } > > > Brian, > val, ref and any only applies to Object. > > if you prefer in terms of notation, let's rename ref Object to Object?, > val Object to Object! and any Object to Object*, > > you can apply ? on any value type, Object included, > you can apply ! and * only on Object. > > Object? is erased to Ljava/lang/Object; but the generic signature is > Ljava/lang/Object/* (so it's Object for the VM and Object! for the > compiler), > Object! is erased to Qjava/lang/Object; > Object* is erased to Ljava/lang/Object; > > As a bound of a type variable, Object is equivalent to Object? by > backward compatibility. > and > void m(T t) { ... } > means that T is not nullable, so it's a value type. > > And at runtime, instead of > (o instanceof Object!) > one will write > o.getClass().isValue() > > Rémi > > > which was ad hoc and composed terribly.
This hell is _exactly_ > the thing that pushed us to Ref/ValObject as _types_ in the first > place. (More of the same: what is "ref Object".class, and how > does it differ from "any Object".class?). > > 2) reinterpret `Object` as `any Object` > > That is, the semantics of `Object` is being redefined here - > code which assumed to work with references might need to > opt-in to additional constraints (e.g. add `ref`) in order to > make sure it still work as intended. > > Right. Q-world tried it the other way, and we were in utter > migration hell. There are migration cases in this direction too, > but we are convinced they are orders of magnitude fewer. > > I don't see another way out of this conundrum - other than > adding a special rule (z2) which says that `new Object()` is > treated specially and always has kind `ref`. But doing so will > run afoul in almost every possible way - as soon as you > manipulate the result of the `new` in any way (cast, > assignment to variable of type `Object`, ...) you go back to > `any` and you are back to a place that is incompatible with > `ref Object`. > > Yes, this is the cost. I have to think that given a choice > between some weirdness around "new Object", and dramatic, awful > new kinds of types that complicate type uses, type descriptors, > reflection, etc, etc, etc, that making the former work is going to > be less painful, both for us and for users. > > Your idea of treating Object as abstract is, I believe, a > sound one (which doesn't need any extra rule) - but we might > have to figure out some story for anonymous inner classes of > the kind `new Object() { ... }`. > > > Right. And, again, this can be treated as a migration issue, and > we can start warning users to migrate their source now.
> > > > > From brian.goetz at oracle.com Mon Apr 15 15:35:53 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 15 Apr 2019 11:35:53 -0400 Subject: RefObject and ValObject In-Reply-To: References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <237852347.288358.1555340637176.JavaMail.zimbra@u-pem.fr> Message-ID: > For (2) the proposal I saw earlier said something like, we'd like for `new Object` to mean REFERENCE. I think that is a siren song, and this issue has been addressed by Brian's proposal to deal with `new Object` as a migration problem. Rather than suggesting to use some static factory instead (which works for plain instance creation but not for inner class creation) I think perhaps the user code should be migrated to use `new RefObject` and `new RefObject() { }` instead (at least if people want the reference semantics). But these are minor details > What I like about treating this as a migration problem is, that despite the inconvenience, the resulting code is actually _a lot more clear_. The first time I saw "new Object()" (that was a long time ago), I remember thinking "What the heck is the point of that?", until I realized that Objects had a secret object identity. Whereas "new IdentityObject()" is more clear that you are creating an instance _for the precise purpose of using its identity_. So, while there is some migration pain, the resulting language is actually more clear. I like that.
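A minimal illustration of the point about identity, in today's Java: the most common reason to write `new Object()` is to obtain a stateless instance whose identity serves as a private monitor. The class and method names below (`Counter`, `IdentityLockSketch`, `countWithThreads`) are made up for this sketch. Under the migration discussed in the thread, the allocation would presumably name an identity-bearing class such as `RefObject` instead; that spelling is the thread's proposal, not an existing API.

```java
class Counter {
    // This instance exists only so that its identity can act as a monitor.
    // It has no fields and no behavior of its own.
    private final Object lock = new Object();
    private int count;

    void increment() {
        synchronized (lock) { count++; }
    }

    int get() {
        synchronized (lock) { return count; }
    }
}

class IdentityLockSketch {
    // Two fresh instances never share an identity, even though they
    // carry no state that could tell them apart.
    static boolean distinctIdentities() {
        Object a = new Object();
        Object b = new Object();
        return a != b && !a.equals(b);
    }

    // Starts `threads` threads that each increment a shared counter
    // `perThread` times, then returns the final count.
    static int countWithThreads(int threads, int perThread) {
        Counter c = new Counter();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) c.increment();
            });
            ts[i].start();
        }
        try {
            for (Thread t : ts) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return c.get();
    }

    public static void main(String[] args) {
        System.out.println(distinctIdentities());
        System.out.println(countWithThreads(4, 1000));
    }
}
```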
From forax at univ-mlv.fr Mon Apr 15 17:59:19 2019 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 15 Apr 2019 19:59:19 +0200 (CEST) Subject: RefObject and ValObject In-Reply-To: References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> Message-ID: <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> > From: "Brian Goetz" > To: "Maurizio Cimadamore" > Cc: "valhalla-spec-experts" > Sent: Monday, April 15, 2019 17:23:36 > Subject: Re: RefObject and ValObject >> Your idea of treating Object as abstract is, I believe, a sound one (which >> doesn't need any extra rule) - but we might have to figure out some story for >> anonymous inner classes of the kind `new Object() { ... }`. > After thinking about it for all of five minutes, I think this may have broken > the logjam (or, at least the current logjam). We've been asking ourselves > whether RO/VO are classes or interfaces, when we didn't really consider > abstract classes. Which we didn't consider because we had assumed that the > concrete-ness of Object was nailed down. Let's assume it's not. > Then we have: > abstract class Object { } > abstract class RefObject <: Object { } > abstract class ValObject <: Object { } > Existing classes that extend Object are silently reparented to RefObject, both > at compile time and runtime. This may have some small .getSuperclass() > anomalies but this seems pretty minor. Same with anon classes of Object. Inline > classes implicitly extend ValObject. > We add a method `Object::newInstance` (name to be bikeshod later). We start > warning in the compiler on `new Object`, to motivate migration to > `Object::newInstance`. Runtime rewrites these too.
> There are some minor behavioral compatibility issues here, but they seem pretty > minor, and in the end, we end up with a hierarchy that describes the way we > want users to see the type system. It's not a minor change, and all code that uses a type parameter that has Object as bound will become ambiguous. Rémi From brian.goetz at oracle.com Mon Apr 15 18:00:52 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 15 Apr 2019 14:00:52 -0400 Subject: RefObject and ValObject In-Reply-To: <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> Message-ID: <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> > > It's not a minor change, and all code that uses a type parameter that has Object as bound will become ambiguous. I don't think so. You can't say new T() when T is bounded at Object (or anything, for that matter). What ambiguity are you afraid of here? From john.r.rose at oracle.com Mon Apr 15 18:13:50 2019 From: john.r.rose at oracle.com (John Rose) Date: Mon, 15 Apr 2019 11:13:50 -0700 Subject: RefObject and ValObject In-Reply-To: <237852347.288358.1555340637176.JavaMail.zimbra@u-pem.fr> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <237852347.288358.1555340637176.JavaMail.zimbra@u-pem.fr> Message-ID: <434F920A-0BE8-4CCD-8926-96E3DA275790@oracle.com> Avoid the Bikeshed! Make sure that temporary or provisional syntaxes look __Different from permanent ones.
On Apr 15, 2019, at 8:03 AM, Remi Forax wrote: > > val, ref and any only applies to Object. > > if you prefer in terms of notation, let's rename ref Object to Object?, val Object to Object! and any Object to Object*, > > you can apply ? on any value type, Object included, > you can apply ! and * only on Object. From john.r.rose at oracle.com Mon Apr 15 18:20:07 2019 From: john.r.rose at oracle.com (John Rose) Date: Mon, 15 Apr 2019 11:20:07 -0700 Subject: RefObject and ValObject In-Reply-To: References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> Message-ID: I think this hangs together very well. For legacy bytecode (only), the JVM has to be willing to do these one-time fudges: - rewrite new java/lang/Object to new java/lang/RefObject - rewrite invokespecial java/lang/Object.<init>()V to invokespecial java/lang/RefObject.<init>()V The verifier can observe these rewrites, but it may also take this additional step, for legacy code only: - widen the type of new java/lang/RefObject to plain Object (...which would prevent the verifier from passing the new RefObject to a method that actually takes RefObject. Not sure that step is useful; it's a "one hand clapping" type of move, which in practice won't be observable.) - John > On Apr 15, 2019, at 8:23 AM, Brian Goetz wrote: > >> >> Your idea of treating Object as abstract is, I believe, a sound one (which doesn't need any extra rule) - but we might have to figure out some story for anonymous inner classes of the kind `new Object() { ... }`. > > After thinking about it for all of five minutes, I think this may have broken the logjam (or, at least the current logjam). We've been asking ourselves whether RO/VO are classes or interfaces, when we didn't really consider abstract classes.
Which we didn't consider because we had assumed that the concrete-ness of Object was nailed down. Let's assume it's not. > > Then we have: > > abstract class Object { } > abstract class RefObject <: Object { } > abstract class ValObject <: Object { } > > Existing classes that extend Object are silently reparented to RefObject, both at compile time and runtime. This may have some small .getSuperclass() anomalies but this seems pretty minor. Same with anon classes of Object. Inline classes implicitly extend ValObject. > > We add a method `Object::newInstance` (name to be bikeshod later). We start warning in the compiler on `new Object`, to motivate migration to `Object::newInstance`. Runtime rewrites these too. > > There are some minor behavioral compatibility issues here, but they seem pretty minor, and in the end, we end up with a hierarchy that describes the way we want users to see the type system. > > From dl at cs.oswego.edu Mon Apr 15 18:52:59 2019 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 15 Apr 2019 14:52:59 -0400 Subject: RefObject and ValObject In-Reply-To: References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> Message-ID: On 4/15/19 11:23 AM, Brian Goetz wrote: > Then we have: > > abstract class Object { } > abstract class RefObject <: Object { } > abstract class ValObject <: Object { } > I also think it is plausible. As one part of migration story, maybe the public no-arg constructor could be explicitly listed as @Deprecated, adding a non-deprecated "protected" one.
-Doug From maurizio.cimadamore at oracle.com Mon Apr 15 20:23:25 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Mon, 15 Apr 2019 21:23:25 +0100 Subject: RefObject and ValObject In-Reply-To: <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> Message-ID: <96b10f46-cd2d-cdcd-eeb4-658ec9344f94@oracle.com> Maybe (Remi correct me if I'm wrong), the problem Remi was referring to is that we also have existing generic declarations like which, in the new world, will mean either VALUE or REFERENCE. I think this is a consequence of the choice (1) I described in my email - e.g. reinterpret Object in type position as TOP_TYPE. Maurizio On 15/04/2019 19:00, Brian Goetz wrote: >> >> It's not a minor change, and all code that uses a type parameter that >> has Object as bound will become ambiguous. > > I don't think so. You can't say > > new T() > > when T is bounded at Object (or anything, for that matter). > > What ambiguity are you afraid of here?
> > From brian.goetz at oracle.com Mon Apr 15 20:25:59 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 15 Apr 2019 16:25:59 -0400 Subject: RefObject and ValObject In-Reply-To: <96b10f46-cd2d-cdcd-eeb4-658ec9344f94@oracle.com> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <8003c917-528d-5444-26a7-dc32f180078b@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> <96b10f46-cd2d-cdcd-eeb4-658ec9344f94@oracle.com> Message-ID: <05C5E53F-6DFF-4908-8B19-8E13D82743DB@oracle.com> That's right, as we think that will be the best default. Obviously in some cases users will want to re-bound to . (For existing code; this comes with compatibility concerns, which we can handle in various ways. The standard trick to change a bound without changing erasure is to bound at Object&X; this works when X is an interface but currently would be unhappy if X is a class. But this could be adjusted.) > On Apr 15, 2019, at 4:23 PM, Maurizio Cimadamore wrote: > > Maybe (Remi correct me if I'm wrong), the problem Remi was referring to is that we also have existing generic declarations like which, in the new world, will mean either VALUE or REFERENCE. I think this is a consequence of the choice (1) I described in my email - e.g. reinterpret Object in type position as TOP_TYPE. > > Maurizio > > On 15/04/2019 19:00, Brian Goetz wrote: >>> >>> It's not a minor change, and all code that uses a type parameter that has Object as bound will become ambiguous. >> >> I don't think so. You can't say >> >> new T() >> >> when T is bounded at Object (or anything, for that matter). >> >> What ambiguity are you afraid of here?
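The "bound at Object&X" trick mentioned in the message above can be observed in today's Java through reflection, since a type variable erases to its leftmost bound. The sketch below covers only the case where X is an interface (`Comparable`); as the message notes, a class in that position is not currently accepted. The names `BoundErasureSketch`, `maxPlain`, `maxCompat` and `erasedReturnType` are invented for this example.

```java
import java.lang.reflect.Method;
import java.util.Collection;
import java.util.List;

class BoundErasureSketch {
    // Bound is Comparable alone, so T erases to Comparable.
    static <T extends Comparable<? super T>> T maxPlain(Collection<? extends T> c) {
        T best = null;
        for (T t : c) {
            if (best == null || t.compareTo(best) > 0) best = t;
        }
        return best;
    }

    // Same constraint with Object listed first, so T erases to Object
    // and the method keeps an Object-returning descriptor.
    static <T extends Object & Comparable<? super T>> T maxCompat(Collection<? extends T> c) {
        T best = null;
        for (T t : c) {
            if (best == null || t.compareTo(best) > 0) best = t;
        }
        return best;
    }

    // Looks up the erased return type of one of the methods above.
    static Class<?> erasedReturnType(String name) {
        for (Method m : BoundErasureSketch.class.getDeclaredMethods()) {
            if (m.getName().equals(name)) return m.getReturnType();
        }
        throw new IllegalArgumentException(name);
    }

    public static void main(String[] args) {
        System.out.println(erasedReturnType("maxPlain"));
        System.out.println(erasedReturnType("maxCompat"));
        System.out.println(maxCompat(List.of(3, 7, 5)));
    }
}
```

Both methods accept the same arguments at the source level; only the erased descriptor differs, which is what makes the trick useful for binary compatibility.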
>> >> From forax at univ-mlv.fr Mon Apr 15 20:36:14 2019 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Mon, 15 Apr 2019 22:36:14 +0200 (CEST) Subject: RefObject and ValObject In-Reply-To: <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> Message-ID: <2140481242.341796.1555360574234.JavaMail.zimbra@u-pem.fr> > From: "Brian Goetz" > To: "Remi Forax" > Cc: "Maurizio Cimadamore" , > "valhalla-spec-experts" > Sent: Monday, April 15, 2019 20:00:52 > Subject: Re: RefObject and ValObject >> It's not a minor change, and all code that uses a type parameter that has >> Object as bound will become ambiguous. > I don't think so. You can't say > new T() > when T is bounded at Object (or anything, for that matter). > What ambiguity are you afraid of here? 1) any code that has inference var list = List.of(new Foo(), new Bar()); will be inferred as List<RefObject> instead of List<Object>, so calling a method that takes a List<Object> will not compile anymore. 2) any code that consider Object as special class may stop working, dynamic proxies that have a special case for Object, any code that reflect recursively on the hierarchy to find all the methods, any code that remove getClass() from the getter list, etc.
Rémi From forax at univ-mlv.fr Mon Apr 15 20:37:37 2019 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Mon, 15 Apr 2019 22:37:37 +0200 (CEST) Subject: RefObject and ValObject In-Reply-To: <434F920A-0BE8-4CCD-8926-96E3DA275790@oracle.com> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <237852347.288358.1555340637176.JavaMail.zimbra@u-pem.fr> <434F920A-0BE8-4CCD-8926-96E3DA275790@oracle.com> Message-ID: <1990823844.342060.1555360657461.JavaMail.zimbra@u-pem.fr> It's not bikeshedding, hence the "if you prefer ...", it's trying to explain that it's a typing issue and not a runtime class issue. Rémi ----- Original Message ----- > From: "John Rose" > To: "Remi Forax" , "Maurizio Cimadamore" > Cc: "Brian Goetz" , "valhalla-spec-experts" > Sent: Monday, April 15, 2019 20:13:50 > Subject: Re: RefObject and ValObject > Avoid the Bikeshed! Make sure that temporary or provisional > syntaxes look __Different from permanent ones. > > On Apr 15, 2019, at 8:03 AM, Remi Forax wrote: >> >> val, ref and any only applies to Object. >> >> if you prefer in terms of notation, let's rename ref Object to Object?, val >> Object to Object! and any Object to Object*, >> >> you can apply ? on any value type, Object included, > > you can apply ! and * only on Object.
From brian.goetz at oracle.com Mon Apr 15 20:39:43 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 15 Apr 2019 16:39:43 -0400 Subject: RefObject and ValObject In-Reply-To: <2140481242.341796.1555360574234.JavaMail.zimbra@u-pem.fr> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> <2140481242.341796.1555360574234.JavaMail.zimbra@u-pem.fr> Message-ID: <442B67D8-67B7-41EB-9EF1-247AFE281F48@oracle.com> > > 1) any code that has inference > var list = List.of(new Foo(), new Bar()); > will be inferred as List<RefObject> instead of List<Object>, so calling a method that takes a List<Object> will not compile anymore. Not to dismiss the "what about inference" issue, but any code that combines `var` with `List.of()` should not be surprised at the result of inference. But, "what about inference" noted. > 2) any code that consider Object as special class may stop working, dynamic proxies that have a special case for Object, any code that reflect recursively on the hierarchy to find all the methods, any code that remove getClass() from the getter list, etc. Absorbing value types is going to require some changes on the part of most frameworks to get best results. This one seems in that category? Again, not to dismiss, keep them coming, but so far this isn't scaring me away from having the object model that makes most sense.
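The inference concern can be reproduced in today's Java with a stand-in. In the sketch below, `Base` is a hypothetical class playing the role of RefObject as a common superclass sitting between `Foo`/`Bar` and `Object`; `InferenceSketch`, `takesObjects` and `takesBases` are likewise invented names. It shows how the presence or absence of a target type changes the element type that `List.of` is given, under current inference rules.

```java
import java.util.List;

class Base { }                 // hypothetical stand-in for RefObject
class Foo extends Base { }
class Bar extends Base { }

class InferenceSketch {
    static int takesObjects(List<Object> list) { return list.size(); }
    static int takesBases(List<Base> list) { return list.size(); }

    static int demo() {
        // No target type: the element type comes from the arguments
        // alone, so this is a List<Base>.
        var inferred = List.of(new Foo(), new Bar());
        int a = takesBases(inferred);
        // takesObjects(inferred);  // would not compile: List<Base> is not a List<Object>

        // Explicit target type: the element type is Object.
        List<Object> targeted = List.of(new Foo(), new Bar());
        int b = takesObjects(targeted);

        // As a method argument, the call is a poly expression and takes
        // the parameter type as its target.
        int c = takesObjects(List.of(new Foo(), new Bar()));

        return a + b + c;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Only the `var` declaration exposes the more specific inferred type; the two forms with a target type keep compiling when a new class appears between the arguments and `Object`.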
From forax at univ-mlv.fr Mon Apr 15 20:43:38 2019 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Mon, 15 Apr 2019 22:43:38 +0200 (CEST) Subject: RefObject and ValObject In-Reply-To: <96b10f46-cd2d-cdcd-eeb4-658ec9344f94@oracle.com> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> <96b10f46-cd2d-cdcd-eeb4-658ec9344f94@oracle.com> Message-ID: <78272184.343791.1555361018506.JavaMail.zimbra@u-pem.fr> > From: "Maurizio Cimadamore" > To: "Brian Goetz" , "Remi Forax" > Cc: "valhalla-spec-experts" > Sent: Monday, April 15, 2019 22:23:25 > Subject: Re: RefObject and ValObject > Maybe (Remi correct me if I'm wrong), the problem Remi was referring to is that > we also have existing generic declarations like which, in > the new world, will mean either VALUE or REFERENCE. I think this is a > consequence of the choice (1) I described in my email - e.g. reinterpret Object > in type position as TOP_TYPE. > Maurizio yes ! all generics will suddenly accept value types. Rémi > On 15/04/2019 19:00, Brian Goetz wrote: >>> It's not a minor change, and all code that uses a type parameter that has >>> Object as bound will become ambiguous. >> I don't think so. You can't say >> new T() >> when T is bounded at Object (or anything, for that matter). >> What ambiguity are you afraid of here?
From brian.goetz at oracle.com Mon Apr 15 20:46:08 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 15 Apr 2019 16:46:08 -0400 Subject: RefObject and ValObject In-Reply-To: <78272184.343791.1555361018506.JavaMail.zimbra@u-pem.fr> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> <96b10f46-cd2d-cdcd-eeb4-658ec9344f94@oracle.com> <78272184.343791.1555361018506.JavaMail.zimbra@u-pem.fr> Message-ID: <315CC43C-CF5E-477C-865D-CBE9D1852415@oracle.com> > yes ! > all generics will suddenly accept value types. > Yes, this is by design. If you can't have an ArrayList of Point, that would be terrible. Of course, until we have specialization (later in the story), these will be erased, and restricted to the nullable projection. From forax at univ-mlv.fr Mon Apr 15 20:54:55 2019 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Mon, 15 Apr 2019 22:54:55 +0200 (CEST) Subject: RefObject and ValObject In-Reply-To: <442B67D8-67B7-41EB-9EF1-247AFE281F48@oracle.com> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> <2140481242.341796.1555360574234.JavaMail.zimbra@u-pem.fr> <442B67D8-67B7-41EB-9EF1-247AFE281F48@oracle.com> Message-ID: <1500947471.344881.1555361695806.JavaMail.zimbra@u-pem.fr> > From: "Brian Goetz" > To: "Remi Forax" > Cc: "Maurizio Cimadamore" , > "valhalla-spec-experts" > Sent: Monday, April 15, 2019 22:39:43 > Subject: Re: RefObject and ValObject >> 1) any code that has inference >> var list = List.of(new Foo(), new Bar()); >> will be inferred as List<RefObject> instead of List<Object>, so calling a method >> that takes a List<Object> will not compile anymore.
> Not to dismiss the "what about inference" issue, but any code that combines > `var` with `List.of()` should not be surprised at the result of inference. > But, "what about inference" noted. you don't need var, var represent the inference value of an expression, instead of having two lines, you can always write m(List<Object> list) { ... } ... m(List.of(new Foo(), new Bar())); >> 2) any code that consider Object as special class may stop working, dynamic >> proxies that have a special case for Object, any code that reflect recursively >> on the hierarchy to find all the methods, any code that remove getClass() from >> the getter list, etc. > Absorbing value types is going to require some changes on the part of most > frameworks to get best results. This one seems in that category? > Again, not to dismiss, keep them coming, but so far this isn't scaring me away > from having the object model that makes most sense. but these compatibility issues have nothing to do with value types per se, it's because of the introduction of RefObject, and currently all other compatibility issues can be solved at use site (i believe), here you are adding compatibility issue that can only be solved at declaration site, that's a big change because now as a user you have to wait until all your dependencies have been updated before using value type.
Rémi From maurizio.cimadamore at oracle.com Mon Apr 15 20:53:49 2019 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Mon, 15 Apr 2019 21:53:49 +0100 Subject: RefObject and ValObject In-Reply-To: <442B67D8-67B7-41EB-9EF1-247AFE281F48@oracle.com> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <86be3b78-67ef-89a9-3475-bf68dd6308b1@oracle.com> <43809f9d-7790-2d08-95f5-b3ef67751872@oracle.com> <140E9BF1-805E-45F6-8C4A-B328B70FA25A@oracle.com> <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> <2140481242.341796.1555360574234.JavaMail.zimbra@u-pem.fr> <442B67D8-67B7-41EB-9EF1-247AFE281F48@oracle.com> Message-ID: I think we should be ok inference-wise, except of course for edge cases (like vars) which exposes the guts of the inference engine. In normal cases like: List<Object> list = List.of(new Foo(), new Bar()) the equality constraint on the LHS should force a solution which overrides the new 'RefObject' bound that comes in from the RHS. (that is, you have T = Object, then T :> Foo and T :> Bar, so T = Object 'wins', even if less precise) Maurizio On 15/04/2019 21:39, Brian Goetz wrote: >> >> 1) any code that has inference >> var list = List.of(new Foo(), new Bar()); >> will be inferred as List<RefObject> instead of List<Object>, so >> calling a method that takes a List<Object> will not compile anymore. > > Not to dismiss the "what about inference" issue, but any code that > combines `var` with `List.of()` should not be surprised at the result > of inference. > > But, "what about inference" noted. > >> 2) any code that consider Object as special class may stop working, >> dynamic proxies that have a special case for Object, any code that >> reflect recursively on the hierarchy to find all the methods, any >> code that remove getClass() from the getter list, etc. > > Absorbing value types is going to require some changes on the part of > most frameworks to get best results.
?This one seems in that category? > > Again, not to dismiss, keep them coming, but so far this isn?t scaring > me away from having the object model that makes most sense. > > From forax at univ-mlv.fr Mon Apr 15 21:01:26 2019 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Mon, 15 Apr 2019 23:01:26 +0200 (CEST) Subject: RefObject and ValObject In-Reply-To: References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> <2140481242.341796.1555360574234.JavaMail.zimbra@u-pem.fr> <442B67D8-67B7-41EB-9EF1-247AFE281F48@oracle.com> Message-ID: <757810938.345440.1555362086732.JavaMail.zimbra@u-pem.fr> > De: "Maurizio Cimadamore" > ?: "Brian Goetz" , "Remi Forax" > Cc: "valhalla-spec-experts" > Envoy?: Lundi 15 Avril 2019 22:53:49 > Objet: Re: RefObject and ValObject > I think we should be ok inference-wise, except of course for edge cases (like > vars) which exposes the guts of the inference engine. > In normal cases like: > List list = List.of(new Foo(), new Bar()) > the equality constraint on the LHS should force a solution which overrides the > new 'RefObject' bound that comes in from the RHS. > (that is, you have T = Object, then T :> Foo and T :> Bar, so T = Object 'wins', > even if less precise) > Maurizio yes, i think you're right, R?mi > On 15/04/2019 21:39, Brian Goetz wrote: >>> 1) any codes that has inference >>> var list = List.of(new Foo(), new Bar()); >>> will be inferred as List instead of List, so calling a method >>> that takes a List will not compile anymore. >> Not to dismiss the ?what about inference? issue, but any code that combines >> `var` with `List,of()` should not be surprised at the result of inference ?. >> But, ?what about inference? noted. 
>>> 2) any code that consider Object as special class may stop working, dynamic >>> proxies that have a special case for Object, any code that reflect recursively >>> on the hierarchy to find all the methods, any code that remove getClass() from >>> the getter list, etc. >> Absorbing value types is going to require some changes on the part of most >> frameworks to get best results. This one seems in that category? >> Again, not to dismiss, keep them coming, but so far this isn?t scaring me away >> from having the object model that makes most sense. From forax at univ-mlv.fr Mon Apr 15 21:10:03 2019 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Mon, 15 Apr 2019 23:10:03 +0200 (CEST) Subject: RefObject and ValObject In-Reply-To: <315CC43C-CF5E-477C-865D-CBE9D1852415@oracle.com> References: <7BDBD0E5-3C8A-4309-B88D-CD9C4F1AAE4C@oracle.com> <804232137.323681.1555351159570.JavaMail.zimbra@u-pem.fr> <6717B25C-B9B6-4BD7-8475-5A52CEEF9D2A@oracle.com> <96b10f46-cd2d-cdcd-eeb4-658ec9344f94@oracle.com> <78272184.343791.1555361018506.JavaMail.zimbra@u-pem.fr> <315CC43C-CF5E-477C-865D-CBE9D1852415@oracle.com> Message-ID: <1285920707.346056.1555362603986.JavaMail.zimbra@u-pem.fr> > De: "Brian Goetz" > ?: "Remi Forax" > Cc: "Maurizio Cimadamore" , > "valhalla-spec-experts" > Envoy?: Lundi 15 Avril 2019 22:46:08 > Objet: Re: RefObject and ValObject >> yes ! >> all generics will suddenly accept value types. > Yes, this is by design. If you can?t have an ArrayList of Point, that would be > terrible. Of course, until we have specialization (later in the story), these > will be erased, and restricted to the nullable projection. Does it means that Point? is a subtype of RefObject ? Note: the second question mark in this sentence is because it's a question. 
Rémi

From brian.goetz at oracle.com Mon Apr 15 21:17:14 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 15 Apr 2019 17:17:14 -0400
Subject: RefObject and ValObject

In the document on "Towards a plan for L10 / L20" I tried to answer these, but I got it slightly wrong. I said:

  V <: V? <: ValObject <: Object

But really that should be:

  V <: V? <: ValObject? <: Object
  V <: ValObject <: Object

V <: V? by value set inclusion; V? is the type obtained by adjoining `null` to the value set of V.

I am leaning towards saying that `RefObject?` and `Object?` are not sensible things to write, because they are equal to `RefObject` and `Object`. (That's separate from `T?`, which always makes sense, but sometimes just means "T".)

No flavor of Point is a subtype of any flavor of RefObject; ValObject and RefObject are disjoint.

> On Apr 15, 2019, at 5:10 PM, forax at univ-mlv.fr wrote:
>
> Does it mean that Point? is a subtype of RefObject ?

From brian.goetz at oracle.com Mon Apr 15 21:23:57 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 15 Apr 2019 17:23:57 -0400
Subject: RefObject and ValObject

> V <: V? by value set inclusion; V? is the type obtained by adjoining `null` to the value set of V.

Looking for a better name for this. "Nullable value types" is a terrible name, so I don't want to say that. (Too confusing with null-default value types.) They could properly be defined as "null-adjoined value types", but that's not helpful if you don't know what an adjunction is. Similarly for "nullable projection".

But the basic idea is that V? is the denotation of the union type V | Null.

From john.r.rose at oracle.com Mon Apr 15 21:32:20 2019
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 15 Apr 2019 14:32:20 -0700
Subject: RefObject and ValObject

The word "reference" is available and fits the bill.

"nullable reference type"

> On Apr 15, 2019, at 2:23 PM, Brian Goetz wrote:
>
> Looking for a better name for this.

From forax at univ-mlv.fr Mon Apr 15 21:32:41 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 15 Apr 2019 23:32:41 +0200 (CEST)
Subject: RefObject and ValObject

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "valhalla-spec-experts"
> Sent: Monday, April 15, 2019 23:23:57
> Subject: Re: RefObject and ValObject

> But the basic idea is that V? is the denotation of the union type V | Null.

OK, I think I got slightly confused about the difference between RefObject and ValueRef?
Rémi

From brian.goetz at oracle.com Mon Apr 15 21:33:22 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 15 Apr 2019 17:33:22 -0400
Subject: RefObject and ValObject

Most of what I wrote in the L10/L20 memo is accurate :)

> On Apr 15, 2019, at 5:32 PM, forax at univ-mlv.fr wrote:
>
> OK, I think I got slightly confused about the difference between RefObject and ValueRef?

From maurizio.cimadamore at oracle.com Mon Apr 15 22:29:08 2019
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 15 Apr 2019 23:29:08 +0100
Subject: RefObject and ValObject

On 15/04/2019 21:54, forax at univ-mlv.fr wrote:
> m(List<Object> list) { ... }
> ...
> m(List.of(new Foo(), new Bar()));

Slightly incorrect sir :-)

This is effectively equivalent to the example I wrote earlier:

  List<Object> list = List.of(...)

The expected type will force the equality constraint and will drive inference home. Otherwise it would not even work today in cases where you pass e.g. Integer and String to List.of, as their common supertype is something sharper than Object.

This stuff did indeed fail pre-Java 8, but I think we cured most of the issues. The remaining ones arise when inference eagerly resolves variables without looking at the target, either because there is no target, as for 'var', or because the expression is in a receiver position, as in:

  List<Object> list2 = List.of(new Foo(), new Bar()).subList(0, 42);

Now, this will fail, but I believe we're in corner^2 territory?
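
The target-typing cases above can be sketched against today's Java, with no RefObject involved. This is only a minimal sketch: `InferenceDemo`, `Foo`, `Bar` and `count` are placeholder names, and the Integer/String pair stands in for any two argument types whose common supertype is sharper than Object. The two rejected forms are left as comments so that the class still compiles.

```java
import java.util.List;

public class InferenceDemo {
    static class Foo {}
    static class Bar {}

    // The parameter type supplies the target type for a List.of(...) argument.
    static int count(List<Object> list) {
        return list.size();
    }

    // Assignment context: the declared type forces T = Object,
    // even though Integer and String share sharper supertypes.
    static List<Object> assignmentContext() {
        List<Object> list = List.of(1, "two");
        return list;
    }

    // Invocation context: the parameter type of count() plays the same role.
    static int invocationContext() {
        return count(List.of(new Foo(), new Bar()));
    }

    // No target type: T is inferred from the arguments alone.
    static int noTarget() {
        var list = List.of(1, "two");   // element type is an intersection type, not Object
        // count(list);                 // expected to be rejected: not a List<Object>
        // List<Object> l2 = List.of(1, "two").subList(0, 1);  // receiver position, same problem
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(assignmentContext().size());
        System.out.println(invocationContext());
        System.out.println(noTarget());
    }
}
```

With a RefObject bound in play, the `var` and receiver-position forms are where `Foo`/`Bar` code would start to behave like the Integer/String case does today.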
Maurizio

From john.r.rose at oracle.com Thu Apr 18 21:34:40 2019
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 18 Apr 2019 14:34:40 -0700
Subject: RefObject and ValObject

On Apr 15, 2019, at 1:37 PM, forax at univ-mlv.fr wrote:
>
> It's not bikeshedding, hence the "if you prefer ...",
> it's trying to explain that it's a typing issue and not a runtime class issue.

Sure. The "if you prefer" is a subtle hint that bikeshedding is not intended. To avoid having your audience think that bikeshedding has begun, a *non-subtle* hint is even more helpful, which is why I like to use intentionally ugly pseudo-syntax (like __NWC_BLOB__, in another thread) during such discussions.

> ----- Original Message -----
>> From: "John Rose"
>> To: "Remi Forax", "Maurizio Cimadamore"
>> Cc: "Brian Goetz", "valhalla-spec-experts"
>> Sent: Monday, April 15, 2019 20:13:50
>> Subject: Re: RefObject and ValObject
>
>> Avoid the Bikeshed! Make sure that temporary or provisional
>> syntaxes look __Different from permanent ones.

From john.r.rose at oracle.com Thu Apr 18 21:43:55 2019
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 18 Apr 2019 14:43:55 -0700
Subject: Valhalla EG notes April 10, 2019

On Apr 11, 2019, at 1:44 PM, Brian Goetz wrote:
>
> To me, getting fancy here sounds like borrowing trouble; it seems much simpler -- and perfectly reasonable -- to reject cycles at both compile and runtime, and let users use `V?` in the place they want to break their cycles. (Assuming we're comfortable with `V?` means not flattened, which is the choice we're also making for specialization.)

For the record, I share Brian's take here.

Also, FTR, I'm comfortable saying V? means not flattened.

I'll go further than that: I'd be *uncomfortable* if the meaning of V? diverged (unnecessarily) from the meaning of mentioning an equivalent non-inline class V2. That means V?, if translated to a descriptor, should translate to a simple legacy-style L-descriptor.

IOW, I think V? is most useful if it means "behaves exactly like a legacy variable", which means not only "nullable" but also "not eagerly loaded". At the VM level, V? should be the way to avoid bridging to old L-descriptors (without any "L*" decoration).

IOW again, the contract of V?, at least at the JVM level, should be exactly fulfillable by the L-V descriptor (without any extra signal).
John

From john.r.rose at oracle.com Thu Apr 18 22:08:35 2019
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 18 Apr 2019 15:08:35 -0700
Subject: generic specialization design discussion

On Apr 10, 2019, at 5:11 AM, Doug Lea wrote:
>
> But maybe Brian is right and "inline" is good enough.

I think it's good enough, and I'm glad to get beyond "value".

There's another shoe that needs to drop here, another term which is *not* good enough, for us to bikeshed: "reference". If we could get away with saying "value" and "reference" have a special meaning as adjectives, we could allow those terms, as nouns, to retain their standard meanings in the JVMS.

Here's the background: The JVMS takes great care to use the terms "reference" and "value" with precision, and this occasionally surfaces in explanations meant for ordinary Java users. We have already released the term "value" from its new duties; I think we have an equal need to release "reference" from its new duties as well.

(Put another way: The JVMS says that all a-series opcodes take "reference" arguments, and all L-series descriptors denote reference variables. In L-world "reference" means a value which refers potentially to either an inline object or a non-inlinable identity-laden object. Changing this term is IMO not feasible. I'm arguing against overloading as well.)

So what we need is a formal term NI which means "not inline", the opposite of the formal term "inline". (I pause for the Knights of Ni to ride by.) NI should not be spelled "reference" because that term is already committed elsewhere, and in particular we will have to say that inline objects are manipulated by references in the JVM.

We don't need a *keyword* for NI. If pressed for such a thing, we could invent "non-inline" as a composite keyword. But the ValObject/RefObject classes provide a fine way to document programmer intent: If I want to prevent maintainers from sticking "inline" in front of my class, I can derive from RefObject explicitly. (Note that both Val and Ref are past their shelf life. More in a moment.)

We *do* need a positive term for documentation. We want to say things like, "if the object is inline, do this, else the object is NI, so do that". And if we are talking about legacy object operations like Object::wait, it would be best if NI could express more than just the negation of inline, but had its own proper connotations that suggest the identity of a non-inline object. That's what I mean by a positive term.

So, for example, NI could be some term which conveys the idea of being "identity-laden". Or it could convey synchronizable, or having a unique address/location/heap-block.

Finally, we need to use the positive term inline and the positive term NI to construct the very useful type names formerly known as ValObject and RefObject. Clearly, those names should be readable in code as "inline object" and "NI object".

Now for a NI bikeshed color. I think it is sufficient to use the term "identity" for NI. Thus, we would have:

- inline classes and identity classes
- inline types and identity types
- the top types InlineObject and IdentityObject
- inline objects and identity objects
- inline values and identity values
- inline references and identity references
- informally, maybe "inlines" and "identities"

(Or maybe something like InlineObj and IdentityObj or InObject and IdObject, if we feel the need to abbreviate.)

What other colors are there for NI?
John

From forax at univ-mlv.fr Thu Apr 18 22:33:32 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 19 Apr 2019 00:33:32 +0200 (CEST)
Subject: Valhalla EG notes April 10, 2019

----- Original Message -----
> From: "John Rose"
> To: "Brian Goetz"
> Cc: "valhalla-spec-experts"
> Sent: Thursday, April 18, 2019 23:43:55
> Subject: Re: Valhalla EG notes April 10, 2019

> On Apr 11, 2019, at 1:44 PM, Brian Goetz wrote:
>>
>> To me, getting fancy here sounds like borrowing trouble; it seems much simpler
>> -- and perfectly reasonable -- to reject cycles at both compile and runtime,
>> and let users use `V?` in the place they want to break their cycles. (Assuming
>> we're comfortable with `V?` means not flattened, which is the choice we're also
>> making for specialization.)
>
> For the record, I share Brian's take here.

So do I.

> Also, FTR, I'm comfortable saying V? means not flattened.
>
> I'll go further than that: I'd be *uncomfortable* if the
> meaning of V? diverged (unnecessarily) from the meaning
> of mentioning an equivalent non-inline class V2. That
> means V?, if translated to a descriptor, should translate
> to a simple legacy-style L-descriptor.
>
> IOW, I think V? is most useful if it means "behaves exactly
> like a legacy variable", which means not only "nullable"
> but also "not eagerly loaded". At the VM level, V? should
> be the way to avoid bridging to old L-descriptors (without
> any "L*" decoration).
>
> IOW again, the contract of V?, at least at the JVM level, should
> be exactly fulfillable by the L-V descriptor (without any extra
> signal).

I mostly agree, V?
should not be eagerly resolved and should not be flattenable in fields and arrays, but:
- it should be vectorized in registers if on stack, i.e. V? should still be a mark for the JIT that the value doesn't escape, because it can always be reconstructed when necessary.
- acmp, System.identityHashCode(), etc. have the same meaning as for V if the value is non-null.

Rémi

From john.r.rose at oracle.com Thu Apr 18 23:30:44 2019
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 18 Apr 2019 16:30:44 -0700
Subject: Valhalla EG notes April 10, 2019

On Apr 18, 2019, at 3:33 PM, Remi Forax wrote:
>
> - it should be vectorized in registers if on stack, i.e. V? should still be a mark for the JIT that the value doesn't escape because it can always be reconstructed when necessary.
> - acmp, System.identityHashCode(), etc, have the same meaning as V if the value is non null.

Yes, thanks for clarifying that. The L-descriptor contracts are mainly about containers per se, and only secondarily about the objects referred to by those containers.

The treatment of acmp does not depend on the container but rather on the intrinsic properties of the object referred to by the reference stored in the container.

And the JIT can cheat all it wants, as long as it upholds the user-visible contracts. That includes today's EA of today's "identity" objects, plus tomorrow's more robust scalarization of inline objects.

Bottom line: When you see "V?" in the source code, you are looking at an L-descriptor in the class file, no matter what V is. When you see a flattenable or scalarizable "V" in the source code, you are looking at something new, with new capabilities and new restrictions: A new contract.

Even when we add in null-default types NV, NV? should still translate to an L-descriptor (no L*) with legacy semantics.

This may be a subtle contractual difference between NV? and NV, which we still need to talk through. A field of type NV is flattenable and must not be circular. A field of type NV? is not flattenable, as far as the user can see, and may be circular, if we take circularity as part of the inherent expressiveness of L-types. That's the way I see it.

We can make 99% of the difference between NV and NV? disappear, starting with the observation that they have the same value sets. But the last 1% will, I think, be tricky to suppress. If we can make 100% of the semantic differences between NV and NV? disappear, then the distinct usage of legacy L-descriptors and new L*-descriptors (G-descriptors in a forthcoming document) will appear only as a distinction in translation strategy, which will uphold the unified semantic contract of NV and its alias NV?.
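
For readers less familiar with descriptors, here is what a legacy L-descriptor looks like on today's JVM. This is a minimal sketch that assumes only current Java: `DescriptorDemo` and `Node` are hypothetical names, and the proposed L*/Q/G forms are not shown because no released JDK spells them. The self-referential field is legal precisely because an L-typed field holds a nullable reference rather than a flattened copy.

```java
import java.lang.invoke.MethodType;

public class DescriptorDemo {
    // A reference-typed field may refer to its own class: the field stores a
    // nullable pointer, so laying out Node does not require laying out Node again.
    static class Node {
        Node next;   // null by default, like any legacy L-typed field
        int value;
    }

    // Descriptor of a method that takes a Node and returns a Node.
    // Both occurrences are rendered as "L<internal name>;".
    static String nodeDescriptor() {
        return MethodType.methodType(Node.class, Node.class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(nodeDescriptor());
        System.out.println(new Node().next == null);
    }
}
```

A flattened field of the same shape would need the layout of `Node` to contain itself, which is the circularity that the new contract has to reject.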
From forax at univ-mlv.fr Fri Apr 19 00:09:57 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 19 Apr 2019 02:09:57 +0200 (CEST)
Subject: Valhalla EG notes April 10, 2019

----- Original Message -----
> From: "John Rose"
> To: "Remi Forax"
> Cc: "Brian Goetz", "valhalla-spec-experts"
> Sent: Friday, April 19, 2019 01:30:44
> Subject: Re: Valhalla EG notes April 10, 2019

> We can make 99% of the difference between NV and NV?
> disappear, starting with the observation that they have
> the same value sets. But the last 1% will, I think, be
> tricky to suppress. If we can make 100% of the
> semantic differences between NV and NV? disappear,
> then the distinct usage of legacy L-descriptors and
> new L*-descriptors (G-descriptors in a forthcoming
> document) will appear only as a distinction in translation
> strategy, which will uphold the unified semantic contract
> of NV and its alias NV?.

I'm not sure we need a 'G', because NV is a property of the container too.

NV is a value type + a null value check, i.e. each time you call a method and NV has the encoding of the null value, an NPE is thrown.

NV? is a nullable value type + a null value check, i.e. each time you call a method and NV? is null or NV? has the encoding of the null value, an NPE is thrown.

Conceptually, an NV behaves as if each instance method (resp. field access) first checks whether the receiver is the encoding of the null value before entering the method (resp. before accessing the field).
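
The "as if" model above can be emulated in today's Java. This is a minimal sketch under stated assumptions: `NullDefaultDemo` and `Point` are hypothetical, the all-zero state is taken to be the encoding of null (the thread does not fix an encoding), and the explicit `requireNonNullEncoding` call stands in for the check the VM would perform on entry.

```java
public class NullDefaultDemo {
    // Hypothetical stand-in for a null-default value type NV.
    static final class Point {
        final int x;
        final int y;
        final boolean present;   // false marks the all-zero "null encoding" (an assumption)

        private Point(int x, int y, boolean present) {
            this.x = x;
            this.y = y;
            this.present = present;
        }

        // The all-zero instance that plays the role of null.
        static final Point NULL_ENCODING = new Point(0, 0, false);

        static Point of(int x, int y) {
            return new Point(x, y, true);
        }

        // The check performed "as if" on entry to every instance method.
        private void requireNonNullEncoding() {
            if (!present) {
                throw new NullPointerException("null-encoded Point");
            }
        }

        int getX() {
            requireNonNullEncoding();
            return x;
        }

        int sum() {
            requireNonNullEncoding();
            return x + y;
        }
    }

    // Reports whether calling a method on p raises the NPE described above.
    static boolean throwsNpe(Point p) {
        try {
            p.sum();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Point.of(1, 2).sum());
        System.out.println(throwsNpe(Point.NULL_ENCODING));
    }
}
```

In this emulation an NV? variable would add one more case, a genuinely null reference, which already throws on its own in current Java.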
Rémi

From john.r.rose at oracle.com Fri Apr 19 00:42:45 2019
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 18 Apr 2019 17:42:45 -0700
Subject: Valhalla EG notes April 10, 2019

On Apr 18, 2019, at 5:09 PM, forax at univ-mlv.fr wrote:
>
> I'm not sure we need a 'G' because NV is a property of the container too.

The flattened layout is not a visible property of the container, if the container is typed using the legacy L-descriptor. If the container is flattened, you need an extra signal ("L*-NV" not "L-NV"). (It took us four days last month to realize this and to discard a long list of workarounds.)

The L*-NV descriptor (perhaps a descriptor augmented by a load-it-now side channel that carries the stars) gives the JVM permission to do the following extra steps.

1. Load NV.class when it lays out the container (of type L*-NV).
1a. Execute any class loader side effects due to that loading.
2. Throw a class circularity error if loading NV.class needs to lay out the container recursively.

We generally agree to disregard effects in 1a. But the effect of 2 is a real departure from the standard contract of L-descriptors, and it needs a new contract.

You could separate the "*" from "L*-NV" and instead put all the stars into a PreloadClasses attribute, as we once discussed. This is (a) too fragile to ensure robust flattening, and (b) not fine-grained enough, since some occurrences of "L*-NV" need the star, and others need to *omit* the star.
This is why we are moving towards keeping the star in the descriptor, in effect. Physically, "L*" should be spelled with a single letter, of course; the working title was "Q" but is now "G" (meaning "go and look", since it applies to both inlines and templates).

From forax at univ-mlv.fr Fri Apr 19 01:07:34 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 19 Apr 2019 03:07:34 +0200 (CEST)
Subject: Valhalla EG notes April 10, 2019

----- Original Message -----
> From: "John Rose"
> To: "Remi Forax"
> Cc: "Brian Goetz", "valhalla-spec-experts"
> Sent: Friday, April 19, 2019 02:42:45
> Subject: Re: Valhalla EG notes April 10, 2019

> This is why we are moving towards keeping the star in the
> descriptor, in effect. Physically, "L*" should be spelled with
> a single letter, of course; the working title was "Q" but is
> now "G" (meaning "go and look", since it applies to both
> inlines and templates).

I don't see a use for 'G'; '?' is for supporting code that contains existing references to L. When introducing NV, we have to take care of existing 'L' aka NV?. It seems that no support is possible, so the VM should throw an IncompatibleClassChangeError.

And as a user, either there is an existing 'L' and you can use V? to be "backward compatible", or there are no existing references to L and you can use V or NV.

Rémi

From forax at univ-mlv.fr Sat Apr 20 09:31:42 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 20 Apr 2019 11:31:42 +0200 (CEST)
Subject: DevoxxFR poll

Hi all,
during a university session at DevoxxFR last Wednesday, José Paumard and I asked the audience whether they prefer the keyword 'inline' or 'immediate' to replace 'value' when declaring a 'value' type.

Among the 264 people who submitted their answers using Kahoot,
  211 voted for inline,
  53 voted for immediate.

Obviously, it's just a data point.
Rémi

From brian.goetz at oracle.com  Sat Apr 20 15:43:10 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 20 Apr 2019 11:43:10 -0400
Subject: DevoxxFR poll
In-Reply-To: <15205547.323748.1555752702728.JavaMail.zimbra@u-pem.fr>
References: <15205547.323748.1555752702728.JavaMail.zimbra@u-pem.fr>
Message-ID: <7850274F-E6D1-4F21-A287-6658F6734B59@oracle.com>

Here's what I take from this data point: that the term "inline" is not immediately toxic to users. That is, when the concept is explained and given the name "inline", people don't immediately choke on the name. That's a good sign.

From karen.kinnear at oracle.com  Wed Apr 24 13:08:14 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 24 Apr 2019 09:08:14 -0400
Subject: Valhalla EG meeting Weds 24th - short
Message-ID: <526672AA-569B-4D4E-9C42-60124B04DBAF@oracle.com>

Folks,

The Oracle folks have an all-hands we need to attend, so we need to leave the meeting after 25 minutes. Apologies for the short notice.

thanks,
Karen

From john.r.rose at oracle.com  Wed Apr 24 18:00:55 2019
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 24 Apr 2019 11:00:55 -0700
Subject: a new job for <init>: "static init factories"
Message-ID: <84E83AF6-3D75-490D-8239-D7D54E71AAC8@oracle.com>

We need a VM-level API point for running a constructor
of an inline (by-value) type, which is distinct from the
current way of running constructors.
Currently, constructors are invoked by running invokespecial on a special internal
method named <init>, passing a blank fresh instance of
the required class. The blankness of the fresh instance
is tracked by the verifier using very special rules which
are triggered by the mention of the very special name <init>.

We can't use this mechanism for constructing value types
because there is no way to perform side effects on a
value type instance, blank or not.

The most natural way to create a value type instance,
in the JVM, is to run invokestatic on an appropriately
named static factory method. The name of this method
is a convention which is known to various parties, notably
the static compiler (javac) and reflection (jlr.Constructor).

After some prototyping, I can say that it is a reasonable
and simple thing to do to re-use the string "<init>" for
the target of an invokestatic which translates a constructor
of an inline type. For example:

```
inline class Point { int x, y; Point(int x1, int y1) { x=x1; y=y1; } }
// static Point.<init>(II)Point { vdefault[Point] ... }
class Client { static Point myPoint() { return new Point(3,4); } }
// Client.myPoint() { ... invokestatic[Point.<init>(II)Point] ... }
```

To do this, we will need to make some changes to the JVM
specification (and implementations). Here are the changes
I propose:

* Relax constraints on CONSTANT_Methodref and CONSTANT_NameAndType,
  allowing free use of <init> as if it were a regular method name.
* Retain all restrictions on use of <init> via invokespecial.
* Allow an invokestatic to mention <init> (but no other bytecodes).
* Retain all restrictions on definition of <init> methods *in regular non-inline classfiles*.
* Allow an inline classfile to define an <init> method, only with ACC_STATIC.
* Require that the type returned by such an <init> method is the containing class.
  (Extra rider: If the class is non-denotable, aka. hidden, returned class must be Object.)
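
The definition rules in the list above can be sketched as a small check. This is only an illustration under stated assumptions: `isAcceptableInit`, the `holderIsInline` flag and the stand-in class `Point` are hypothetical names, not part of the JVM or of any reflection API, and the hidden-class rider is ignored. Only `MethodType` and its descriptor methods are real API.

```java
import java.lang.invoke.MethodType;

class InitDescriptorSketch {
    // Ordinary class standing in for an inline class (assumption: inline
    // classes cannot be written in standard Java).
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    /**
     * Hypothetical check mirroring the proposed definition rules:
     *  - in a regular classfile, <init> must be non-static and return void;
     *  - in an inline classfile, <init> must be static and return the holder.
     */
    static boolean isAcceptableInit(Class<?> holder, boolean holderIsInline,
                                    MethodType type, boolean isStatic) {
        if (holderIsInline) {
            return isStatic && type.returnType() == holder;
        }
        return !isStatic && type.returnType() == void.class;
    }

    public static void main(String[] args) {
        MethodType ctor = MethodType.methodType(void.class, int.class, int.class);
        MethodType factory = MethodType.methodType(Point.class, int.class, int.class);
        // A descriptor that matches neither kind of definition.
        MethodType other = MethodType.fromMethodDescriptorString(
                "()I", InitDescriptorSketch.class.getClassLoader());

        System.out.println(ctor.toMethodDescriptorString()); // (II)V
        System.out.println(isAcceptableInit(Point.class, false, ctor, false));
        System.out.println(isAcceptableInit(Point.class, true, factory, true));
        System.out.println(isAcceptableInit(Point.class, true, other, true));
    }
}
```

The point of the sketch is that the static flag and the return type together separate the two kinds of definition, so a descriptor such as `()I` matches neither.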

These changes ensure that there are only two kinds of methods
named <init>: classic by-reference object constructors, and
new by-value "static init factories". The specification ensures
that these two kinds of methods, both named <init>, can never
be confused. The basic mechanism for ensuring this is that
one kind of method is defined as non-static and the other is
defined as static, and there is no way to accidentally invoke
a static method via invokespecial, and vice versa for invokestatic.

Note that CONSTANT_Methodrefs can refer to either kind of
descriptor. This is not ambiguous, since every use of such a
constant is coupled with an indicator (an opcode or ref-kind)
which tells if the use is of a static method or not.

Note also that there can be "crazy" references to the name
<init>, such as under the type "()I" (no args returning int).
These references are harmless because they will never link
to a definition of that signature. This is true not because there
are limitations on the form of uses of <init>, but because there
are strong limitations on the possible definitions of <init>.
A "crazy" use has no impact, other than causing an eventual
linkage error. We could try to add more limits against crazy
uses of <init>, but that does not seem to be necessary, except
perhaps as a "defense in depth" move.

Additional changes are needed for reflection:

* Allow new static init factories to be wrapped in `jlr.Constructor`s.
* Ensure that `Constructor::newInstance` uses the right calling sequence
  for static init factories.
* Ensure that `Lookup::findConstructor` can find static init factories.
* Allow `Lookup::findStatic` to find static init factories.
  (This step simplifies mapping between invokestatic and method
  handle constants. It could be dropped.)
* Ensure that resolution of `CONSTANT_MethodHandle` continues to
  use the reference-kind option correctly. (No actual spec change.)
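
To make the reflective API points in the list above concrete, here is a minimal sketch using today's Java. It assumes an ordinary class `Point` as a stand-in for an inline class and a hypothetical factory named `of` in place of a static `<init>`, since neither inline classes nor a static method with that name can be written in source. It shows only what the existing `findConstructor`, `findStatic` and `Constructor::newInstance` entry points look like, not the proposed behavior.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Constructor;

class FactoryLookupSketch {
    // Ordinary class standing in for an inline class (assumption).
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        // Hypothetical stand-in for a static init factory; under the
        // proposal it would be named <init> rather than "of".
        static Point of(int x, int y) { return new Point(x, y); }
    }

    // Constructor lookups are requested with a void return type...
    static final MethodType CTOR_TYPE =
            MethodType.methodType(void.class, int.class, int.class);
    // ...while a static factory returns the containing class.
    static final MethodType FACTORY_TYPE =
            MethodType.methodType(Point.class, int.class, int.class);

    static Point viaFindConstructor(int x, int y) {
        try {
            // The resulting handle has type (int,int)Point.
            MethodHandle mh = MethodHandles.lookup()
                    .findConstructor(Point.class, CTOR_TYPE);
            return (Point) mh.invoke(x, y);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static Point viaFindStatic(int x, int y) {
        try {
            MethodHandle mh = MethodHandles.lookup()
                    .findStatic(Point.class, "of", FACTORY_TYPE);
            return (Point) mh.invoke(x, y);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static Point viaReflection(int x, int y) {
        try {
            Constructor<Point> c =
                    Point.class.getDeclaredConstructor(int.class, int.class);
            return c.newInstance(x, y);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point a = viaFindConstructor(3, 4);
        Point b = viaFindStatic(3, 4);
        Point c = viaReflection(3, 4);
        System.out.println(a.x + "," + a.y + " " + b.x + "," + b.y + " " + c.x + "," + c.y);
    }
}
```

All three paths produce an equivalent `Point`. The proposal would let the constructor-shaped and static-shaped views resolve to the same static init factory.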

The low-level user model has a twist in it: At the level of bytecodes,
a static init factory is just a vanilla static method, albeit with a funny
name. But at the level of reflection, a static init factory appears to be
a constructor. This is reasonable since a static init factory translates
an actual constructor in source code. A `Constructor` which wraps
a static init factory will have its `ACC_STATIC` modifier set, allowing
users who care to distinguish the two uses of the name <init>.

The query method `Class::getDeclaredMethods` will not expose
static init factories. Only `Class::getDeclaredConstructors` will.
A similar rule applies to related API points such as `getMethod`.
The purpose of this restriction (which is optional) is to avoid
having overlaps in the reflected lists of constructors and methods.

Because the `java.lang.invoke` API is lower-level than `java.lang.reflect`,
it should (probably) be willing to treat static init factories as vanilla
static methods. Thus `findStatic` can return a handle to such a factory.
Because `java.lang.invoke` also integrates (via "unreflection") with
`java.lang.reflect`, `findConstructor` should be willing to treat
static init factories as constructors. Perhaps one of these methods
can be suppressed, but it seems reasonable to allow both in
`java.lang.invoke`, because of the two-sided positioning of that
API layer.

I posted a HotSpot POC implementation here:
  http://cr.openjdk.java.net/~jrose/jvm/JDK-8222787/

Comments, please?

-- John

From brian.goetz at oracle.com  Thu Apr 25 14:40:34 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 25 Apr 2019 10:40:34 -0400
Subject: a new job for <init>: "static init factories"
In-Reply-To: <84E83AF6-3D75-490D-8239-D7D54E71AAC8@oracle.com>
References: <84E83AF6-3D75-490D-8239-D7D54E71AAC8@oracle.com>
Message-ID: <3870EA2B-BB50-4D8D-A05E-70EA71B51DF0@oracle.com>

This all seems very sensible.

From john.r.rose at oracle.com  Fri Apr 26 01:51:36 2019
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 25 Apr 2019 18:51:36 -0700
Subject: a new job for <init>: "static init factories"
In-Reply-To: <84E83AF6-3D75-490D-8239-D7D54E71AAC8@oracle.com>
References: <84E83AF6-3D75-490D-8239-D7D54E71AAC8@oracle.com>
Message-ID: <850EC459-2B52-4CC5-A844-CD67DD63F32F@oracle.com>

On Apr 24, 2019, at 11:00 AM, John Rose wrote:
>
> To do this, we will need to make some changes to the JVM
> specification (and implementations). Here are the changes
> I propose:
>
> * Relax constraints on CONSTANT_Methodref and CONSTANT_NameAndType,
>   allowing free use of <init> as if it were a regular method name.
> * Retain all restrictions on use of <init> via invokespecial.
> * Allow an invokestatic to mention <init> (but no other bytecodes).
> * Retain all restrictions on definition of <init> methods *in regular non-inline classfiles*.
> * Allow an inline classfile to define an <init> method, only with ACC_STATIC.
> * Require that the type returned by such an <init> method is the containing class.
>   (Extra rider: If the class is non-denotable, aka. hidden, returned class must be Object.)

Something subtle here: Wherever we relax a restriction that
affects a use of invokespecial, we usually have to impose
it somewhere else. In particular, when the verifier processes
a call to <init>, it should (as a new responsibility) ensure that
the called descriptor has a void return. An analogous check
should look at CONSTANT_MH constants of kind newInvokeSpecial,
to make sure the signature has a void return. This is the
intention of "retain all restrictions" in the second point,
which otherwise contradicts "relax constraints" in the
first point.

From brian.goetz at oracle.com  Fri Apr 26 13:44:33 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 26 Apr 2019 09:44:33 -0400
Subject: Nullable types and inference
Message-ID:

I've been looking over Chapter 18 (thanks Dan!)
and it seems that we are almost there to defining inference to work properly for values and erased generics.

There is already a _null type_, and we've defined it so that for every reference type R, Null <: R (JLS 4.10.). In order to make inference work for nullable values, we need to state that for a zero-default value type V:

    Null </: V
    Null <: V?

and that LUB(V, Null) = V?

When we gather constraints in 18.1, in addition to adding the upper bound on alpha, we also add in lower bounds Null <: alpha_i for erased type vars.

We adjust 18.4 to not consider Null to be a proper lower bound for purposes of resolution.

Simple example:

    Point p;
    var v = new Box<>(p);

We gather bounds alpha <: Object (from the declaration of Box), Point <: alpha (from the argument), and Null <: alpha (T is an erased type var), yielding

    Null, Point <: alpha <: Object

By 18.4, alpha = LUB(Point, Null) = Point?.

Obviously this is only one example, and there's a bunch of work to thread this all the way through Ch18 (good luck Dan!), but it seems to me that the underpinnings are here already.

From brian.goetz at oracle.com  Sun Apr 28 17:07:26 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 28 Apr 2019 13:07:26 -0400
Subject: Nullable types and inference
In-Reply-To:
References:
Message-ID: <87E54ADF-3024-4F5D-818C-824DD6AE736B@oracle.com>

More on this: we've ambled our way into a nice and consistent model for "?" types. Which is, T? is the union type of T and Null. This works for nullable values (V? where V is a zero-default value type), type variables (T? means T union Null), and also for nullable type patterns (the pattern Foo? matches any non-null Foo, or null).

This is not necessarily what users will immediately think T? means, especially if they've got experience with type systems where ? is an arity indicator meaning "zero or one" (such as X#/C_omega). But it's an explainable and stable thing, and the fact that we independently came to this from two directions (values and patterns) is an encouraging indicator.

From forax at univ-mlv.fr  Mon Apr 29 16:53:05 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 29 Apr 2019 18:53:05 +0200 (CEST)
Subject: Nullable types and inference
In-Reply-To: <87E54ADF-3024-4F5D-818C-824DD6AE736B@oracle.com>
References: <87E54ADF-3024-4F5D-818C-824DD6AE736B@oracle.com>
Message-ID: <59103793.210065.1556556785619.JavaMail.zimbra@u-pem.fr>

I will rain on your parade.
Here, I'm talking about Java the language, not about the VM support of L/Q-types.

V? is a use-site annotation, and we all know that use-site annotations are the devil in disguise. Wildcards are a good example of a use-site feature nobody understands or wants to understand, so the bar to introduce this kind of annotation is the highest bar that exists for introducing something in Java.

So why do we need V? We need it
- to represent the L variation of a Q type.
- to interact with generics that are not reified.
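
The second point has a rough analogy in today's Java, sketched below. It is an assumption made for illustration only: `Integer` plays the role of the nullable view (V?) and `int` the role of a non-nullable value type (V). It does not model Valhalla semantics. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class NullInErasedGenericsSketch {
    // Map.get signals "no mapping" with null, which the boxed type can hold.
    static Integer boxedLookup(Map<String, Integer> map, String key) {
        return map.get(key);
    }

    // Converting the result to the non-nullable type fails for an absent key.
    static boolean unboxedLookupThrows(Map<String, Integer> map, String key) {
        try {
            int value = map.get(key); // implicit unboxing of a possible null
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("present", 1);

        System.out.println(boxedLookup(counts, "missing"));         // null
        System.out.println(unboxedLookupThrows(counts, "present")); // false
        System.out.println(unboxedLookupThrows(counts, "missing")); // true
    }
}
```

The erased `Map` is free to return null for its value type variable, and the failure only appears at the point where the caller insists on the non-nullable type.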

The first case is a corner case, and for generics, it works until someone tries to stuff null into a value type.

So instead of introducing nullable value types in the language, which makes the language far more complex than it should be, I think we should come up with a far simpler proposal: a declaration-site tagging of the methods that don't work with value types.

  // proposed syntax
  interface Map<K, V> {
    "this method doesn't work if V is a value type" public V get(Object o);
  }

Rémi

From brian.goetz at oracle.com  Mon Apr 29 19:51:22 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 29 Apr 2019 15:51:22 -0400
Subject: Nullable types and inference
In-Reply-To: <59103793.210065.1556556785619.JavaMail.zimbra@u-pem.fr>
References: <87E54ADF-3024-4F5D-818C-824DD6AE736B@oracle.com>
 <59103793.210065.1556556785619.JavaMail.zimbra@u-pem.fr>
Message-ID: <23850128-A2ED-4A41-A87D-3233C9F3D5FE@oracle.com>

> The first case is a corner case, and for generics, it works until someone tries to stuff null into a value type.
>
> So instead of introducing nullable value types in the language, which makes the language far more complex than it should be, I think we should come up with a far simpler proposal: a declaration-site tagging of the methods that don't work with value types.
>
>   // proposed syntax
>   interface Map<K, V> {
>     "this method doesn't work if V is a value type" public V get(Object o);
>   }

We explored this idea in M3; we jokingly called this "#ifref", which is to say, we would restrict members to the reference specialization. This was our first attempt at dealing with this problem.
We gave up on this for a number of reasons, not least of which was that it really started to fall apart when you had more than one type variable. But it was a hack, and only filtered out the otherwise-unavoidable NPEs.

More generally, there's lots of generic code out there that assumes that null is a member of the value set of any type variable T, that you can stuff nulls in arrays of T, etc. Allowing users to instantiate arbitrary generics with values and hope for no NPEs (or expect authors of all those libraries to audit and annotate their libraries) is not going to leave developers with a feeling of safety and stability.

Further, allowing users to instantiate an ArrayList -- even when the author of ArrayList promises up and down (including on behalf of all their subtypes!) that it won't stuff a null into T -- will cause code to silently change its behavior (and maybe its descriptor) when ArrayList is later specialized. This puts pressure on our migration story; we want the migration of ArrayList to be compatible, and that means that things don't subtly break when you recompile them. Using ArrayList today means that even when ArrayList is specialized, this source utterance won't change its semantics.

Essentially, erased generics type variables have an implicit bound of "T extends Nullable"; the migration from erased to specialized is what allows the declaration to drop the implicit bound, and have the compiler type-check the validity of it.

We have four choices here:

- Don't allow erased generics to be instantiated with values at all. This sucks so badly we won't even discuss it.
- Require generics to certify their value-readiness, which means that their type parameters are non-nullable. This risks degenerating into the first, and will be a significant impediment to the use and adoption of values.
- Let users instantiate erased generics with values, and let them blow up when the inevitable null comes along. That's what you're proposing.
- Bring nullity into the type system, so that we can accurately enforce the implicit constraint of today's erased generics. That's what I'm proposing.

I sympathize with your concern that this is adding a lot of complexity. Ultimately, though, I don't think just letting people blindly instantiate generics that can't be proven to conform to their bounds is helping users either. Better suggestions welcome!

(A related concern is that V? looks too much like ? extends V, especially in the face of multiple tvars: Map. This may have a syntactic solution.)

From forax at univ-mlv.fr  Mon Apr 29 22:01:18 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 30 Apr 2019 00:01:18 +0200 (CEST)
Subject: Nullable types and inference
In-Reply-To: <23850128-A2ED-4A41-A87D-3233C9F3D5FE@oracle.com>
References: <87E54ADF-3024-4F5D-818C-824DD6AE736B@oracle.com>
 <59103793.210065.1556556785619.JavaMail.zimbra@u-pem.fr>
 <23850128-A2ED-4A41-A87D-3233C9F3D5FE@oracle.com>
Message-ID: <860449609.229645.1556575278667.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "valhalla-spec-experts"
> Sent: Monday, April 29, 2019 21:51:22
> Subject: Re: Nullable types and inference

>> The first case is a corner case, and for generics, it works until someone tries to
>> stuff null into a value type.
>>
>> So instead of introducing nullable value types in the language, which makes the
>> language far more complex than it should be, I think we should come up with a
>> far simpler proposal: a declaration-site tagging of the methods that
>> don't work with value types.
>>
>>   // proposed syntax
>>   interface Map<K, V> {
>>     "this method doesn't work if V is a value type" public V get(Object o);
>>   }
>
> We explored this idea in M3; we jokingly called this "#ifref", which is to say,
> we would restrict members to the reference specialization. This was our first
> attempt at dealing with this problem.
> We gave up on this for a number of
> reasons, not least of which was that it really started to fall apart when you
> had more than one type variable. But it was a hack, and only filtered out the
> otherwise-unavoidable NPEs.
>
> More generally, there's lots of generic code out there that assumes that null is
> a member of the value set of any type variable T, that you can stuff nulls in
> arrays of T, etc. Allowing users to instantiate arbitrary generics with values
> and hope for no NPEs (or expect authors of all those libraries to audit and
> annotate their libraries) is not going to leave developers with a feeling of
> safety and stability.

To get an NPE when you store something in an array of T, you need the array to be an array of an inline type, and currently that's not easy because T is not reified. There are currently two usual ways to get a reified array or data structure. Either you pass a Class value, as with Collections.checkedCollection(), and it's fine because the user is in control. Or you do something a little fancier to transform the type signature emitted by javac into a series of Class objects, like Jackson and Spring do, and yes, these libraries will need to be fixed to choose the L-view of an inline class.

>
> Further, allowing users to instantiate an ArrayList -- even when the
> author of ArrayList promises up and down (including on behalf of all their
> subtypes!) that it won't stuff a null into T -- will cause code to silently
> change its behavior (and maybe its descriptor) when ArrayList is later
> specialized. This puts pressure on our migration story; we want the migration
> of ArrayList to be compatible, and that means that things don't subtly break
> when you recompile them. Using ArrayList today means that even when
> ArrayList is specialized, this source utterance won't change its semantics.

Yes, you're right, but the downside is that you have introduced V? into the type system, and everybody now has to think about whether to choose V or V? for every type.
Java got rid of passing structs by value or by reference because that simplifies the model; introducing V? creates just such a rift. I agree it is comfortable for a designer, because we don't break anything, but at the same time we are offering too many options for something that most of our users will see as a subtle semantics issue.

>
> Essentially, erased generics type variables have an implicit bound of "T extends
> Nullable"; the migration from erased to specialized is what allows the
> declaration to drop the implicit bound, and have the compiler type-check the
> validity of it.

We don't need V? if generics are fully reified, so introducing V? will look stupid 20 years from now.

>
> We have four choices here:
>
> - Don't allow erased generics to be instantiated with values at all. This sucks
>   so badly we won't even discuss it.
> - Require generics to certify their value-readiness, which means that their type
>   parameters are non-nullable. This risks degenerating into the first, and will
>   be a significant impediment to the use and adoption of values.

but nothing breaks in this mode.

> - Let users instantiate erased generics with values, and let them blow up when
>   the inevitable null comes along. That's what you're proposing.

until people annotate the methods/classes that don't support value types.

> - Bring nullity into the type system, so that we can accurately enforce the
>   implicit constraint of today's erased generics. That's what I'm proposing.

which doesn't scale, because you're asking all your users to think as API developers, which is not something I want to do when I just want to use an API.

>
> I sympathize with your concern that this is adding a lot of complexity.
> Ultimately, though, I don't think just letting people blindly instantiate
> generics that can't be proven to conform to their bounds is helping users
> either. Better suggestions welcome!

I believe that the issue is that V? should work as a box, and currently V?
is too powerful/useful as a box, so people will start to use it as a true type.
 - V? should not be called V?, it's too short; you should have some ceremony that shows you that V is the real deal. V.box was better.
 - V? should not be a supertype of V; again, it makes the box too powerful. With that, people will start to use V? as a parameter type, like we use List instead of ArrayList.
 - you should not be able to call methods or access fields on V? (constructors are still allowed); again, it should be a box, so the implicit conversion from/to V/V? is fine, but not more.

The motto for V? should be: works like an Integer (not more).

>
> (A related concern is that V? looks too much like ? extends V, especially in the
> face of multiple tvars: Map. This may have a syntactic
> solution.).

Rémi

From brian.goetz at oracle.com Mon Apr 29 22:28:05 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 29 Apr 2019 18:28:05 -0400
Subject: Nullable types and inference
In-Reply-To: <860449609.229645.1556575278667.JavaMail.zimbra@u-pem.fr>
References: <87E54ADF-3024-4F5D-818C-824DD6AE736B@oracle.com> <59103793.210065.1556556785619.JavaMail.zimbra@u-pem.fr> <23850128-A2ED-4A41-A87D-3233C9F3D5FE@oracle.com> <860449609.229645.1556575278667.JavaMail.zimbra@u-pem.fr>
Message-ID: <26ac59b4-4f0e-c8c3-49bf-a2fc419166d3@oracle.com>

Like I said, I sympathize with your position; you're saying "do we really need this?", which is the right question to be asking. But, I do believe you're wishing away a lot of stuff in the desire to get to the answer you want, and that's where we disagree. I would be happy to find another way to get there, if there is one, but "pretend the problem doesn't exist" doesn't sit very well right now.

More inline.

>>> The first case is a corner case and for generics, it works until someone tries to
>>> stuff null into a value type.
>>>
>>> So instead of introducing nullable value types in the language, which makes the
>>> language far more complex than it should be, i think we should come up with a
>>> far simpler proposal, to have a declaration site tagging of the method that
>>> doesn't work with value types.
>>>
>>>
>>> // proposed syntax
>>> interface Map<K, V> {
>>>   "this method doesn't work if V is a value type" public V get(Object o);
>>> }
>> We explored this idea in M3; we jokingly called this "#ifref", which is to say,
>> we would restrict members to the reference specialization. This was our first
>> attempt at dealing with this problem. We gave up on this for a number of
>> reasons, not least of which was that it really started to fall apart when you
>> had more than one type variable. But it was a hack, and only filtered out the
>> otherwise-unavoidable NPEs.
>>
>> More generally, there's lots of generic code out there that assumes that null is
>> a member of the value set of any type variable T, that you can stuff nulls in
>> arrays of T, etc. Allowing users to instantiate arbitrary generics with values
>> and hope for no NPEs (or expect authors of all those libraries to audit and
>> annotate their libraries) is not going to leave developers with a feeling of
>> safety and stability.
> To get an NPE when you store something in an array of T, you need the array to be an array of inline type,
> and currently that's not easy because T is not reified.

Or, return a null from any method that returns a T. And this is the problem; in order to safely write such generic code, this has to be a property of the _method declaration_, not just something the implementation doesn't happen to do right now. Which brings us to either:

 - Use nullable types at the instantiation, or
 - Force all generics to declare their nullability / non-nullability

>> Further, allowing users to instantiate an ArrayList<V> --
>> even when the
>> author of ArrayList promises up and down (including on behalf of all their
>> subtypes!) that it won't stuff a null into T -- will cause code to silently
>> change its behavior (and maybe its descriptor) when ArrayList is later
>> specialized. This puts pressure on our migration story; we want the migration
>> of ArrayList to be compatible, and that means that things don't subtly break
>> when you recompile them. Using ArrayList<V?> today means that even when
>> ArrayList is specialized, this source utterance won't change its semantics.
> yes, you're right, but the downside is that you have introduced V? in the type system and everybody now has to think about whether to choose V or V? for every type.

Yes, that's the cost. And it's actually worse than you say, because on Day One, you will only be able to say

    new ArrayList<V?>

and not

    new ArrayList<V>

and users will bitch and moan about "why can't the stupid compiler figure out what I want", because they are unaware of the _future_ ambiguities they would be subject to if they did that.

The alternative is to merge L10/L100, and no one gets values for a long time more, which also sucks.

>> Essentially, erased generics type variables have an implicit bound of "T extends
>> Nullable"; the migration from erased to specialized is what allows the
>> declaration to drop the implicit bound, and have the compiler type-check the
>> validity of it.
> We don't need V? if generics are fully reified, so introducing V? will look stupid 20 years from now.

There's a kernel of a point here, but you've overstated it by a hundred million billion percent :)

V? remains useful, but given the choice between parameterizing over V or V?, _most of the time_ users will probably choose V. That doesn't mean V? is stupid; it's just not the most common case. Which is _exactly why_ we gave "V" the good syntax and "V?" the less good syntax, so that when we get to that world, we won't have chosen the wrong default (again.)
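
The assumption discussed above, that null belongs to the value set of every erased type variable, can be seen in today's Java. The following is a minimal sketch only: the class and method names are hypothetical, and `Integer` merely stands in for a type whose non-nullable form (here `int`) cannot hold the null that the generic code hands back.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: class and method names are hypothetical.
class NullInErasedGenerics {

    // Map.get returns null for an absent key, so null is a possible
    // result for every instantiation of Map<K, V>.
    static Integer lookupMissing() {
        Map<String, Integer> counts = new HashMap<>();
        return counts.get("absent");
    }

    // The failure only surfaces when the null has to become a
    // non-nullable value; returns true if that conversion throws.
    static boolean unboxingThrows() {
        try {
            int n = lookupMissing(); // auto-unboxing of null
            return n < 0;            // not reached
        } catch (NullPointerException e) {
            return true;
        }
    }

    // ArrayList accepts null elements for any element type.
    static boolean listAcceptsNull() {
        List<Integer> xs = new ArrayList<>();
        xs.add(null);
        return xs.get(0) == null;
    }

    public static void main(String[] args) {
        System.out.println(lookupMissing());
        System.out.println(unboxingThrows());
        System.out.println(listAcceptsNull());
    }
}
```

Nothing in the declarations of `Map.get` or `List.add` tells the caller that a null may appear; it is a property of the implementation, which is the gap the thread is arguing about.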
>> We have four choices here:
>>
>> - Don't allow erased generics to be instantiated with values at all. This sucks
>>   so badly we won't even discuss it.
>> - Require generics to certify their value-readiness, which means that their type
>>   parameters are non nullable. This risks degenerating into the first, and will
>>   be a significant impediment to the use and adoption of values.
> but nothing breaks in this mode.

Nothing breaks, but values can't be used with most generics for the next ten years. That leads to "value types are useless", which we don't want.

>
>> - Let users instantiate erased generics with values, and let them blow up when
>>   the inevitable null comes along. That's what you're proposing.
> until people annotate the methods/classes that don't support value types.

Sorry, but I can't take this option seriously. Imagine all the generic code in all the world. Now imagine all the classes that might, in some case, cause a null to show up in a T. (This includes every generic interface, as there is always a potential subtype that returns null from a T-bearing method.) Call this latter category B (for Bad). Now: what percentage of B do you think will be properly annotated after a month? A year? Ten years? 99.9999%? No. 99%? No. Probably closer to 10%. Which reverts to: "try it and see if it blows up." This is not the safe programming experience we want people to have.

>
>> - Bring nullity into the type system, so that we can accurately enforce the
>>   implicit constraint of today's erased generics. That's what I'm proposing.
> which doesn't scale because you're asking all your users to think as API developers, not something i want to do when i just want to use an API.

So far, this is the only acceptable approach we've found. If you can find a better one, we're all ears!

> I believe that the issue is that V? should work as a box and currently
> V? is too powerful/useful as a box so people will start to use it as a
> true type.
This is a valid discussion (start a new thread). Basically: we've gotten rid of boxes because they are semantically sketchy (accidental identity) and harder to optimize, but you're saying: people understand boxes, they don't understand this T or null thing? Fair point: let's discuss this (on a new thread.)

> - V? should not be called V?, it's too short; you should have some ceremony that shows you that V is the real deal. V.box was better.

If it really were a box, then V.box was better. But it is not like the boxes that people understand, so V.box is an utterly terrible name for the semantics we have (and people's understanding of boxes.) But, we are open to another way to write it, if you dislike V? so much.

From daniel.smith at oracle.com Tue Apr 30 21:22:14 2019
From: daniel.smith at oracle.com (Dan Smith)
Date: Tue, 30 Apr 2019 15:22:14 -0600
Subject: Nullable types and inference
In-Reply-To:
References:
Message-ID:

> On Apr 26, 2019, at 7:44 AM, Brian Goetz wrote:
>
> There is already a _null type_, and we've defined it so that for every reference type R, Null <: R (JLS 4.10.). In order to make inference work for nullable values, we need to state that for a zero-default value type V:
>
> Null Null <: V?
>
> and that LUB(V, Null) = V?
>
> When we gather constraints in 18.1, in addition to adding the upper bound on alpha, we also add in lower bounds Null <: alpha_i for erased type vars.
>
> We adjust 18.4 to not consider Null to be a proper lower bound for purposes of resolution.

I agree with the spirit of this, but I've actually been meaning for a while to push on getting rid of the null type, which is barely a type (like needing special treatment in 18.4). The goal would be to treat 'null' as a poly expression instead.

That aside, yes, we'll need a way to identify which inference vars are constrained to be nullable (and maybe which are constrained to be non-nullable?).
If it's not with a type, we can do it with a special-purpose bound and corresponding resolution rules.
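
For comparison, here is a minimal sketch of how today's Java already lets a `null` take part in inference without deciding the inferred type. It is an analogy only: `Integer` stands in for a nullable `V?` and `int` for `V`, and the class and method names are hypothetical.

```java
// Illustrative only: class and method names are hypothetical.
class NullAndInference {

    // A generic method with a single erased type variable.
    static <T> T second(T first, T second) {
        return second;
    }

    // T is inferred as Integer: the int literal is boxed, and null is
    // compatible with any reference type, so the call type-checks.
    static Integer inferWithNull() {
        return second(1, null);
    }

    // A conditional mixing an int literal and null evaluates to a boxed
    // Integer or to null; no unboxing happens here.
    static Integer conditionalWithNull(boolean flag) {
        return flag ? 1 : null;
    }

    public static void main(String[] args) {
        System.out.println(inferWithNull());
        System.out.println(conditionalWithNull(true));
        System.out.println(conditionalWithNull(false));
    }
}
```

In both methods the presence of `null` forces the result into a type that admits null, which is roughly the role that `LUB(V, Null) = V?` or a special-purpose nullability bound would play for value types.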