value type hygiene
forax at univ-mlv.fr
Thu May 10 21:34:40 UTC 2018
> From: "daniel smith" <daniel.smith at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "John Rose" <john.r.rose at oracle.com>, "valhalla-spec-experts"
> <valhalla-spec-experts at openjdk.java.net>
> Sent: Thursday, May 10, 2018 18:54:48
> Subject: Re: value type hygiene
>> On May 10, 2018, at 3:11 AM, Remi Forax <forax at univ-mlv.fr> wrote:
>> Q-types (if the root is j.l.Object + interfaces) and a ValueTypes attribute
>> are two different encodings of the same semantics: either the descriptor is
>> a Q-type, or the descriptor is an L-type and you have a side table that says
>> it's a Q-type.
> Yes, with some huge caveats attached to the attribute strategy:
> - You have to pick one mode for all types of a given class in your class file
Attaching this attribute to a method makes little sense, since you compile/recompile a Java class as a whole.
BTW, attaching the attribute to the nest host is not a good idea, even though the nest host is the closest thing to a compilation unit, because the verifier will use this attribute and so it will trigger class loading of the nest host very early.
> - The semantics are indirect; people will get used to reading them as a property
> of the class name, when in reality they're a property of a side attribute
> ("Debugging: I know Foo is a value class, so why is this null slipping
> through?...")
Here, we are talking about people who read bytecode; javap can be patched to show in the signature which types are considered value types.
> - Descriptor equality is redefined so that non-equal descriptors match (that is,
> where one descriptor uses a Q type and one uses an L type); adaptations are
> necessary to make mismatched descriptors cooperate
Yes, see John's mail about trying to do the adaptation while keeping one vtable slot.
> - We'll probably try very hard to present users with the fiction that there is
> only one type (e.g., in reflection)
Yes, you do not want to show them the bridges or the ValueTypes attribute unless they want to know.
>> The main difference between the two encodings is that you have to generate
>> bridges in the case of Q-types.
>> Generating bridges in general is far from obvious (that's why invokedynamic
>> does the adaptation at the call site, btw): you need a subtype relation, like
>> String <: T for generics; if you do not have a subtype relationship, you
>> cannot generate bridges.
>> For value types, QFoo <: LFoo is not what we need. For example, we want the
>> following to work.
>> Let's say I have:
>> class A {
>> void m(LFoo)
>> }
>> class B extends A {
>> void m(LFoo)
>> }
>> Foo is now declared as a value type, and now I recompile B:
>> class B extends A {
>> void m(QFoo)
>> }
>> If I call A::m, I want B::m to be valid at runtime, so QFoo also has to be a
>> supertype of LFoo.
>> So the relation between QFoo and LFoo is more like auto-boxing: you have
>> QFoo <: LFoo, but you also have LFoo <: QFoo because of the separate
>> compilation issue, and if you do not have a subtyping relationship between
>> the types, you cannot generate bridges.
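For comparison, here is a minimal sketch of the generics case mentioned above, where a bridge is possible because String <: T after erasure. The names BridgeDemo, Sink and StringSink are hypothetical and only serve the illustration; the only reflection call used is Method.isBridge().

```java
import java.lang.reflect.Method;

public class BridgeDemo {
    // Hypothetical generic interface: after erasure its method is m(Object).
    interface Sink<T> {
        void m(T value);
    }

    // T is specialized to String, so javac emits a synthetic bridge
    // m(Object) that casts its argument and calls m(String).
    static class StringSink implements Sink<String> {
        @Override
        public void m(String value) {
            // nothing to do, only the method shape matters here
        }
    }

    // Counts the compiler-generated bridge methods declared by a class.
    static long bridgeCount(Class<?> c) {
        long n = 0;
        for (Method m : c.getDeclaredMethods()) {
            if (m.isBridge()) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        for (Method m : StringSink.class.getDeclaredMethods()) {
            System.out.println(m + " bridge=" + m.isBridge());
        }
    }
}
```

The bridge only works because the cast from Object to String goes down an existing subtype relation; that is the relation that is missing in both directions between QFoo and LFoo.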
> Tentatively, the bridge generation strategy I envision looks like this:
> - When I convert a class to a value class, I annotate it ("@WasAReferenceClass")
> - When a descriptor mentions a Q type, the compiler also generates an L bridge
> There are problems with this: for example, when mentioning n distinct Q types,
> you need 2^n bridges. And maybe there are things the JVM can do to help—we've
> explored lots of general-purpose "this class has moved" features. My preference
> is to tackle those problems as needed, on their own terms.
> But, yes, I'll grant that probably having the JVM totally ignore the problem
> ultimately won't work.
People will publish articles showing that a few lines of Java can generate a very big vtable at runtime, just as there were several articles on exponential verification time before the split verifier.
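To make the growth concrete, here is a small sketch that enumerates the descriptor variants for a method mentioning n Q types. It is only a model: the class BridgeCount, the method variants and the "QFoo;" spelling of Q descriptors are assumptions for the illustration, not anything a compiler or the VM does today.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BridgeCount {
    // Returns every descriptor obtained by replacing any subset of the
    // Q types with the matching L type (the original descriptor included).
    static List<String> variants(List<String> paramTypes) {
        List<String> prefixes = new ArrayList<>();
        prefixes.add("");
        for (String t : paramTypes) {
            List<String> next = new ArrayList<>();
            for (String prefix : prefixes) {
                next.add(prefix + t);
                if (t.startsWith("Q")) {
                    // Each Q type doubles the number of variants.
                    next.add(prefix + "L" + t.substring(1));
                }
            }
            prefixes = next;
        }
        List<String> descriptors = new ArrayList<>();
        for (String params : prefixes) {
            descriptors.add("(" + params + ")V");
        }
        return descriptors;
    }

    public static void main(String[] args) {
        // Two Q types and one primitive: 2^2 = 4 descriptors in total.
        List<String> v = variants(Arrays.asList("QFoo;", "QBar;", "I"));
        for (String d : v) {
            System.out.println(d);
        }
    }
}
```

With n Q types in the descriptor, the list has 2^n entries: the method as written plus 2^n - 1 bridges, each of which needs its own vtable slot unless the VM helps.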
>>> - The JVM "knows" internally about the two kinds of types, but we won't give
>>> users the ability to directly express them, or inspect them with reflection.
>>> That mismatch seems bound to bite us repeatedly.
>> Whether Java the language surfaces that a type is a value type or not is a
>> language issue, and that's true for both encodings.
>> For reflection, at runtime, you know whether a class is a value type or not;
>> the same is true for both encodings.
>> If you mean that, at runtime, you cannot see whether a method was compiled
>> with the knowledge that a type is a value type or not, again,
>> it depends on whether you surface Q-types or the ValueTypes attribute at
>> runtime, so this choice is independent of the encoding.
> The reflection question boils down to: are there two java.lang.Class objects per
> value class, or one? My read of the goals here is that we'd very much like for
> there to be only one, for the same reason that we'd like to not change the
> spelling of descriptors. In that world, I think it will be hard to reason about
> where null checks happen. (Sure, maybe you can figure it out by consulting the
> ValueTypes attributes, but that's a huge pain.)
The VM has to consult it to generate an NPE, so the VM can emit an error message that says which value type is expected to be non-null.
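As a rough model of that check, written in plain Java rather than inside the VM: the class ValueTypesCheck, the method checkArgument and the message wording are all hypothetical, and the Set simply stands in for the ValueTypes attribute of a class file.

```java
import java.util.Collections;
import java.util.Set;

public class ValueTypesCheck {
    // Stands in for the ValueTypes attribute: the class names that the
    // compiler knew to be value types when this class was compiled.
    private final Set<String> valueTypes;

    ValueTypesCheck(Set<String> valueTypes) {
        this.valueTypes = valueTypes;
    }

    // Rejects null only for types listed in the side table, and names
    // the offending type in the exception message.
    Object checkArgument(String className, Object arg) {
        if (arg == null && valueTypes.contains(className)) {
            throw new NullPointerException(
                "null passed where value type " + className + " is expected");
        }
        return arg;
    }

    public static void main(String[] args) {
        ValueTypesCheck check = new ValueTypesCheck(Collections.singleton("Foo"));
        // Bar is not in the side table, so null is still allowed.
        System.out.println(check.checkArgument("Bar", null));
        try {
            check.checkArgument("Foo", null);
        } catch (NullPointerException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only that the side table has to be consulted at the moment of the check anyway, so naming the value type in the message comes for free.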
>>> - We talk a lot about nullability being a migration problem, but it is sometimes
>>> just a really nice feature! All things being equal, not being able to freely
>>> talk about nullable value types is limiting.
>> Again, it's a language thing; it's the same issue for both encodings.
> I don't buy this. If the JVM doesn't give me (a compiler writer) the direct
> ability to talk about nullable value types, I can maybe work around that. But
> there will be seams. It will be confusing. Debugging will be messy.
Separate compilation issues are always messy, but if the NPE has a specific error message, we are one Stack Overflow entry away from people being able to debug their problem.
>> So the question is more: should we allow a reference type to be retrofitted
>> as a value type seamlessly?
>> If the answer is yes, then QFoo <: LFoo is not enough, so we cannot use
>> Q-types, but we can use a side table.
>> If the answer is no, then QFoo <: LFoo is OK: we permit retrofitting an
>> L-type to a Q-type, but user code has to wait until all its dependencies
>> have been updated to use the Q-type before being able to use it.
> For the "answer is no" case: in the scenario where I've started using QFoo, but
> a library still uses LFoo, what can I do?
> - Subtyping still works, so I can pass QFoos in. Great!
> - When I get LFoos out, I will want to null check them and convert to QFoo.
> Fine.
> - A QFoo[] that I pass in will reject nulls. Which is to be expected. If the
> semantics demand nullability, I should use an LFoo[] instead.
> - Similar with objects I pass in that have LFoo->QFoo bridge methods: there will
> be null checks, if that's a problem, the objects shouldn't operate on QFoos.
> - No new identities or boxes get created. It's the same values passing between
> the two APIs.
> - The library doesn't get the flattening benefits. It needs to make a choice to
> opt in to them first.
> This seems like a fine picture. Ideally, it envisions a language that gives some
> fine-grained control over whether "Foo" means QFoo or LFoo. Maybe we'll provide
> that ability in Java—I don't know. It's nice if the JVM gives languages the
> ability to make that choice.
You can do everything you list, apart from the first item, if the answer is no; Q-types and the ValueTypes attribute are two encodings of the same problem. The idea of the ValueTypes attribute is to encode what the compiler knew about value types at the time the code was compiled, so that the VM can do the adaptations and checks needed to let users upgrade a reference type to a value type.
Rémi