Reified generics - shadow class edition
Remi Forax
forax at univ-mlv.fr
Tue Sep 18 10:37:45 UTC 2018
Erratum: the method Classy.fieldType() should return a Class, not a field descriptor.
interface Classy {
  Class<?> superclass();
  Class<?>[] interfaces();
  Class<?> fieldType(String field, String descriptor);
  MethodHandle method(String name, String descriptor);
}
Rémi
----- Original Message -----
> From: "Remi Forax" <forax at univ-mlv.fr>
> To: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Tuesday, 18 September 2018 12:32:16
> Subject: Reified generics - shadow class edition
> Reified generics - shadow class edition.
>
> I believe that trying to make method descriptors variant is a bad idea. It
> comes from the Model 1...3 experiments, but it is an artifact of those
> implementations, not a concept.
> Here I describe a way to keep generics erased even if they are reified.
>
> If the descriptor is erased, we need a way to get the reified type argument at
> runtime so that 'checkcast' can be used to verify that the parameterized
> arguments have the right type.
> For example,
>
> class Holder<any E> {
>   E element;
>
>   <when E != void> // can be specialized to throw a NoSuchMethodError if E is void
>   E get() {
>     return element;
>   }
>
>   <when E != void>
>   void set(E element) {
>     this.element = element;
>   }
> }
> will be translated into
> class Holder<any E> {
>   Object element;
>
>   Object get() {
>     return element;
>   }
>   void set(Object element) {
>     element checkcast Es
>     // verify that element is the type argument of E here
>     this.element = element;
>   }
> }
>
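A minimal Java sketch of the erased translation above, assuming the type argument is available at runtime as a plain Class object. ErasedHolder and its typeArgument field are hypothetical names used only for illustration; in the proposal this value would come from the class data of the shadow class, and Class.cast merely stands in for the checkcast.

```java
// Hypothetical model: the field and the method signatures stay erased to Object,
// while the reified type argument is carried around as a runtime value.
final class ErasedHolder {
    private final Class<?> typeArgument; // stands in for the class data "E" (assumption)
    private Object element;              // erased field

    ErasedHolder(Class<?> typeArgument) {
        this.typeArgument = typeArgument;
    }

    Object get() {
        return element;
    }

    void set(Object element) {
        // Plays the role of the checkcast against the type argument:
        // throws ClassCastException if element is not an instance of E.
        this.element = typeArgument.cast(element);
    }
}
```

The descriptors of get and set never mention E, so existing callers compiled against the erased signatures keep working; only the runtime check depends on the type argument.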
> Now, to bridge the gap, we also need:
> - a way to explain to the VM at runtime that the field 'element' is specialized
>   (if it's a value type)
> - a way to explain to the VM at runtime that the methods get and set have
>   different implementations
>
> For that, I proposed a new mechanism in the VM called master class/shadow class,
> which is a way to define a specialized class, the shadow class, from a template
> class, the master class. In my example, Holder is the master class and
> Holder<Complex>, with Complex a value type, is a shadow class 'derived' at
> runtime from the master class.
>
> This mechanism is more general than just supporting type specialization in the
> VM because
> - we do not want to inject the Java generics semantics into the VM, or the Scala
>   semantics, or the Kotlin semantics, etc.
> - we can support more use cases, so other languages can, for example, associate
>   a constant such as an int with a class, as in C++.
>
> So the idea is to introduce two things that work together:
> 1) implement in the VM a mechanism that allows constant objects to be added as
>    supplementary values (class data) when defining a class
> 2) use a bootstrap method (to "go meta", as John said) to allow the fields and
>    methods of such a class to be specialized
>
> Those two features may be cleanly separated in the future, but I'm not sure how
> to do that, so for now let's say they are two parts of the same feature, the
> master class/shadow class feature.
>
> For (1), we need a class file attribute that describes the class of each class
> data. We don't need to name them; they can be positional (for Java generics we
> may introduce another attribute, or reuse an existing one, to find the names of
> the class data if they are type parameters).
> For (2), we need to specify a bootstrap method that will be called to describe
> how the specialization should be done.
>
> Considering (1) and (2) as a single feature means that the same class
> attribute can define both the class data and the bootstrap method.
> The MasterClass attribute:
>
> MasterClass_attribute {
>   u2 attribute_name_index;
>   u4 attribute_length;
>   u2 number_of_class_data;
>   {
>     u2 descriptor;
>     u2 default_value;
>   } class_data[number_of_class_data];
>   u2 bootstrap_method_attr_index;
>   u2 name_and_type_index;
> }
>
> The class data descriptor is a field descriptor that describes the class of the
> class_data. It should be one of int, long, float, double, String, MethodType
> or MethodHandle, i.e. the types of the constants that can appear in the
> constant pool.
> The default value is a constant pool item that defines the value that will be
> used if the shadow class is created with no class_data.
> The bootstrap method is called to derive a shadow class from a master class if
> the shadow class has not been created yet. The bootstrap method takes as
> parameters a Lookup configured on the master class, a name, a Class (the type
> of the name_and_type) and an array of Object containing the class data (plus
> any additional bootstrap arguments), and returns a reference to a
> java.lang.invoke.Classy.
> The type of the name_and_type has to be a subtype of java.lang.invoke.Classy.
>
> The interface java.lang.invoke.Classy describes how to specialize a shadow class
> from a master class.
> interface Classy {
>   Class<?> superclass();
>   Class<?>[] interfaces();
>   String fieldDescriptor(String field, String descriptor);
>   MethodHandle method(String name, String descriptor);
> }
>
> superclass() returns the superclass of the shadow class. It has to be a
> specialization of the master class's superclass (a subtype of the master
> class's superclass) or the master class's superclass itself.
> interfaces() returns the interfaces of the shadow class. Each interface has to
> be a subtype of the corresponding interface of the master class or the
> corresponding interface itself.
> fieldDescriptor() is called for each field of the master class, with the field
> name and the field descriptor from the master class. It returns the field
> descriptor of the corresponding field of the shadow class, whose type must be a
> subtype of the type of the master class field. If null is returned, the field
> doesn't exist and a NoSuchFieldError will be thrown upon access.
> method() is called for each method of the master class, with the method name and
> its method descriptor. It returns a method handle corresponding to the
> specialization of the master class method in the shadow class. The method
> handle type has to be exactly the same as the descriptor passed as parameter. If
> null is returned, the method doesn't exist and a NoSuchMethodError will be
> thrown upon access.
>
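To make the contract concrete, here is a rough Java sketch of what a Classy for Holder could look like. Everything in it is an assumption for illustration: Classy is declared locally (it does not exist in java.lang.invoke) using the fieldType() signature from the erratum at the top of this message, Holder is the erased master class from the earlier example, and HolderClassy is a made-up name. The text does not say whether the receiver is part of the method handle type; this sketch simply keeps the receiver that findVirtual prepends.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Local copy of the proposed interface (hypothetical, not a JDK type).
interface Classy {
    Class<?> superclass();
    Class<?>[] interfaces();
    Class<?> fieldType(String field, String descriptor);
    MethodHandle method(String name, String descriptor);
}

// Erased master class, as in the Holder example.
class Holder {
    Object element;
    Object get() { return element; }
    void set(Object element) { this.element = element; }
}

// Hypothetical Classy for Holder<E> whose class data is a single Class.
final class HolderClassy implements Classy {
    private final Class<?> typeArgument;

    HolderClassy(Class<?> typeArgument) { this.typeArgument = typeArgument; }

    public Class<?> superclass() { return Object.class; }

    public Class<?>[] interfaces() { return new Class<?>[0]; }

    public Class<?> fieldType(String field, String descriptor) {
        // 'element' is narrowed to the type argument, a subtype of Object.
        return field.equals("element") ? typeArgument : null;
    }

    public MethodHandle method(String name, String descriptor) {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        try {
            MethodType type = MethodType.fromMethodDescriptorString(
                    descriptor, Holder.class.getClassLoader());
            MethodHandle master = lookup.findVirtual(Holder.class, name, type);
            if (!name.equals("set")) {
                return master; // reuse the master implementation unchanged
            }
            // Specialize set(): check the argument against the type argument first.
            MethodHandle cast = lookup
                    .findVirtual(Class.class, "cast",
                            MethodType.methodType(Object.class, Object.class))
                    .bindTo(typeArgument);
            return MethodHandles.filterArguments(master, 1, cast);
        } catch (ReflectiveOperationException e) {
            return null; // the method does not exist in the shadow class
        }
    }
}
```

The specialized set() keeps the erased parameter type Object, so the handle type still matches the master descriptor while the check against the type argument happens inside the handle.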
> The idea here is that a shadow class is a covariant variant of the master class:
> a field can be replaced by a subtype, and a method can be replaced by a
> specialized variant with the same parameter types. This allows any shadow class
> to be accessed using any opcode that takes the master class as owner: getfield,
> putfield, and all invoke* opcodes. For getfield, a value type can be buffered by
> the VM to Object or an interface. For putfield, the VM has to perform an extra
> check at runtime (like the extra check on array stores, because arrays are
> covariant).
>
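The array analogy can be observed in today's Java. The snippet below only demonstrates the existing covariant array store check; it is not part of the proposal, and the class name is illustrative.

```java
final class CovariantStoreCheck {
    // Returns true if the VM rejects storing 'value' into 'array' at runtime.
    static boolean storeIsRejected(Object[] array, Object value) {
        try {
            // The static type is Object[], but the store is checked against
            // the runtime component type of the array.
            array[0] = value;
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }
}
```

Passing a String[] as the Object[] parameter compiles because arrays are covariant, and storing an Integer into it fails at runtime. A putfield on a specialized field of a shadow class would need the same kind of check.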
> The interface Classy can be used by the VM at any point in time, so calls to
> method() can be lazy or not (the other information is needed to determine the
> layout of a class, so those methods cannot be called lazily).
>
> At runtime, for the VM, a shadow class is a subtype of its master class.
>
> The fact that the shadow class is a subtype of the master class allows
> wildcards in Java to be desugared to the master class.
> A shadow class has no special encoding in the bytecode, it only has a
> representation in the runtime data structure of the VM.
>
> In order to be backward compatible, java.lang.Class is extended to also
> represent shadow classes, with the following methods:
> - Class<?> withClassData(Object... data), which returns the shadow class of a
>   master class.
> - Object[] getClassData(), which returns the class data of a shadow class, or
>   null.
> - boolean isMasterClass(), which returns whether the current class is a master
>   class.
> - Class<?> getMasterclass(), which returns the master class of a shadow class,
>   or the current class otherwise (a classical class is its own master class).
>
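The four methods listed above can be modelled in plain Java to see how they relate to each other. This is only a toy model under stated assumptions: ClassModel is a hypothetical stand-in for java.lang.Class, shadow classes are cached per master class and keyed by their class data, and a class with no master is reported as a master class.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for java.lang.Class extended with shadow classes.
final class ClassModel {
    private final String name;
    private final ClassModel master; // null for a master (or classical) class
    private final List<Object> data; // null when there is no class data
    private final Map<List<Object>, ClassModel> shadows = new ConcurrentHashMap<>();

    ClassModel(String name) {
        this(name, null, null);
    }

    private ClassModel(String name, ClassModel master, List<Object> data) {
        this.name = name;
        this.master = master;
        this.data = data;
    }

    // The same class data always yields the same shadow class, created on first use.
    ClassModel withClassData(Object... classData) {
        ClassModel m = getMasterclass();
        List<Object> key = Collections.unmodifiableList(Arrays.asList(classData.clone()));
        return m.shadows.computeIfAbsent(key, k -> new ClassModel(m.name + k, m, k));
    }

    Object[] getClassData() {
        return data == null ? null : data.toArray();
    }

    boolean isMasterClass() {
        return master == null;
    }

    // A class that is not a shadow class is its own master class.
    ClassModel getMasterclass() {
        return master == null ? this : master;
    }

    @Override
    public String toString() {
        return name;
    }
}
```

In this model, code that compares classes the way o.getClass() == ArrayList.class does would compare getMasterclass() results, so two shadow classes of the same master still compare equal at that level.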
> Reusing java.lang.Class to represent shadow classes at runtime is important
> because it allows reflection and java.lang.invoke to work seamlessly with
> shadow classes: from a user's point of view, a classical class and a shadow
> class are both a java.lang.Class.
>
> There is a compatibility issue with Object.getClass(), isInstance, instanceof
> and checkcast: they cannot return or use the shadow class, because code like
> o.getClass() == ArrayList.class or o instanceof ArrayList will not work if the
> comparison uses the shadow class. This means that getClass(), instanceof and
> checkcast need to check the master class of the shadow class instead of using
> the shadow class directly.
> Note that this problem is not inherent to the shadow class; it's an artifact of
> the fact that the type argument is reified.
>
> This means that we have to introduce at least one supplementary method for
> getClass(); a static method in Class, Class.getTheTrueRealClass(Object o), is
> enough. It also means that if we want to allow reified cast/instanceof in
> Java/.class notation, this will have to be implemented using
> invokedynamic/condy (again, to avoid bolting the Java generics semantics into
> the VM). We may also choose not to support reified cast/instanceof in Java,
> given that being able to specialize fields/methods is more important in terms
> of performance and that we will not support reified generics of objects
> anyway.
>
>
> The fact that a shadow class has no representation in the classfile means that
> we are losing information, because if ArrayList is anyfied,
>   ArrayList<String> list = ...
>   list.get(3)
> list.get() is encoded in the bytecode as a call to the master class ArrayList
> and not as a call to the shadow class ArrayList<String>, so a call to an anyfied
> generic is still erased. But given that this information is available at runtime
> (the inline cache stores the shadow class), a JIT can easily inline the call.
>
> With the classfile containing only classical descriptors, in terms of opcodes
> we only need to add support for a few operations:
> - new on an anyfied class
> - new on an anyfied array
> - invocation of an anyfied method.
> For all these operations, the idea is to send the class data (method data) by
> storing them on the stack and to have a bytecode that describes them as class
> data/method data.
> We also need a way to get the method data onto the stack inside the method.
>
> I propose to introduce two new opcodes, dataload and datastore:
> - dataload is constructed with a concatenation of field descriptors as parameter
>   (or a method descriptor with no parens and no return type); it takes all the
>   values on the stack and stores them in a side channel.
> - datastore also takes a concatenation of field descriptors as parameter and
>   extracts the data from the side channel onto the stack.
>
> dataload is used as a prefix of new and anewarray to pass the class data that
> will be used to build the shadow class (if not already created).
> dataload is used as a prefix of all invoke* bytecodes to pass the method data.
>
> We also need a special reflection method in Thread, getMethodData(), that
> returns the method data associated with the current method as an array, or null
> if no method data was passed when the method was called.
>
> Note that when invokedynamic is prefixed by a dataload, the bootstrap method has
> no access to the data; only the target of the call site will see the method
> data.
>
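One way to picture the side channel is a per-thread slot that the caller fills just before the call and the callee drains on entry. The Java model below is purely illustrative and makes several assumptions that the text does not state: the slot is a ThreadLocal, the data is consumed when it is read, and MethodDataChannel, describe and the string "int" used as method data are all invented for the example.

```java
// Hypothetical model of the dataload/datastore side channel.
final class MethodDataChannel {
    private static final ThreadLocal<Object[]> CHANNEL = new ThreadLocal<>();

    // Models dataload: the caller moves the values into the side channel.
    static void dataload(Object... values) {
        CHANNEL.set(values.clone());
    }

    // Models datastore: the callee extracts the data from the side channel.
    // Returns null if the caller passed no method data (assumed behaviour).
    static Object[] datastore() {
        Object[] data = CHANNEL.get();
        CHANNEL.remove();
        return data;
    }

    // Stands in for an anyfied method: it keeps its erased signature and
    // reads its type argument from the side channel, if any.
    static String describe(Object value) {
        Object[] methodData = datastore();
        String typeArgument = (methodData == null || methodData.length == 0)
                ? "erased"
                : String.valueOf(methodData[0]);
        return typeArgument + ":" + value;
    }
}
```

A call prefixed by dataload sees the method data, while the same call made by old code with no prefix falls back to the erased behaviour, which is what lets old and new callers share one descriptor.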
>
> To summarize, I propose to implement reified generics in the VM by introducing
> the notion of a shadow class, a class only available at runtime that has
> associated class data and a user-defined way to specialize fields and methods
> at runtime. The main advantage of the solution is that old classes will not
> only be able to use anyfied generics, but old code will also be optimized by
> JITs as if it were new code.
>
>
> regards,
> Rémi