Record classfile and runtime API

Sun Sep 1 15:27:13 UTC 2019

Hi all,
i think the current generated classfile bytecodes can be improved.

For a record Point(int x, int y) { }

The classfile contains:
  - a specific class attribute Record
    Record:
     int x;
       descriptor: I
     int y;
       descriptor: I
    which allows compiler separate compilation.
  - two bootstrap methods
    - one for the invodynamic calls used inside toString/equals/hashCode
      this bootstrap method takes 4 arguments
        #8 fr/umlv/record/Point
        #51 x;y
        #53 REF_getField fr/umlv/record/Point.x:I
        #54 REF_getField fr/umlv/record/Point.y:I
    - one for the constant dynamic used inside the pattern extractor/handler
      this bootstrap method takes 3 arguments
        #8 fr/umlv/record/Point
        #53 REF_getField fr/umlv/record/Point.x:I
        #54 REF_getField fr/umlv/record/Point.y:I

so we have 3 different ways to represent exactly the same thing, a record Point is composed of two component int x and int y, i believe we can do better.

The arguments of the bootstrap methods are redundant, you don't need to pass the record class because the Lookup object contains the record class (you have the API in Lookup to create an intermediary lookup from a Lookup and a class so no need to duplicate that information). For the first boostrap methods, the parameter containing "x;y" is due to the fact that a constant method handle is opaque by default so you can not get the corresponding field name. First, if you have the lookup that have created that object, you can get back the name with a "reveal". But i think here, the problem is that the arguments are the constant method handles and not the constant decriptor (ConstantDec).

The java.lang.invoke API has currently the limitation that you can not use the ConstantDesc API to type values of the constant pool.

How can we do better ?
We only need the attribute Record, all other informations can be retrieved at runtime from the attribute Record.

Now we have to talk about the reflection API, currently the method that reflects the attribute Record is Class.getRecordAccessors() that returns an array of Method corresponding to the accessors and there are many things wrong with this method.
First, while returning the accessors may be a good idea, it should not be the sole API, we should have an API that reflects the data of the attribute Record itself. Then it returns Method so a live object that may have triggered the classloading of the types of each record component that may notexist at runtime, we have introduce the constable API, we should using it here. And the implementation has several issues, first it can throws a NoSuchMethodException which is a checked exception, then it uses getDeclaredMethod which returns the methods with the most precise return type which is not what you want here.
Here is an example that show why it's buggy, le suppose i have an interface and a record
  interface I {
    default Object x() { return 0; }
  }
  record R(Object x) implements I { }
now let say i change I without recompiling R
  interface I {
    default String x() { return 0; }
  }
R.class.getRecordAccessors() will return the method x() that returns a String instead of the one that returns an Object.

So first, let's have a method getRecordComponentFields() in java.lang.Class that returns an array of DirectMethodHandleDesc corresponding to method ref to the fields of a Record. It uses a constant desc, so no weird classloading will appear here. And it's better than an array of MethodHandle because a DirectMethodHandleDesc is not an opaque object.

With that, we can unify the bootstrap methods to have only one bootstrap method with zero argument, the one that return a pattern/extractor can be implemented into another class but can be queried from the same bsm that the one that is needed for toString/equals/hashCode.

BTW, why do we need an extractor for a record ? we should not need one because we can inside the algorithm that create the pattern tree have a if class.isRecord() ? (the same way the serailization is specilized for enums by example).

On problem with this approach is that while it reduces the bytecode size, it will also resolve the constant method handles to the accessors several times but i think it's better to have this kind of cache inside java.lang.Class than to introduce a constant dynamic that will return a constant list of constant method handles instead. Introducing something in the constant pool for caching reason seems not a good idea if in the future we want to change the way it works.

Rémi