POC: JDK ClassModel -> ASM ClassReader

Fri Jul 1 13:51:57 UTC 2022

Hello,

Thanks for the initiative, I am looking forward to having this as a public
API at some point! I maintain Byte Buddy, a code generation library that
relies on ASM for processing byte code. To also support the JDK API, I do
want to make the currently used ClassReader/ClassWriter pluggable to use
the JDK API once it is available. This way, Byte Buddy can offer some form
of forward compatibility, for example for Java agents, in the future by
using ASM for older JVMs but OpenJDK-based readers/writers for JVMs that do
include these new APIs.

For a POC, I have now implemented a ClassReader that uses a ClassModel to
delegate to ASM’s API. I could validate that ASM and the JDK API generate
the same output for all constellations I could think of:
https://github.com/raphw/asm-jdk-bridge

This works really well, congratulations on such a well-working first
attempt. Some details stuck out, however:

1. It does not seem like it is possible to model “CROP” frames. The frames
that are read include zero additional local variables, but they do not
indicate how many local variables are cropped. I would suggest adding
subinterfaces for all frame types to allow for the same pattern matching
style that works well with the rest of the API. This way, the
declaredLocals and declaredStack methods would only be available if
relevant and the crop frame type could add a croppedLocals() : int method.
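
A minimal sketch of what such subinterfaces could look like. All type names
here (Frame, SameFrame, AppendFrame, CropFrame, FullFrame, FrameDescriber)
are hypothetical and not part of the JDK API; local and stack entries are
reduced to strings to keep the example self-contained.

```java
import java.util.List;

// Hypothetical frame hierarchy; none of these types exist in the JDK API.
sealed interface Frame permits SameFrame, AppendFrame, CropFrame, FullFrame {}

// A "same" frame declares neither locals nor stack entries.
record SameFrame() implements Frame {}

// An "append" frame only declares the additional locals.
record AppendFrame(List<String> declaredLocals) implements Frame {}

// A "crop" frame only reports how many locals are removed.
record CropFrame(int croppedLocals) implements Frame {}

// A "full" frame declares all locals and the complete stack.
record FullFrame(List<String> declaredLocals, List<String> declaredStack) implements Frame {}

final class FrameDescriber {

    // Each branch only sees the accessors that are relevant for its frame type.
    static String describe(Frame frame) {
        if (frame instanceof CropFrame crop) {
            return "crop " + crop.croppedLocals() + " locals";
        } else if (frame instanceof AppendFrame append) {
            return "append " + append.declaredLocals().size() + " locals";
        } else if (frame instanceof FullFrame full) {
            return "full: " + full.declaredLocals().size() + " locals, "
                    + full.declaredStack().size() + " stack";
        }
        return "same";
    }
}
```

With a shape like this, a reader could call
FrameDescriber.describe(new CropFrame(2)) and get the cropped count without
having to infer it from the previous frame.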

2. StackMapFrame uses offsetDelta and absoluteOffset to indicate the
frame's location. I found that a bit awkward as I need to keep track of the
offset just to add frames at the right location. With all other types,
Labels are used to indicate the code location. Why are labels not used for
frames to keep things consistent? Also, I didn't really understand the
purpose of “initialFrame”. Is it a mere convenience?
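
To illustrate the bookkeeping that point 2 refers to, here is a rough sketch
of offset tracking. Instr, its sizeInBytes component, and FrameInterleaver
are hypothetical stand-ins and not JDK or ASM types; frames are represented
by plain strings keyed by their absolute bytecode offset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a code element with a known encoded size.
record Instr(String name, int sizeInBytes) {}

final class FrameInterleaver {

    // Walks the instructions while tracking the current bytecode offset and
    // emits a frame right before the instruction located at its offset.
    static List<String> interleave(List<Instr> code, Map<Integer, String> framesByOffset) {
        List<String> out = new ArrayList<>();
        int offset = 0;
        for (Instr instr : code) {
            String frame = framesByOffset.get(offset);
            if (frame != null) {
                out.add("frame:" + frame);
            }
            out.add(instr.name());
            offset += instr.sizeInBytes();
        }
        return out;
    }
}
```

If frames were bound to labels instead, the running offset variable and the
per-instruction size lookup would not be needed on the reader's side.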

3. Debugging with toString works well for most, but not all classes; for
example, not all subclasses of CodeElement have representations. It’s
probably an oversight, but it would be neat to add this quickly to make
exploring the code easier.

4. The JDK code knows an attribute named CharacterRange. I must confess
that I had never heard of it, and I could not find documentation either. I
assume it is a JDK 20 concept? However, this made me think about how such
“unknown” attributes can be handled. I would like to find a way to treat all
attributes that I do not know or care about as an UnknownAttribute. This
way, I could simply forward them as a binary dump, as I currently do for
custom attributes, for example to forward them to ASM. However, today there
is no way to convert an attribute payload back to an array.

5. I think the “findAttribute” method will be invoked a lot. Currently, it
iterates over all attributes on each call. Ideally, this would only be done
once to create a map of attributes with O(1) lookup speed. Of course, I
could do this myself, but I think this would be such a common need that I
wanted to suggest it as the implementation for the official API, especially
since the API feels like a map lookup. This could be done simply by storing
attributes in a map after they are read for the first time, as the
attribute keys are already compared by identity.
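
A minimal sketch of the caching idea from point 5, assuming keys are compared
by identity. AttrKey, Attr and AttributeIndex are hypothetical names used
only for illustration; they do not mirror the actual JDK types.

```java
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical attribute key; instances are compared by identity only.
final class AttrKey<T> {
    final String name;

    AttrKey(String name) {
        this.name = name;
    }
}

// Hypothetical attribute that carries its key and a parsed value.
record Attr<T>(AttrKey<T> key, T value) {}

final class AttributeIndex {
    private final List<Attr<?>> attributes;
    private Map<AttrKey<?>, Attr<?>> byKey; // built on the first lookup

    AttributeIndex(List<Attr<?>> attributes) {
        this.attributes = attributes;
    }

    // The first call scans all attributes once; later calls are map lookups.
    @SuppressWarnings("unchecked")
    <T> Optional<T> find(AttrKey<T> key) {
        if (byKey == null) {
            Map<AttrKey<?>, Attr<?>> map = new IdentityHashMap<>();
            for (Attr<?> attribute : attributes) {
                map.putIfAbsent(attribute.key(), attribute); // first match wins
            }
            byKey = map;
        }
        Attr<?> hit = byKey.get(key);
        return hit == null ? Optional.empty() : Optional.of((T) hit.value());
    }
}
```

The sketch is not thread-safe and keeps only the first attribute per key;
both would need to be decided for a real implementation.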

6. I found that TypeAnnotation.CatchTarget offers an index of the exception
table entry, in addition to being visited inline. I found that model a bit
awkward as there is no indication of the index in the ExceptionCatch
instruction that comes with the same pass. Also, there is no guarantee on
when the type annotations are visited in relation to the try-catch block.
Ideally, it would be guaranteed that the annotation is visited directly
after the ExceptionCatch pseudo instruction, to allow for easy
single-pass processing.

7. There is a SAME_LOCALS_1_STACK_ITEM frame constant with an “EXTENDED”
version where this word is appended. For the SAME constant, the extended
version is called SAME_FRAME_EXTENDED. To keep it consistent, should this
constant be renamed to SAME_EXTENDED? Also, there is a
RESERVED_FOR_FUTURE_USE constant. Should that constant exist even now?

Finally, I have not yet started working on a ClassWriter equivalent. Here,
I found the style of the Consumer<ClassBuilder> to be incompatible
with the way ASM works. This is of course a decision of style, but I would
consider this a major difficulty for migrating current code at some point.
Would it be an idea to offer both styles of class creation: the internal
one, in addition to exposing an interface with the methods of
DirectClassBuilder?
With some API such as:

DirectClassBuilder builder = Classfile.build(ClassDec);
byte[] classFile = builder.toByteArray();

I could easily plug an OpenJDK-based ClassWriter into Byte Buddy’s ASM code.

Today, I can emulate this by building a new class for every instruction I
encounter but this is of course quite inefficient.
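
To make the mismatch between the two styles concrete, here is a rough sketch
with stand-in types. Builder, CallbackApi and DirectFacade are hypothetical
and only mimic the shape of a Consumer-based API; they are not the JDK types.
The facade accepts direct calls, buffers them, and replays them inside a
single callback, which is one possible alternative to rebuilding a class per
instruction, at the cost of holding all steps in memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Stand-in for a builder that is only reachable inside a callback.
interface Builder {
    void add(String element);
}

// Stand-in for a callback-style entry point; the result is a string
// instead of a byte array to keep the sketch small.
final class CallbackApi {
    static String build(Consumer<Builder> handler) {
        StringBuilder sb = new StringBuilder();
        handler.accept(element -> sb.append(element).append(';'));
        return sb.toString();
    }
}

// Direct-style facade: calls are recorded and replayed in one build pass.
final class DirectFacade {
    private final List<Consumer<Builder>> pending = new ArrayList<>();

    void add(String element) {
        pending.add(builder -> builder.add(element));
    }

    String finish() {
        return CallbackApi.build(builder -> pending.forEach(step -> step.accept(builder)));
    }
}
```

A visitor-driven writer could call add(...) as elements arrive and finish()
once at the end, but a directly exposed builder would avoid the buffering.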

Thanks and best regards, Rafael