Re: POC: JDK ClassModel -> ASM ClassReader
Adam Sotona
adam.sotona at oracle.com
Wed Jul 13 13:17:00 UTC 2022
On 13.07.2022 13:49, "Rafael Winterhalter" <rafael.wth at gmail.com> wrote:
ASM offers to override a method in ClassWriter to compute frames without loading classes. Byte Buddy does so, and it is possible to generate stack map frames this way. But it is still quite an overhead and, just as with OpenJDK right now, it does not work when types are missing, which is unfortunately rather common, for example if Spring is involved. In contrast, Byte Buddy's frame weaving is always single-pass, does not require any allocation (per frame), and works with missing types. (Currently, OpenJDK produces a verification error: https://github.com/raphw/asm-jdk-bridge/blob/writer-poc/src/test/java/codes/rafael/asmjdkbridge/sample/FrameWithMissingType.java). This is why I would prefer to plug this logic into OpenJDK's class writer, if possible. Of course, this logic is based on some assumptions, but if you are not trying to cover the general case, you can always be more efficient than a generic processor; for that reason alone it would be a helpful addition.
Could you please be more specific about “OpenJDK produces a verification error”? I don’t see any code using the Classfile API in your sample.
The Classfile API passes the stack maps of unchanged methods through transformations unchanged (in the default shared constant pool mode).
What I agree with is an API extension that allows stack maps to be passed manually through the builder as well. However, I’m not sure what else you mean by “frame weaving”.
As for the generator API: The current StackMapFrame objects contain information that is not present in the class file. For example, the attribute offers an "initialFrame", and the frames themselves contain effective stack and effective locals. For crop frames, those currently contain the types of the cropped frames. For a writer API, OpenJDK's API would hopefully only consume the data as it is written to the class file. That would be (a) the type of frame and (b) the declared values for stack and locals, or the amount of cropped frames. Of course, one could link all frames within a StackMapTableAttribute together and compute this information on the fly. For example by providing some builder:
StackMapFrameAttribute b = StackMapFrameAttribute.builder(MethodDesc)
.append(...)
.crop(...)
.same(...)
.same1(...)
.full(...)
.build();
One could then add this attribute or fail the CodeBuilder if "manual mode" was set without the attribute being present. I think this latter option would be a decent API. If you would consider it, I can offer to prototype such a solution.
What you are asking for is the compressed form of the stack map table. The initial stack map frame is key information for anyone working with stack maps. Relative offsets and differentially compressed subsequent frames are useless without the initial frame. In my experience, only effective full stack map frames can be transformed or otherwise processed. Information about the compressed form of each frame is secondary. Every stack map table can be compressed into many equivalent forms.
I agree with an option for the user to specify labeled full stack map frames.
However, frame compression is something the user should not be responsible for; it is similar to requiring manual deflation when writing a zip file.
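To illustrate why compressed frames cannot be interpreted without the initial frame, here is a minimal, self-contained sketch. All types in it (FrameExpansionSketch, FullFrame, Delta and its records) are hypothetical and are not part of the Classfile API; types are plain strings, and the offset arithmetic follows the StackMapTable rule that each frame applies at the previous offset plus offset_delta plus one, except for the first frame.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only: none of these types belong to the Classfile API.
final class FrameExpansionSketch {

    // A full frame: bytecode offset plus the effective locals and stack.
    record FullFrame(int offset, List<String> locals, List<String> stack) {}

    // Compressed frame kinds, loosely following the StackMapTable encoding.
    interface Delta { int offsetDelta(); }
    record Same(int offsetDelta) implements Delta {}
    record Same1(int offsetDelta, String stackItem) implements Delta {}
    record Append(int offsetDelta, List<String> newLocals) implements Delta {}
    record Chop(int offsetDelta, int count) implements Delta {}
    record Full(int offsetDelta, List<String> locals, List<String> stack) implements Delta {}

    // Expands compressed frames; the result depends on the initial locals.
    static List<FullFrame> expand(List<String> initialLocals, List<Delta> deltas) {
        List<FullFrame> frames = new ArrayList<>();
        List<String> locals = List.copyOf(initialLocals);
        int offset = -1; // so that the first frame lands exactly at its offsetDelta
        for (Delta d : deltas) {
            offset += d.offsetDelta() + 1;
            List<String> stack = List.of(); // only Same1 and Full carry stack entries
            if (d instanceof Same1 s) {
                stack = List.of(s.stackItem());
            } else if (d instanceof Append a) {
                List<String> grown = new ArrayList<>(locals);
                grown.addAll(a.newLocals());
                locals = List.copyOf(grown);
            } else if (d instanceof Chop c) {
                locals = List.copyOf(locals.subList(0, locals.size() - c.count()));
            } else if (d instanceof Full f) {
                locals = List.copyOf(f.locals());
                stack = List.copyOf(f.stack());
            }
            frames.add(new FullFrame(offset, locals, stack));
        }
        return frames;
    }

    public static void main(String[] args) {
        List<Delta> deltas = List.of(
                new Append(5, List.of("java/lang/String")),
                new Same(3),
                new Chop(10, 1),
                new Same1(0, "int"));
        // The same deltas yield different full frames for different initial frames.
        expand(List.of("Foo", "int"), deltas).forEach(System.out::println);
        expand(List.of("Bar"), deltas).forEach(System.out::println);
    }
}
```

The two calls in main use identical compressed frames but start from different initial locals, so every expanded frame differs. This is only a model of the argument above, not a statement about how any library stores frames.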
Thanks,
Adam
Best regards, Rafael
On Tue, 12 July 2022 at 14:20, Adam Sotona <adam.sotona at oracle.com> wrote:
From: classfile-api-dev <classfile-api-dev-retn at openjdk.org> on behalf of Brian Goetz <brian.goetz at oracle.com>
Date: Sunday, 10 July 2022 19:40
To: Rafael Winterhalter <rafael.wth at gmail.com>
Cc: classfile-api-dev at openjdk.org
Subject: Re: POC: JDK ClassModel -> ASM ClassReader
> StackMapFrames could on the other hand just be added at their position in the CodeElement iteration to receive them where they become relevant. This way, one would not need to keep track of the current offset. This would also allow for an easier write model where ASM does not allow you to know the offset of a stack map. I assume that the current model is very much modeled after the needs of the javap tool. Ideally, the frame objects would be reduced to the information that is contained in a class file and the consumer could track implicit information such as the "effective" stack and locals.
There are two concerns with this, one minor and one major. The minor one is that this has a significant cost, and most users don’t want this information. So we would surely want to gate this with an option whose default is false. (We care the most about transformation, and most transformations make only light changes, so we don’t want to add costs that most users won’t want to bear.)
The major one is how it perturbs the model. The element stream delivered by a model, and the one consumed by a builder, should be duals. What should a builder do when handed a frame? Switch responsibility over to the user for correct frame generation? Ignore it and regenerate stack maps anyway?
I think we need a better picture of “who is the audience for this sub feature” before designing the API.
I’ve been considering (and re-considering) many scenarios related to stack maps.
The number one requirement is to generate valid stack maps under all circumstances and with a minimal performance penalty. And we already do a lot in this area:
· Stack map generation requires minimal information about the involved classes, and only to resolve ambiguous situations (for example, when looking for the common parent of two types assigned to a single local variable). The required information is minimal and limited to individual classes (is this specific class an interface, or what is its parent class). ASM, on the other hand, requires loading the classes with all their dependencies to generate stack maps.
· The generation process is fast and does not produce any temporary structures or objects. It is single-pass in >95% of cases, dual-pass in >4% of cases, and at most three-pass in the remaining ~1% of cases (statistics calculated from a very large corpus of classes).
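The first point above can be sketched as follows: merging two reference types only needs, per class, whether it is an interface and what its direct superclass is. This is a hypothetical illustration (CommonAncestorSketch and ClassInfo are invented names), not the generator's actual code, and it makes the simplifying assumption that any merge involving an interface yields java/lang/Object.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the Classfile API's stack map generator.
final class CommonAncestorSketch {
    static final String OBJECT = "java/lang/Object";

    // Minimal per-class facts: interface flag and direct superclass (null for Object).
    record ClassInfo(boolean isInterface, String superName) {}

    // Finds the type to use when two reference types meet at a merge point.
    static String commonAncestor(Map<String, ClassInfo> hierarchy, String a, String b) {
        if (a.equals(b)) return a;
        // Assumption of this sketch: interfaces are treated like Object.
        if (lookup(hierarchy, a).isInterface() || lookup(hierarchy, b).isInterface()) {
            return OBJECT;
        }
        Set<String> ancestorsOfA = new HashSet<>();
        for (String c = a; c != null; c = lookup(hierarchy, c).superName()) {
            ancestorsOfA.add(c);
        }
        for (String c = b; c != null; c = lookup(hierarchy, c).superName()) {
            if (ancestorsOfA.contains(c)) return c;
        }
        return OBJECT;
    }

    private static ClassInfo lookup(Map<String, ClassInfo> hierarchy, String name) {
        if (OBJECT.equals(name)) return new ClassInfo(false, null);
        ClassInfo info = hierarchy.get(name);
        // A missing entry is exactly the "missing types" problem discussed in this thread.
        if (info == null) throw new IllegalArgumentException("No hierarchy information for " + name);
        return info;
    }

    // Hand-written facts about a few java.util classes, for demonstration only.
    static Map<String, ClassInfo> sampleHierarchy() {
        return Map.of(
                "java/util/AbstractCollection", new ClassInfo(false, OBJECT),
                "java/util/AbstractList", new ClassInfo(false, "java/util/AbstractCollection"),
                "java/util/AbstractSequentialList", new ClassInfo(false, "java/util/AbstractList"),
                "java/util/ArrayList", new ClassInfo(false, "java/util/AbstractList"),
                "java/util/LinkedList", new ClassInfo(false, "java/util/AbstractSequentialList"),
                "java/util/List", new ClassInfo(true, OBJECT));
    }

    public static void main(String[] args) {
        Map<String, ClassInfo> h = sampleHierarchy();
        System.out.println(commonAncestor(h, "java/util/ArrayList", "java/util/LinkedList"));
        System.out.println(commonAncestor(h, "java/util/ArrayList", "java/util/List"));
    }
}
```

No class is loaded here; the lookup touches only the entries on the two superclass chains, which is the kind of limited, per-class information the bullet describes.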
Experiments that involved transforming the original stack maps led to increased complexity and worse performance and, most importantly, failed to produce valid stack maps. There is no benefit in passing user-created (or somehow transformed) stack map frames to the generator for final processing.
From the discussion (and from my experience with class instrumentation) I see one use case we didn’t cover and one case where we can improve:
1. We handle class transformation from a single source well. A shared constant pool allows keeping the original stack maps for all methods with unmodified bytecode. However, class instrumentation is a transformation with at least two sources. Such a transformation can share only one constant pool. All methods from the second source must be exploded into instructions and reconstructed, and their stack maps generated from scratch. The author of such a transformation must be fully aware of the consequences, and an option to pass stack maps through a non-shared CP transformation would be a valuable feature. It would require:
a. An option to turn off stack map generation individually per method (because there might also be synthetic methods where stack map generation is required). I would propose implementing a Code-level override of global options.
b. A factory for manual stack map table construction (based on labels, not offsets). I would propose keeping this “manual mode” separate from CodeBuilder and implementing it as a StackMapTableAttribute.of(Frame…) factory.
2. There are cases where the required class hierarchy is not available (when there is no access to all jars), so it would be hard for the user to provide an appropriate ClassHierarchyResolver. However, much information about individual classes can theoretically be extracted from the source classes (from class headers, from existing stack maps, from other attributes, or from the bytecode itself). It is just an idea; however, I think it might be possible to implement an optional hierarchy resolver that learns from the parsed classes. It is a theoretical option for improvement that does not shift the responsibility onto the user. However, any such solution would remain non-deterministic. Altering a stack map without minimal knowledge of the involved classes is still blind surgery.
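The "learning" resolver idea in point 2 could look roughly like the sketch below, which extracts hierarchy facts from class-file headers only. LearningResolverSketch and ClassInfo are hypothetical names and this is not the Classfile API's ClassHierarchyResolver; the sketch parses the header with plain java.io, records the parsed class itself, and additionally infers that every name in the interfaces list is an interface.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; not the Classfile API's ClassHierarchyResolver.
final class LearningResolverSketch {

    record ClassInfo(boolean isInterface, String superName) {}

    private final Map<String, ClassInfo> learned = new HashMap<>();

    Optional<ClassInfo> lookup(String internalName) {
        return Optional.ofNullable(learned.get(internalName));
    }

    // Reads only the header: constant pool, access flags, this, super, interfaces.
    void learnFrom(byte[] classFile) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile));
        if (in.readInt() != 0xCAFEBABE) throw new IOException("Not a class file");
        in.skipBytes(4); // minor and major version
        int count = in.readUnsignedShort();
        String[] utf8 = new String[count];
        int[] nameIndex = new int[count]; // name index of each CONSTANT_Class entry
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1 -> utf8[i] = in.readUTF();              // Utf8
                case 7 -> nameIndex[i] = in.readUnsignedShort(); // Class
                case 8, 16, 19, 20 -> in.skipBytes(2);
                case 15 -> in.skipBytes(3);
                case 3, 4, 9, 10, 11, 12, 17, 18 -> in.skipBytes(4);
                case 5, 6 -> { in.skipBytes(8); i++; }         // Long/Double use two slots
                default -> throw new IOException("Unknown constant pool tag " + tag);
            }
        }
        int flags = in.readUnsignedShort();
        String thisName = utf8[nameIndex[in.readUnsignedShort()]];
        int superIndex = in.readUnsignedShort();
        String superName = superIndex == 0 ? null : utf8[nameIndex[superIndex]];
        learned.put(thisName, new ClassInfo((flags & 0x0200) != 0, superName));
        int interfaceCount = in.readUnsignedShort();
        for (int i = 0; i < interfaceCount; i++) {
            // Anything listed here must be an interface, even if it was never parsed.
            String name = utf8[nameIndex[in.readUnsignedShort()]];
            learned.putIfAbsent(name, new ClassInfo(true, "java/lang/Object"));
        }
    }

    // Convenience for the demo: learn from classes visible to the system class loader.
    static LearningResolverSketch ofSystemClasses(String... internalNames) {
        LearningResolverSketch resolver = new LearningResolverSketch();
        for (String name : internalNames) {
            try (InputStream in = ClassLoader.getSystemResourceAsStream(name + ".class")) {
                if (in == null) throw new IOException("Class file not found: " + name);
                resolver.learnFrom(in.readAllBytes());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return resolver;
    }

    public static void main(String[] args) {
        LearningResolverSketch r = ofSystemClasses("java/util/ArrayList");
        System.out.println(r.lookup("java/util/ArrayList"));
        System.out.println(r.lookup("java/util/List"));
        System.out.println(r.lookup("java/util/AbstractList")); // never parsed, so unknown
    }
}
```

After parsing only ArrayList, the resolver knows ArrayList's superclass and that List is an interface, but nothing about AbstractList's own parent. What it can answer depends on which classes it happened to see, which is the non-determinism the paragraph above warns about.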
Thanks,
Adam