POC: JDK ClassModel -> ASM ClassReader

Tue Jul 12 12:20:40 UTC 2022

From: classfile-api-dev <classfile-api-dev-retn at openjdk.org> on behalf of Brian Goetz <brian.goetz at oracle.com>
Date: Sunday, 10 July 2022 19:40
To: Rafael Winterhalter <rafael.wth at gmail.com>
Cc: classfile-api-dev at openjdk.org <classfile-api-dev at openjdk.org>
Subject: Re: POC: JDK ClassModel -> ASM ClassReader

> StackMapFrames could on the other hand just be added at their position in the CodeElement iteration to receive them where they become relevant. This way, one would not need to keep track of the current offset. This would also allow for an easier write model where ASM does not allow you to know the offset of a stack map. I assume that the current model is very much modeled after the needs of the javap tool. Ideally, the frame objects would be reduced to the information that is contained in a class file and the consumer could track implicit information such as the "effective" stack and locals.

There are two concerns with this, one minor and one major.  The minor one is that this has a significant cost, and most users don’t want this information.  So we would surely want to gate this with an option whose default is false.  (We care the most about transformation, and most transformations make only light changes, so we don’t want to add costs that most users won’t want to bear.)

The major one is how it perturbs the model.  The element stream delivered by a model, and the one consumed by a builder, should be duals.  What should a builder do when handed a frame?  Switch responsibility over to the user for correct frame generation?  Ignore it and regenerate stack maps anyway?

I think we need a better picture of “who is the audience for this sub feature” before designing the API.

I’ve been considering (and reconsidering) many scenarios related to stack maps.
The number one requirement is to generate valid stack maps in all circumstances, with minimal performance penalty. We already do a lot in this area:

  *   Stack map generation requires minimal information about the classes involved, and only to resolve conflicting situations (for example, when looking for the common parent of two different types assigned to a single local variable). The required information is limited to facts about individual classes (is this specific class an interface, and what is its parent class). By contrast, ASM, for example, needs to load the classes with all their dependencies to generate stack maps.
  *   The generation process is fast and does not produce any temporary structures or objects. It is single-pass in >95% of cases, dual-pass in >4% of cases, and at most three-pass in the remaining ~1% of cases (statistics calculated from a very large corpus of classes).
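
As a rough illustration of how little class information the merge step in the first bullet needs, the following Java sketch finds a common superclass from only two facts per class: whether it is an interface, and the name of its direct superclass. The names CommonSuperclassSketch, ClassInfo and sampleHierarchy are hypothetical and invented for this sketch; they are not part of the Classfile API. Falling back to java/lang/Object for unknown classes is a simplifying assumption of the sketch, not a description of the actual generator.

```java
import java.util.HashMap;
import java.util.Map;

final class CommonSuperclassSketch {
    static final String OBJECT = "java/lang/Object";

    // Hypothetical minimal per-class facts: interface or not, and direct superclass name.
    static final class ClassInfo {
        final boolean isInterface;
        final String superName; // null only for java/lang/Object
        ClassInfo(boolean isInterface, String superName) {
            this.isInterface = isInterface;
            this.superName = superName;
        }
    }

    // Follows superclass links only; no class loading and no dependency resolution.
    static boolean isSubclassOf(Map<String, ClassInfo> hierarchy, String sub, String sup) {
        String current = sub;
        while (current != null) {
            if (current.equals(sup)) return true;
            ClassInfo info = hierarchy.get(current);
            current = (info == null) ? null : info.superName;
        }
        return false;
    }

    // Type to record when two different types reach the same local variable slot.
    static String commonSuperclass(Map<String, ClassInfo> hierarchy, String a, String b) {
        if (a.equals(b)) return a;
        ClassInfo ia = hierarchy.get(a);
        ClassInfo ib = hierarchy.get(b);
        // Assumption of this sketch: unknown classes and interfaces merge to Object.
        if (ia == null || ib == null || ia.isInterface || ib.isInterface) return OBJECT;
        String current = a;
        while (current != null) {
            if (isSubclassOf(hierarchy, b, current)) return current;
            ClassInfo info = hierarchy.get(current);
            current = (info == null) ? null : info.superName;
        }
        return OBJECT;
    }

    // Small hand-written hierarchy used only for demonstration.
    static Map<String, ClassInfo> sampleHierarchy() {
        Map<String, ClassInfo> h = new HashMap<>();
        h.put(OBJECT, new ClassInfo(false, null));
        h.put("java/lang/Number", new ClassInfo(false, OBJECT));
        h.put("java/lang/Integer", new ClassInfo(false, "java/lang/Number"));
        h.put("java/lang/Long", new ClassInfo(false, "java/lang/Number"));
        h.put("java/lang/Runnable", new ClassInfo(true, OBJECT));
        return h;
    }
}
```

With the sample hierarchy, merging java/lang/Integer and java/lang/Long yields java/lang/Number, while merging either with java/lang/Runnable yields java/lang/Object.
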

Experiments with reusing the (transformed) original stack maps led to increased complexity and worse performance, and above all failed to produce valid stack maps. There is no benefit in passing user-created (or somehow transformed) stack map frames to the generator for final processing.

From the discussion (and from my experience with class instrumentation) I see one use case we didn’t cover and one case where we can improve:

  1.  We handle class transformation from a single source well. A shared constant pool allows the original stack maps to be kept for all methods with unmodified bytecode. However, class instrumentation is a transformation with at least two sources, and such a transformation can share only one constant pool. All methods from the second source must be exploded into instructions and reconstructed, with their stack maps generated from scratch. The author of such a transformation must be fully aware of the consequences, and an option to pass stack maps through a non-shared-CP transformation would be a valuable feature. It would require:

     *   An option to turn off stack map generation individually per method (because there may also be synthetic methods where stack map generation is required). I would propose implementing a Code-level override of the global options.
     *   A factory for manual construction of the stack map table (based on labels, not offsets). I would propose keeping this “manual mode” separate from CodeBuilder and implementing it as a StackMapTableAttribute.of(Frame…) factory.

  2.  There are cases where the required class hierarchy is not available (when there is no access to all the jars), so it would be hard for the user to provide an appropriate ClassHierarchyResolver. However, much of the information about individual classes can in theory be extracted from the source classes themselves (from class headers, from existing stack maps, from other attributes, or from the bytecode itself). It is just an idea; however, I think it might be possible to implement an optional hierarchy resolver that learns from the parsed classes. It is a theoretical option for improvement that does not push that responsibility onto the user. However, any such solution would remain non-deterministic. Altering a stack map without minimal knowledge of the classes involved is still blind surgery.
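
To make the "learning" resolver idea in the second item a little more concrete, here is a minimal Java sketch that collects hierarchy facts from class headers as classes are parsed. Everything in it (LearningHierarchySketch, ClassInfo, learnFromHeader, lookup) is hypothetical and is not the ClassHierarchyResolver API. It relies only on class-file facts: the ACC_INTERFACE flag is 0x0200, every entry in a class's interfaces list names an interface, and an interface's super_class is java/lang/Object.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class LearningHierarchySketch {
    static final int ACC_INTERFACE = 0x0200; // class-file access flag marking an interface
    static final String OBJECT = "java/lang/Object";

    // Hypothetical holder for the two facts stack map generation asks about.
    static final class ClassInfo {
        final boolean isInterface;
        final String superName;
        ClassInfo(boolean isInterface, String superName) {
            this.isInterface = isInterface;
            this.superName = superName;
        }
    }

    private final Map<String, ClassInfo> learned = new HashMap<>();

    // Records what a single parsed class header reveals.
    void learnFromHeader(String thisClass, int accessFlags, String superClass, List<String> interfaces) {
        // The header fully describes the class being parsed.
        learned.put(thisClass, new ClassInfo((accessFlags & ACC_INTERFACE) != 0, superClass));
        // Each listed interface is known to be an interface whose super_class is Object.
        for (String itf : interfaces) {
            learned.putIfAbsent(itf, new ClassInfo(true, OBJECT));
        }
        // The superclass is deliberately not recorded: its own parent is still unknown.
    }

    // Empty result means "not learned yet"; the answer depends on what has been parsed so far.
    Optional<ClassInfo> lookup(String className) {
        return Optional.ofNullable(learned.get(className));
    }
}
```

The Optional return value reflects the caveat above: answers depend on which classes happen to have been parsed, so a caller still has to decide what to do when a class is unknown.
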

Thanks,
Adam
