[External] : Re: ClassHierarchyResolver using Reflection information

Brian Goetz brian.goetz at oracle.com
Mon Mar 20 18:19:45 UTC 2023


The CHR interface is straightforward enough -- given a symbolic 
reference to a class, provide two bits of information: superclass and 
is-interface.

We have at least four sensible implementations, which can be combined:
  - do nothing
  - work from an explicit static list (suitable for compilers who are 
generating classes that do not yet exist)
  - Extract `.class` resources from the running JDK
  - Extract live Class objects from the running JDK

These go in increasing order of work done and environmental dependency; 
earlier items will be faster and more reproducible; latter items will be 
more usable.  The CHR interface also supports chaining with `orElse`, so 
that we can start with one strategy and fall back to another.

Separately, there is caching, which should be orthogonal but which is 
built into the default resolver.  There's a public 
CachedClassHierarchyResolver wrapper (probably should be private) which 
delegates to another CHR and memoizes the result in a synchronized 
HashMap.  (There's a static factory ofCached which delegates to this, 
but it is unused.)

More importantly, the default adopts the cache strategy of "cache lives 
forever / shared among all clients of the API."  I get that we want to 
avoid re-resolving these classes every time, but if we're going to 
include class loading as a way to put elements in the cache, there is a 
risk that it becomes polluted with "private" stale data (and also 
retains memory forever.)

So things I would like to fix include:

  - The basic implementations are not in a form that makes them easily 
reusable;
  - It is not obvious exactly what the default resolver does (and the 
name may be wrong), or when the user would want to select an alternate 
strategy, or what its caching strategy is, but it should definitely be 
defined in terms of building blocks the user understands;
  - The caching strategy of the default is questionable and probably 
needs more configuration.

We currently define the default as:

     new CachedCHR( cd -> 
ClassLoader.getSystemResourceAsStream(Util.toInternalName(classDesc) + 
".class"))

and it is being proposed that it be redefined as that, plus falling back 
to class loading if that fails.

What I'd like users to come away with is:

  - There's a menu of basic options, which can be combined;
  - There's a way to wrap the basic options with caching, and ideally 
the user can have some control over sharing / lifetime / caching strategy;
  - There's a default, which is defined strictly in terms of the menu 
above, which they can reason about.

We can start with exposing the basics.  We already have 
CHR.of(interfaces, supersMap).  We need factories or constants for "do 
nothing", "system resources from classloader X", "live classes from 
classloader X", etc.

We can refine caching so that CachedCHR is a private implementation, and 
all public API is funneled through ofCached.  It is possible we will 
want to expose some more control over the cache, such as wrapping the 
whole cache with a SoftRef, or allowing the user to specify the backing 
map; this is one more factory.

I think we should rename DEFAULT_CHR to something like GLOBAL_CHR to 
reflect the fact that it comes with a long-lived, JVM-wide cache.  
GLOBAL_CHR can be specified to be CachedCHR.of(systemResourcesCHR()) or 
CachedCHR.of(systemResourcesChr().orElse(systemClassloaderCHR())).

We can _then_ define DEFAULT_CHR to point to GLOBAL_CHR, and people who 
want a non-shared cache can substitute one.

Smaller observation: we have shied away from using public records in the 
API, instead having a public interface and a private record-backed 
implementation.  ClassHierarchyInfo is currently a public record.


Stepping back, the global cache suggests that we might want to refactor 
the front-door entry point to the API.  Right now, we have a pile of 
static methods in Classfile (build/parse), many of which take sets of 
options.  The point of the global cache is to avoid re-resolving 
information every time, but the global granularity is questionable -- it 
seems much more likely that an agent would want to create a static 
per-agent configuration, shared across uses of that agent, but not 
shared with other agents.  This suggests that perhaps the methods on 
Classfile really should be instance methods on an object that 
encapsulates the option set as well as accumulating cache state.  (And 
if the agent is unloaded, all the cache goes with it.)

Something like:

     public class ClassfileReaderWriterThingie {
         public ClassfileReaderWriterThingie(Option... options) { ... }

         public ClassModel parse(byte[] bs) { ... }
         public ClassModel parse(Path p) { ... }

         public byte[] build(ClassDesc thisClass, Consumer<CB> handler) 
{ ... }
         // drop overload taking options, since its in the ctor
         public byte[] build(ClassEntry thisClassEntry, CPB cpBuilder, 
Consumer<CB> handler) { ... }

         // buildTo and buildModule overloads
     }

This is a somewhat more intrusive API change (though not too much), so 
my suggestion is to proceed on the above first and then revisit this one.

(One point being that if we're going to have a long-lived cache, it 
should probably be encapsulated in an object the user can control.)


On 3/20/2023 12:51 PM, Brian Goetz wrote:
> After looking over this corner of the API, I think there's a little 
> more design work that needs to go into it.  So let's step back from 
> the details of "what's the default" until we have a chance to look a 
> little more holistically at this part of the API.  I'll try to write 
> more about it soon.
>
> On 3/20/2023 11:47 AM, - wrote:
>> If the choice of default resolver is still subject to debate, I can
>> remove the delegation to system class loader based resolver in
>> https://urldefense.com/v3/__https://github.com/openjdk/jdk/pull/13082__;!!ACWV5N9M2RV99hQ!K7UWlrg2NHotSKOzMvmfnenKZ1dDuOZ9zXcQvlKOdB21VvwTRIDuI2NyRNK6Hq844XIkt_6Q-qFt5DsFDUzgKJ31$  , and we can change it in
>> another patch once we reach a consensus. The addition of lookup and
>> class loader based resolvers can be used by users who explicitly
>> specify a resolver in the options.
>>
>> On Mon, Mar 20, 2023 at 9:00 AM Adam Sotona<adam.sotona at oracle.com>  wrote:
>>> New behavior is “no assumptions” and throw IllegalArgumentException if information about type necessary to build stack map (or verify stack map) is not obtained from the resolver (or better say chain of resolvers).
>>>
>>>
>>>
>>> If user provides no resolver – the default is used.
>>>
>>>
>>>
>>> The way how default resolver gets the information from system classloader is subject of discussion below (resource parsing and fallback to class loading).
>>>
>>> Custom resolver can be constructed as a fallback chain from all provided resolver types (including the default resolver instance).
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 20.03.2023 14:25, "Brian Goetz"<brian.goetz at oracle.com>  wrote:
>>>
>>>
>>> If the user provides no resolver at all, what is the new behavior?  Do we just assume Object is the common supertype for any pair of classes?
>>>
>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/classfile-api-dev/attachments/20230320/b558ddfc/attachment.htm>


More information about the classfile-api-dev mailing list