RFR: 8353225: Add a way to iterate Klass inside the loaded CDS archive

Tue Apr 1 07:08:29 UTC 2025

On Tue, 1 Apr 2025 06:35:13 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> I will call this after the CDS archive has been mapped, at the same time as PrintSharedArchiveAndExit would have been called. But I would like this function to work at any time. 
>> 
>> But above all, it is paramount that this technique finds all classes that are in this archive and not loaded via the normal dynamic class loading path. I also need this method to keep working reliably with ongoing improvements (e.g. JEP 483). Is there a better way?
>> 
>> Looking into the code, I see that this method (scanning classes via Klass::next_link) is also used to implement LambdaProxyClassArchive.find()/LambdaProxyClassDictionary::find_lambda_proxy_class, which uses the linked list precomputed with AdjustLambdaProxyClassInfo. It uses Klass::next_link, but that is used by the CLDG to tie classes of the same loader together. How does this work? Would LambdaProxyClassArchive.find() (and thus LambdaMetaFactory.metafactory()) not have the same problem?
>
> Once the lambda proxy class is loaded, it's removed from the LambdaProxyClassDictionary, so we won't find it (and try to load it again). This is intended.
> 
> Did you file an RFE that explains what you intend to do with such an iterator in the future? We should understand the requirements before making this change.

No, not yet, though I have talked with many people about this, to positive feedback. I will put this in writing soon.

The gist of this is: I have a prototype called "KLUT" (KlassLookupTable), which pre-computes a 32-bit token per Klass and puts it into a lookup table with the narrowKlass as index (remember, with COH, i.e. Compact Object Headers, narrowKlass is really an index: 1, 2, 3, 4, 5, ...).

That token contains a condensed subset of the information in Klass. The token is then used to get type information for oop iteration in GC. That avoids having to access Klass during GC; in fact, we don't even need to decode the narrowKlass.

All of that drastically reduces cache misses during GC, since we no longer pick type information out of a very sparse Klass but instead load a 32-bit value from a very dense table. It practically eliminates type-information loading as a source of cache misses.
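
A minimal sketch of the table idea, not the prototype's actual code. The token layout (2 bits for the object kind, 30 bits for a size in words) and all names below are assumptions made up for illustration; the real token contents are not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; these are not the HotSpot types.
using NarrowKlassIdx = uint32_t;  // narrowKlass treated as a plain index
using KlutToken      = uint32_t;  // condensed 32-bit summary of a Klass

// Assumed token layout: low 2 bits = object kind, upper 30 bits = size in words.
enum class ObjKind : uint32_t { Instance = 0, ObjArray = 1, TypeArray = 2 };

constexpr KlutToken make_token(ObjKind kind, uint32_t size_words) {
  return (size_words << 2) | static_cast<uint32_t>(kind);
}
constexpr ObjKind  token_kind(KlutToken t)       { return static_cast<ObjKind>(t & 3u); }
constexpr uint32_t token_size_words(KlutToken t) { return t >> 2; }

// Dense array of tokens, indexed directly by the narrowKlass value.
class KlutSketch {
  std::vector<KlutToken> _tokens;
public:
  explicit KlutSketch(std::size_t max_classes) : _tokens(max_classes, 0) {}

  // Called once per class, outside the GC hot path (bounds-checked).
  void register_klass(NarrowKlassIdx nk, KlutToken token) { _tokens.at(nk) = token; }

  // GC hot path: one 4-byte load, no narrowKlass decoding, no Klass access.
  KlutToken lookup(NarrowKlassIdx nk) const { return _tokens[nk]; }
};
```

With 4-byte tokens, 16 entries share a 64-byte cache line, which is where the density benefit in this sketch comes from.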

To get the full benefit from this optimization, I pre-generate these tokens at class-load time. I could calculate them on demand, too, but that would add another branch to the hot path of oop iteration. I would rather avoid that.
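
To illustrate the trade-off, here is a rough sketch of the two lookup variants. It assumes a sentinel value of 0 for "token not yet computed" and uses a placeholder compute_token(); both are hypothetical and only serve the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint32_t kTokenNotComputed = 0;  // assumed sentinel, illustration only

// Placeholder for deriving a token from a Klass (hypothetical).
// The result is always odd, so it never collides with the sentinel.
inline uint32_t compute_token(uint32_t nk) { return nk * 2u + 1u; }

struct TokenTableSketch {
  std::vector<uint32_t> tokens;
  std::size_t lazy_fills = 0;  // how often the on-demand slow path ran

  explicit TokenTableSketch(std::size_t n) : tokens(n, kTokenNotComputed) {}

  // Variant A: fill the slot when the class is loaded (or found in the archive).
  void pregenerate(uint32_t nk) { tokens[nk] = compute_token(nk); }

  // Hot path for variant A: a plain load, no branch.
  uint32_t lookup_pregenerated(uint32_t nk) const { return tokens[nk]; }

  // Variant B: compute on first use. The sentinel check is the extra
  // branch that every hot-path lookup would have to pay for.
  uint32_t lookup_on_demand(uint32_t nk) {
    uint32_t t = tokens[nk];
    if (t == kTokenNotComputed) {
      t = compute_token(nk);
      tokens[nk] = t;
      ++lazy_fills;
    }
    return t;
  }
};
```

Variant A is only correct if pregenerate() has run for every class whose objects can show up in the heap, which includes classes that come from the CDS archive rather than through the normal loading path.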

So, I need to ensure these tokens are generated for every Klass for which I will find objects in the heap. Hence this PR.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24311#discussion_r2022267672