RFR: 8353225: Add a way to iterate Klass inside the loaded CDS archive

Wed Apr 2 15:49:00 UTC 2025

On Wed, 2 Apr 2025 12:16:10 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> KLUT changes some of the basic operations on oops; for example, oop->isObjArray() retrieves the class kind from the KLUT table, not the class. For that, the entry for that Klass must have been added to the KLUT table. For dynamic class loading, this is not a problem: before an oop is ever born, we need to load the class first, so there is always a path where the Klass is created, and that is where I populate the entry in the table.
>> 
>> For CDS, this does not work. CDS uses methods on oops in the mapped heap regions before ever loading the associated class, for example by calling oop->isObjArray() or similar. Traditionally this works because oop->isObjArray does not need much, just narrow Klass pointer decoding, which we set up beforehand. For KLUT, it does not work since the KLUT entry is not there yet.
>> 
>> Example: in ArchiveHeapLoader::finish_initialization() (archiveHeapLoader.cpp:396). There are other examples.
>> 
>> I can work around this by reverting to dynamic initialization for the KLUT entry (if entry missing, calculate it and store it), but that is undesirable for real oop iteration during GCs. We would execute that condition a bazillion times just because a few classes need it on the CDS initialization path. Well, maybe branch prediction would be smart enough to minimize the negative effect of this additional branch. But it is somewhat baffling that such a simple thing as iterating all Klass structures in a CDS archive is so complex.
>> 
>> Could one just iterate through the shared class space portion? Hop from Klass to Klass, as one would iterate through the heap? But I doubt that this always works either - there may be holes between Klass structures, or there may be other things that live in there, too.
>> 
>> About the "may never be loaded": That is fine. I would rather unnecessarily calculate the KLUT token for a class that will never be used than pay for an "Is this entry initialized?" branch in every use of the KLUT table. The latter is worse, performance-wise.
>
>> About the "may never be loaded": That is fine. I would rather unnecessarily calculate the KLUT token for a class that will never be used than pay for an "Is this entry initialized?" branch in every use of the KLUT table. The latter is worse, performance-wise.
> 
> Plus, the token is stored in the Klass, so it will have been calculated at dumptime, and so the only thing missing is copying the entry to its table position at runtime.

There's only a small number of classes that have allocated instances in the CDS archive. I think it's better to build a table of these classes and create KLUT information for them.
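
As a rough sketch of that suggestion (not HotSpot code): every name below (FakeKlass, FakeKlut, preregister_archived, kInvalidToken) is a hypothetical stand-in, the table is modeled as a plain vector indexed by a small integer that stands in for the narrow Klass pointer, and a token value of 0 is assumed to mean "no entry". The point is only that copying the precomputed tokens for the few classes with archived instances up front lets the lookup stay branch-free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a Klass in the archive. The token is assumed
// to have been computed at dump time and stored in the Klass itself.
struct FakeKlass {
  uint32_t narrow_id;   // stands in for the narrow Klass pointer
  uint32_t klut_token;  // precomputed lookup-table token
};

// Assumption for this sketch only: 0 marks an unpopulated slot.
constexpr uint32_t kInvalidToken = 0;

// Hypothetical stand-in for the KLUT table.
class FakeKlut {
  std::vector<uint32_t> _table;
public:
  explicit FakeKlut(std::size_t slots) : _table(slots, kInvalidToken) {}

  // Copy the precomputed token into its table position.
  void register_klass(const FakeKlass& k) {
    _table[k.narrow_id] = k.klut_token;
  }

  // Hot path: plain indexed load, no "is this entry initialized?" check.
  uint32_t lookup(uint32_t narrow_id) const {
    return _table[narrow_id];
  }
};

// Run once after mapping the archive, before any archived oop is touched:
// register only the classes that have allocated instances in the archive.
void preregister_archived(FakeKlut& klut,
                          const std::vector<const FakeKlass*>& with_instances) {
  for (const FakeKlass* k : with_instances) {
    klut.register_klass(*k);
  }
}
```

With only a small number of such classes, the one-time loop is cheap, and classes without archived instances still get their entry on the normal class-loading path.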

BTW, none of the classes in the `LambdaProxyClassDictionary` have allocated instances.

$ LOG=-Xlog:cds+map=trace,cds+map+oops=trace:file=cds.oops.txt:none:filesize=0
$ java -Xshare:dump $LOG
$ grep '@@ Object' cds.oops.txt | cut -b 44- | sed -e 's/ .*//g' | sort | uniq | wc
     79      79    2456
$ java -Xshare:dump $LOG -XX:+AOTClassLinking 
$ grep '@@ Object' cds.oops.txt | cut -b 44- | sed -e 's/ .*//g' | sort | uniq | wc
    172     172    6068

That is 79 distinct classes with the default archive, and 172 with -XX:+AOTClassLinking.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24311#discussion_r2025121955