Support custom class loaders in AOT cache
Charles Oliver Nutter
headius at headius.com
Thu Jan 29 07:24:43 UTC 2026
Oh boy do I have a test case for you! 😄
JRuby uses custom class loaders for gobs of stuff, frequently with a single
class per class loader solely to make those classes garbage collectible. In the
future I intend to integrate anonymous class loading, but for backward
compatibility we are still somewhat limited.
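Roughly, the one-class-per-loader pattern looks like the sketch below. This is not JRuby's actual code: `OneClassPerLoader`, `SingleClassLoader`, `emptyClassBytes`, and `defineIsolated` are names made up for illustration, and the hand-assembled empty class is only a stand-in for real generated bytecode. The point is that each class gets its own throwaway loader, so it can be collected once nothing references the class, its instances, or that loader.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class OneClassPerLoader {
    /** Hypothetical loader whose only job is to hold a single class. */
    static final class SingleClassLoader extends ClassLoader {
        SingleClassLoader(ClassLoader parent) {
            super(parent);
        }

        Class<?> defineOnly(String binaryName, byte[] classBytes) {
            return defineClass(binaryName, classBytes, 0, classBytes.length);
        }
    }

    /** Hand-assembles an empty public class extending Object (stand-in for generated code). */
    static byte[] emptyClassBytes(String binaryName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(0xCAFEBABE);   // magic
            out.writeShort(0);          // minor version
            out.writeShort(52);         // major version (Java 8 class file format)
            out.writeShort(5);          // constant pool count (4 entries + 1)
            out.writeByte(7); out.writeShort(2);                           // #1 Class -> #2
            out.writeByte(1); out.writeUTF(binaryName.replace('.', '/'));  // #2 Utf8: this class
            out.writeByte(7); out.writeShort(4);                           // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");            // #4 Utf8: superclass
            out.writeShort(0x0021);     // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);          // this_class
            out.writeShort(3);          // super_class
            out.writeShort(0);          // interfaces count
            out.writeShort(0);          // fields count
            out.writeShort(0);          // methods count
            out.writeShort(0);          // attributes count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Defines the class in a fresh loader, so every call yields an independent Class object. */
    static Class<?> defineIsolated(String binaryName) {
        ClassLoader parent = OneClassPerLoader.class.getClassLoader();
        return new SingleClassLoader(parent).defineOnly(binaryName, emptyClassBytes(binaryName));
    }
}
```

Calling `OneClassPerLoader.defineIsolated("demo.Generated")` twice yields two distinct `Class` objects with the same name, each owned by its own loader.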
A few examples:
* Ruby scripts compiled at load in certain modes of execution. The classes
are named based on a mangled version of the source file system path and
should be identical across runs.
* Individual Ruby methods each JIT compiled into their own classes at
runtime. These also should be the same across runs, but I would have to
double check that they are uniquely named. Up to now, the name of the class
has been unimportant to us so there's a fair chance of collisions. Easily
fixed.
* Custom object shapes generated at runtime to allow heap inspection that
shows actual Ruby field names rather than our generic pre-generated shape
field names (var0, var1, etc). Ruby's fields are discovered at runtime, so
we tend to evolve the object shapes as more fields are discovered. Usually
users will just run with the generic shapes which are pre-generated at
build time.
* Generated interface implementations and concrete subclasses for bridging
Ruby and Java. These are definitely NOT uniquely named, but we have an
outstanding bug to name and cache them better.
* LambdaForms. Oh the humanity.
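For the naming concerns in the list above, one possible way to get names that are both stable across runs and unlikely to collide is to combine a readable mangled path with a short hash of the unmangled key. This is only a sketch under assumptions: the `ruby.jit` package, the `m_` prefix, and the `StableClassNames` helper are hypothetical and do not describe JRuby's current mangling.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class StableClassNames {
    /** Derives a deterministic class name from a source path and method name (hypothetical scheme). */
    static String classNameFor(String sourcePath, String methodName) {
        String key = sourcePath + "#" + methodName;
        StringBuilder readable = new StringBuilder();
        for (char c : key.toCharArray()) {
            // Replace anything that cannot appear in a Java identifier.
            readable.append(Character.isJavaIdentifierPart(c) ? c : '_');
        }
        // The hash suffix separates keys that mangle to the same readable text.
        return "ruby.jit.m_" + readable + "_" + shortHash(key);
    }

    /** First 8 bytes of SHA-256 over the key, as lowercase hex. */
    static String shortHash(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 8; i++) {
                hex.append(String.format("%02x", digest[i]));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

With this scheme, `/app/a/b.rb` and `/app/a_b.rb` share the same readable part but get different hash suffixes, while the same path and method always map to the same name.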
In past experiments with other JVMs like those from Azul (Zing) or IBM (J9),
we were told that naming our custom class loaders would help, but I never
explored that dimension.
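For reference, naming a loader needs nothing more than the `ClassLoader(String name, ClassLoader parent)` constructor available since Java 9. The sketch below is a minimal illustration: `NamedLoaders`, `NamedLoader`, and the `jruby-` prefix are hypothetical, and whether any JVM would key cached data on such a name is an assumption, not something established in this thread.

```java
public class NamedLoaders {
    /** Hypothetical loader that only forwards a name to ClassLoader. */
    static final class NamedLoader extends ClassLoader {
        NamedLoader(String name, ClassLoader parent) {
            super(name, parent); // name is later visible through getName()
        }
    }

    /** Creates a loader named after the kind of generated code it will hold. */
    static ClassLoader forPurpose(String purpose) {
        return new NamedLoader("jruby-" + purpose, NamedLoaders.class.getClassLoader());
    }
}
```

A loader created with `NamedLoaders.forPurpose("jit-methods")` reports `jruby-jit-methods` from `getName()`, which at least gives tools a stable label for loaders that would otherwise be anonymous.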
It's also probably worth mentioning that of these scenarios, only the
interface implementations and concrete subclasses will always be generated
at boot, before they are used the first time, with the others requiring
varying levels of warm up or specialized JRuby configurations. As the
application warms up, you would first see JIT compiled Ruby methods
followed by the LambdaForms they generate en masse.
I'd be happy to have a chat about some interesting use cases for you to
try, but they are probably the same use cases we've discussed in the past.
We are excited to see where this goes!
On Thu, Jan 29, 2026, 00:58 Ashutosh Mehra <asmehra at redhat.com> wrote:
> Hi Ioi,
> Thanks for sharing your thoughts.
> I have a hacky incomplete implementation for supporting custom loaders
> [0]. It is largely work-in-progress at the moment. It does share many of
> the ideas that you mentioned.
> Instead of loader.setAOTCompatible(), it uses
> loader.setAOTIdentity(String).
> I also updated the code to use CDS$UnregisteredClassLoader to load the
> classes during the assembly phase (based on the discussion in the leyden
> meeting last week).
> But with this approach I hit an assert when the JVM tries to archive the
> java mirror of such classes. As these classes are loaded by
> CDS$UnregisteredClassLoader, their java mirror has a reference to a Module
> object that in turn has a reference to the instance of
> CDS$UnregisteredClassLoader, but this class is explicitly marked for
> exclusion. So we hit this assert in the assembly phase:
>
> #16 0x00007f2c89244fdd in report_vm_error (file=0x7f2c8a748510
> "leyden/src/hotspot/share/cds/aotArtifactFinder.cpp", line=222,
> error_msg=0x7f2c8a748640
> "assert(!SystemDictionaryShared::should_be_excluded(k)) failed",
> detail_fmt=0x7f2c8a7485c0 "precond") at
> leyden/src/hotspot/share/utilities/debug.cpp:196
> #17 0x00007f2c88d00b90 in AOTArtifactFinder::append_to_all_cached_classes
> (k=0x18002200) at leyden/src/hotspot/share/cds/aotArtifactFinder.cpp:222
> #18 0x00007f2c88d00c2a in AOTArtifactFinder::add_cached_instance_class
> (ik=0x18002200) at leyden/src/hotspot/share/cds/aotArtifactFinder.cpp:236
> #19 0x00007f2c88d00a15 in AOTArtifactFinder::add_aot_inited_class
> (ik=0x18002200) at leyden/src/hotspot/share/cds/aotArtifactFinder.cpp:197
> #20 0x00007f2c8961643b in HeapShared::archive_object (obj=0x8088e1b8,
> referrer=0x8088e210, subgraph_info=0x7f2aec029308) at
> leyden/src/hotspot/share/cds/heapShared.cpp:675
> #21 0x00007f2c8961ba50 in HeapShared::walk_one_object
> (stack=0x7f2c55bfa2c0, level=3, subgraph_info=0x7f2aec029308,
> orig_obj=0x8088e1b8, referrer=0x8088e210) at
> leyden/src/hotspot/share/cds/heapShared.cpp:2006
> #22 0x00007f2c8961b28c in HeapShared::archive_reachable_objects_from
> (level=1, subgraph_info=0x7f2aec029308, orig_obj=0x80a64eb0) at
> leyden/src/hotspot/share/cds/heapShared.cpp:1914
> #23 0x00007f2c89618012 in HeapShared::scan_java_mirror
> (orig_mirror=0x80890ee8) at leyden/src/hotspot/share/cds/heapShared.cpp:1066
> #24 0x00007f2c89618135 in HeapShared::scan_java_class (orig_k=0x169dc710)
> at leyden/src/hotspot/share/cds/heapShared.cpp:107
>
> To work around this, I skip archiving the java mirror of these classes.
>
> Other known issues with the current state of the prototype:
> 1. Restoring the protection domain in the production run. This is
> currently not implemented.
> 2. Call to AOTLinkedClassBulkLoader::link_classes_for_loader() hits an
> assert due to a lock ordering issue.
>
> I have only tested it with an app with a single custom loader that loads a
> bunch of classes.
> I am sure I will encounter more issues as I test it with more complex
> applications.
> I will look into your patch to store java mirrors in the AOT config file
> and adopt it in my prototype. It should resolve the issues around archiving
> of java mirror and protection domain.
>
> Any comments/feedback on the prototype is welcome.
>
> [0] https://github.com/ashu-mehra/leyden/tree/custom-loader-support-v2
>
> Thanks,
> - Ashutosh Mehra
>
>
> On Wed, Jan 28, 2026 at 5:01 PM <ioi.lam at oracle.com> wrote:
>
>> We have been brainstorming about supporting custom class loaders in the
>> AOT cache. While the design is far from final, in our small group
>> discussions, we seem to be converging on this:
>>
>> - Only custom class loaders that are known to produce *stable* results
>> can store classes in the AOT cache
>> - Stable result is roughly: when given a class name X,
>> loader.loadClass(X) will always return a class with the same shape
>> - Also, loader.loadClass(X) should not produce any observable side
>> effects, other than the fact that class X has been loaded. E.g., don't
>> set any static fields inside loadClass.
>> - It's completely up to the custom class loader to decide whether it
>> meets the AOT cache requirement.
>>
>> Some examples:
>> - A URLClassLoader that loads from a fixed set of JAR files in the local
>> file system that are known to never change
>> - A code generator that always generates the same code shape given the
>> same class name
>>
>> A counterexample:
>> - A code generator that mixes code with a random seed
>>
>> The handshake between the class loader and the AOT cache might look like
>> this:
>>
>>     URL[] urls = new URL[] {Path.of("foo.jar").toUri().toURL(),
>>                             Path.of("bar.jar").toUri().toURL()};
>>     URLClassLoader loader = new URLClassLoader(urls);
>>     String UID = "URLClassLoader$foo.jar:" + cksum("foo.jar") +
>>                  "$bar.jar:" + cksum("bar.jar");
>>     loader.setAOTCompatible(UID);
>>     loader.loadClass("com.foo.Foo");
>>     loader.loadClass("com.bar.Bar");
>>
>> In the training run, the JVM will store all classes loaded by this
>> loader into the AOT cache. These classes are tagged with the given UID.
>>
>> In the production run, when setAOTCompatible(UID) is called, the JVM
>> checks if the AOT cache has any classes tagged with the UID. If so,
>> these classes are automatically loaded into the loader *without any
>> observable side effect*. Note that the usual handshake of
>> ClassLoader::{findClass, loadClass, defineClass}, etc, does not happen.
>> The classes simply appeared in the loader out of thin air.
>>
>> The UID provides a way for the loader to identify itself, as well as
>> encoding the dependencies that were assumed during the training run. In
>> the above example, we use the checksum of each JAR file to make sure
>> that these files haven't changed (or disappeared).
>>
>> Note that we don't actually cache the loader object itself. The loader
>> object will probably have references to environment states that cannot
>> be safely stored into the AOT cache. Also, the creation of the loader
>> during the training might produce side effects that cannot be easily
>> captured into the AOT cache.
>>
>> We will likely have some restrictions on the behavior of the "AOT
>> compatible" loaders:
>>
>> - loader.setAOTCompatible() must be called before any class is defined
>> in this loader. Otherwise setAOTCompatible() will throw an
>> IllegalStateException
>> - Only classes with simple ProtectionDomains will be stored into the AOT
>> cache. For example, if the loader defines a class with a
>> ProtectionDomain that uses a signed code source, the class will be
>> excluded from the cache.
>>
>> Some implementation details:
>>
>> Ashutosh is working on a prototype. I think we can store the classes
>> into the AOT configuration file at the end of the training run:
>>
>> - Store the Java mirror of the class into the AOT configuration file
>> (this requires https://github.com/openjdk/jdk/pull/29472 )
>> - Also save the ProtectionDomain in the mirror
>>
>> In the assembly phase, load the classes of each UID into an instance of
>> jdk.internal.misc.CDS$UnregisteredClassLoader. This way we can handle
>> classes of the same name defined in two different UIDs.
>> If two UIDs have a parent/child relationship, we should recreate that
>> with the UnregisteredClassLoader. This is needed for constant pool
>> pre-linking.
>>
>> The above are just my random notes. Please add your thoughts.
>>
>> Thanks
>> - Ioi
>>
>>
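The checksum-based UID in the quoted handshake could be computed along these lines. This is a minimal sketch, not part of the proposal: `LoaderUid`, `uidFor`, and `sha256Hex` are hypothetical names, SHA-256 is an assumed choice for the unspecified `cksum`, and `setAOTCompatible` itself is only a proposed API that does not exist in the JDK, so it is not called here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

public class LoaderUid {
    /** Hex SHA-256 of a file's contents (assumed stand-in for the unspecified cksum). */
    static String sha256Hex(Path file) {
        try {
            // Reads the whole file at once; fine for a sketch, wasteful for very large JARs.
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(file));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds "<kind>$<jar name>:<checksum>..." in the order the JARs are given. */
    static String uidFor(String loaderKind, List<Path> jars) {
        StringBuilder uid = new StringBuilder(loaderKind);
        for (Path jar : jars) {
            uid.append('$').append(jar.getFileName()).append(':').append(sha256Hex(jar));
        }
        return uid.toString();
    }
}
```

Because the checksum covers file contents, the UID changes as soon as any JAR is modified, which is the property the handshake relies on to detect stale cache entries.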