RFR: 8360164: AOT cache creation crashes in ~ThreadTotalCPUTimeClosure() [v3]

Ioi Lam iklam at openjdk.org
Mon Jun 30 06:58:40 UTC 2025


On Mon, 30 Jun 2025 06:31:08 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> I changed the code to call `vm_direct_exit()` instead. At this point, we are still in the middle of `JNI_CreateJavaVM_inner`
>> 
>> 
>> 0 vm_direct_exit 
>> 1 MetaspaceShared::preload_and_dump 
>> 2 Threads::create_vm 
>> 3 JNI_CreateJavaVM_inner 
>> 4 JNI_CreateJavaVM 
>> 5 start_thread 
>> 6 clone3 
>> 
>> 
>> so we may not have finished all the initialization that `before_exit()` may rely on.
>> 
>> Here's the code that's not yet executed:
>> 
>> https://github.com/openjdk/jdk/blob/da7080fffb2389465dc9afca6d02e9085fe15302/src/hotspot/share/prims/jni.cpp#L3591-L3635
>> 
>> 
>> To be honest, I am not sure where is the starting point where it's safe to call `before_exit()` or `System.exit()`. At this point a lot of Java code has been executed (for setting up the module graph, etc), but I am not sure what happens when some of that Java code calls `System.exit()`, or whether such a scenario has been (sufficiently) tested.
>> 
>> BTW, JFR is disabled when we are dumping CDS:
>> 
>> https://github.com/openjdk/jdk/blob/da7080fffb2389465dc9afca6d02e9085fe15302/src/hotspot/share/jfr/recorder/jfrRecorder.cpp#L198-L206
>
>> I am not sure what happens when some of that Java code calls System.exit(), or whether such a scenario has been (sufficiently) tested.
> 
> Initialization Java code should not be calling System.exit - ever. If something goes wrong it should throw an exception which is seen by the main thread doing the VM init and that in turn will lead to `vm_exit_during_initialization`.
> 
> To me the simplest model to use here would be to act as-if AOT creation was done by a small Java class such that you know the VM is fully initialized when you create the cache and then you can do a normal Java-level termination afterwards.

I tried doing that, but it's not straightforward. At this point, some internal VM state has already been zeroed out by the AOT cache dumping code. As a result, it's no longer possible to resolve a new class. We get a crash when `System.exit()` tries to resolve the Logger class:


[....]
V  [libjvm.so+0xf27aa2]  InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)+0x116  (interpreterRuntime.cpp:1001)
j  java.lang.System.getLogger(Ljava/lang/String;)Ljava/lang/System$Logger;+28 java.base@26-internal
j  java.lang.Shutdown.logRuntimeExit(I)V+2 java.base@26-internal
j  java.lang.Shutdown.exit(I)V+1 java.base@26-internal
j  java.lang.Runtime.exit(I)V+1 java.base@26-internal
j  java.lang.System.exit(I)V+4 java.base@26-internal
v  ~StubRoutines::call_stub 0x00007f392ae6072e


Since we need to backport this fix to JDK 25, I think we should go with `vm_direct_exit()` for now, and perhaps fix the AOT dumping code to avoid zeroing out VM state in a follow-up RFE.
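
To illustrate the failure mode described above, here is a toy model in plain Java. It is only a sketch: `ToyResolver`, `clearForDump()`, `orderlyExit()` and `directExit()` are hypothetical names made up for this example and do not correspond to any HotSpot code. The assumption it encodes is simply that an exit path which lazily resolves something (like the Logger lookup in the stack trace) fails once the dump step has cleared the state that resolution depends on, while an exit path that resolves nothing still works.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these names exist in HotSpot.
public class ExitAfterDumpSketch {

    /** Stand-in for internal state that is needed to resolve classes on demand. */
    public static final class ToyResolver {
        private Map<String, String> table = new HashMap<>();

        public ToyResolver() {
            table.put("Logger", "resolved:Logger");
        }

        /** Models the dump step clearing state that later resolution needs. */
        public void clearForDump() {
            table = null;
        }

        /** Resolves a name, or fails if the state was already cleared. */
        public String resolve(String name) {
            if (table == null) {
                throw new IllegalStateException("cannot resolve " + name + " after dump");
            }
            return table.get(name);
        }
    }

    /** Models an orderly exit that lazily resolves a logger before exiting. */
    public static String orderlyExit(ToyResolver resolver) {
        String logger = resolver.resolve("Logger");
        return "exited after logging via " + logger;
    }

    /** Models a direct exit that touches no lazily resolved state. */
    public static String directExit() {
        return "exited directly";
    }

    public static void main(String[] args) {
        ToyResolver resolver = new ToyResolver();
        System.out.println(orderlyExit(resolver)); // resolver still intact

        resolver.clearForDump();
        try {
            orderlyExit(resolver);
        } catch (IllegalStateException e) {
            System.out.println("orderly exit failed: " + e.getMessage());
        }
        System.out.println(directExit());
    }
}
```

In this model, the follow-up RFE would correspond to making `clearForDump()` unnecessary, so that the orderly exit path keeps working after the dump.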

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26008#discussion_r2174357577


More information about the hotspot-runtime-dev mailing list