RFR: 8360164: AOT cache creation crashes in ~ThreadTotalCPUTimeClosure() [v3]
Ioi Lam
iklam at openjdk.org
Mon Jun 30 06:58:40 UTC 2025
On Mon, 30 Jun 2025 06:31:08 GMT, David Holmes <dholmes at openjdk.org> wrote:
>> I changed the code to call `vm_direct_exit()` instead. At this point, we are still in the middle of `JNI_CreateJavaVM_inner`
>>
>>
>> 0 vm_direct_exit
>> 1 MetaspaceShared::preload_and_dump
>> 2 Threads::create_vm
>> 3 JNI_CreateJavaVM_inner
>> 4 JNI_CreateJavaVM
>> 5 start_thread
>> 6 clone3
>>
>>
>> so we may not have finished all the initialization that `before_exit()` may rely on.
>>
>> Here's the code that's not yet executed:
>>
>> https://github.com/openjdk/jdk/blob/da7080fffb2389465dc9afca6d02e9085fe15302/src/hotspot/share/prims/jni.cpp#L3591-L3635
>>
>>
>> To be honest, I am not sure at what point it becomes safe to call `before_exit()` or `System.exit()`. At this point a lot of Java code has been executed (for setting up the module graph, etc.), but I am not sure what happens when some of that Java code calls `System.exit()`, or whether such a scenario has been (sufficiently) tested.
>>
>> BTW, JFR is disabled when we are dumping CDS:
>>
>> https://github.com/openjdk/jdk/blob/da7080fffb2389465dc9afca6d02e9085fe15302/src/hotspot/share/jfr/recorder/jfrRecorder.cpp#L198-L206
>
>> I am not sure what happens when some of that Java code calls System.exit(), or whether such a scenario has been (sufficiently) tested.
>
> Initialization Java code should not be calling System.exit - ever. If something goes wrong it should throw an exception which is seen by the main thread doing the VM init and that in turn will lead to `vm_exit_during_initialization`.
>
> To me the simplest model to use here would be to act as-if AOT creation was done by a small Java class such that you know the VM is fully initialized when you create the cache and then you can do a normal Java-level termination afterwards.
I tried doing that, but it's not straightforward. At this point, some internal VM state has been zeroed out by the AOT cache dumping code. As a result, it's no longer possible to resolve a new class. We get a crash when `System.exit()` resolves the Logger class:
[....]
V [libjvm.so+0xf27aa2] InterpreterRuntime::resolve_from_cache(JavaThread*, Bytecodes::Code)+0x116 (interpreterRuntime.cpp:1001)
j java.lang.System.getLogger(Ljava/lang/String;)Ljava/lang/System$Logger;+28 java.base@26-internal
j java.lang.Shutdown.logRuntimeExit(I)V+2 java.base@26-internal
j java.lang.Shutdown.exit(I)V+1 java.base@26-internal
j java.lang.Runtime.exit(I)V+1 java.base@26-internal
j java.lang.System.exit(I)V+4 java.base@26-internal
v ~StubRoutines::call_stub 0x00007f392ae6072e
Since we need to backport this fix to JDK 25, I think we should go with `vm_direct_exit()` for now, and perhaps fix the AOT dumping code so that it no longer zeroes out VM state in a follow-up RFE.
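
To make the trade-off concrete, here is a toy model of the two exit paths. It is a sketch only, not HotSpot code: `ToyVM` and all of its members are hypothetical names, and the single boolean stands in for whatever internal state the dumping code clears.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only; this is NOT HotSpot code and every name is hypothetical.
struct ToyVM {
  bool class_resolution_available = true;  // cleared by the modelled dump step
  std::vector<std::string> events;         // records what each path did

  // Models the cache dump clearing state that later class resolution needs.
  void dump_cache() {
    class_resolution_available = false;
    events.push_back("dump");
  }

  // Models a Java-level exit, which may need to resolve a new class first.
  // Returns false where the real VM would crash.
  bool java_level_exit() {
    if (!class_resolution_available) {
      events.push_back("resolve failed");
      return false;
    }
    events.push_back("java exit");
    return true;
  }

  // Models a direct exit that runs no Java code and resolves nothing.
  bool direct_exit() {
    events.push_back("direct exit");
    return true;
  }
};
```

In this model, calling `dump_cache()` followed by `java_level_exit()` fails, while `dump_cache()` followed by `direct_exit()` succeeds, which is the reason for preferring the direct exit until the dumping code stops clearing that state.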
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/26008#discussion_r2174357577
More information about the hotspot-runtime-dev mailing list