RFR: 8273608: Deadlock when jcmd of OnError attaches to itself [v8]
Thomas Stuefe
stuefe at openjdk.java.net
Wed Oct 13 04:44:52 UTC 2021
On Tue, 12 Oct 2021 07:02:12 GMT, Xin Liu <xliu at openjdk.org> wrote:
>> This patch allows the custom commands of OnError to attach to HotSpot itself.
>> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd).
>> This prevents commands that require safepoint synchronization from deadlocking,
>> e.g. OnError='jcmd %p Thread.print'.
>>
>> Without this patch, we will encounter a deadlock at safepoint synchronization.
>> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`.
>>
>>
>> Aborting due to java.lang.OutOfMemoryError: Java heap space
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error (debug.cpp:364), pid=94632, tid=94633
>> # fatal error: OutOfMemory encountered: Java heap space
>> #
>> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk)
>> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
>> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
>> #
>> # An error report file with more information is saved as:
>> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log
>> #
>> # -XX:OnError="jcmd %p Thread.print"
>> # Executing /bin/sh -c "jcmd 94632 Thread.print" ...
>> 94632:
>> [10.616s][warning][safepoint]
>> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected:
>> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint.
>> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint:
>> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000]
>> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE
>> [10.616s][warning][safepoint]
>> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list)
>
> Xin Liu has updated the pull request incrementally with one additional commit since the last revision:
>
> Change to VM unconditionally as long as current thread is JavaThread.
>
> Hoist VMErrorForceNative out of While.
This looks almost good to me. Thanks for refraining from the mutex unlock.
Regarding the tests, David is right: test it with some signals. But IMHO a manual test is sufficient here.
Cheers, Thomas
src/hotspot/share/utilities/vmError.cpp line 1646:
> 1644: // at safepoints.
> 1645: VMErrorForceInNative fn(Thread::current_or_null());
> 1646:
I have concerns similar to David's about printing acquiring locks. Even if that is not a problem with tty specifically, it's something to keep in mind (e.g. with the recent attempt to send logging output over network sockets, such things can creep in).
In that case I would move the RAII object so that it tightly wraps fork_and_exec: switch the thread state only for the fork, undo it after returning, and do the printing in the original state instead.
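
A minimal sketch of that scoping, to make the suggestion concrete. It uses hypothetical stand-ins (`FakeThread`, `ThreadState`, `ScopedInNative`, `fake_fork_and_exec`, `run_on_error_command`), not the actual HotSpot classes or the code in the PR; it only illustrates keeping the state change to the extent of the fork call:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not HotSpot types.
enum class ThreadState { InVM, InNative };

struct FakeThread {
    ThreadState state = ThreadState::InVM;
};

static std::string state_name(ThreadState s) {
    return s == ThreadState::InVM ? "in_vm" : "in_native";
}

// RAII guard: switches the thread to InNative on construction and restores
// the previous state on destruction. A null thread is tolerated (no-op).
class ScopedInNative {
public:
    explicit ScopedInNative(FakeThread* t) : thread_(t) {
        if (thread_ != nullptr) {
            saved_ = thread_->state;
            thread_->state = ThreadState::InNative;
        }
    }
    ~ScopedInNative() {
        if (thread_ != nullptr) {
            // Nobody else should have changed the state inside the scope.
            assert(thread_->state == ThreadState::InNative);
            thread_->state = saved_;
        }
    }
    ScopedInNative(const ScopedInNative&) = delete;
    ScopedInNative& operator=(const ScopedInNative&) = delete;

private:
    FakeThread* thread_;
    ThreadState saved_ = ThreadState::InVM;
};

// Stand-in for printing: records the state in which the print happened.
void print_message(const FakeThread& t, std::vector<std::string>& trace) {
    trace.push_back("print:" + state_name(t.state));
}

// Stand-in for the fork/exec call: records the state and reports success.
int fake_fork_and_exec(const FakeThread& t, const std::string& /*cmd*/,
                       std::vector<std::string>& trace) {
    trace.push_back("exec:" + state_name(t.state));
    return 0;
}

// The guard lives in its own block, so only the fork runs in InNative;
// printing before and after happens in the original state.
int run_on_error_command(FakeThread& t, const std::string& cmd,
                         std::vector<std::string>& trace) {
    print_message(t, trace);  // "Executing ..." in the original state
    int rc;
    {
        ScopedInNative guard(&t);
        rc = fake_fork_and_exec(t, cmd, trace);
    }  // state restored here
    print_message(t, trace);  // any follow-up output, original state again
    return rc;
}
```

In this sketch, a call to `run_on_error_command` records `print:in_vm`, `exec:in_native`, `print:in_vm`, and the thread is back in its original state afterwards.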
-------------
Changes requested by stuefe (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/5590