RFR: 8273608: Deadlock when jcmd of OnError attaches to itself
Xin Liu
xliu at openjdk.java.net
Wed Sep 22 07:41:57 UTC 2021
On Mon, 20 Sep 2021 22:02:37 GMT, Xin Liu <xliu at openjdk.org> wrote:
> This patch allows the custom commands of OnError to attach to HotSpot itself.
> It sets the thread of report_and_die() to Native before os::fork_and_exec(cmd).
> This prevents commands that require safepoint synchronization from deadlocking,
> e.g. OnError='jcmd %p Thread.print'.
>
> Without this patch, we will encounter a deadlock at safepoint synchronization.
> `"main" #1` is the very thread which executes `os::fork_and_exec(cmd)`.
>
>
> Aborting due to java.lang.OutOfMemoryError: Java heap space
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # Internal Error (debug.cpp:364), pid=94632, tid=94633
> # fatal error: OutOfMemory encountered: Java heap space
> #
> # JRE version: OpenJDK Runtime Environment (18.0) (build 18-internal+0-adhoc.xxinliu.jdk)
> # Java VM: OpenJDK 64-Bit Server VM (18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /local/home/xxinliu/JDK-2085/hs_err_pid94632.log
> #
> # -XX:OnError="jcmd %p Thread.print"
> # Executing /bin/sh -c "jcmd 94632 Thread.print" ...
> 94632:
> [10.616s][warning][safepoint]
> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timeout detected:
> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Timed out while spinning to reach a safepoint.
> [10.616s][warning][safepoint] # SafepointSynchronize::begin: Threads which did not reach the safepoint:
> [10.616s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=236.97ms elapsed=10.61s tid=0x00007f01b00232f0 nid=94633 runnable [0x00007f01b7a08000]
> [10.616s][warning][safepoint] java.lang.Thread.State: RUNNABLE
> [10.616s][warning][safepoint]
> [10.616s][warning][safepoint] # SafepointSynchronize::begin: (End of list)
> Can we limit this to the jcmd-attaches-to-me scenario? In general, the less we modify the VM state before core'ing the better. This distorts the picture and may confuse analysts of the hs-err file/core. I think we should do this only if necessary.
>
> Potentially, I would even limit it to OOM situations since for other types of errors (eg crashes) I do not see the point of attaching with jcmd. To prevent deadlock in those cases, one may just avoid calling jcmd altogether.
The only reason I am trying this is that I would like to get a heap dump when `-XX:AbortVMOnException=java.lang.OutOfMemoryError` triggers a fatal error.
Indeed, I know we can get a core file and extract the Java heap from it. Some counter-arguments are: 1) a core dump is subject to kernel and ulimit constraints; 2) the file is too big; 3) it is not secure. So I came up with the idea of using `OnError=jcmd %p GC.heap_dump` to simulate `HeapDumpOnOutOfMemoryError`.
If neither of you thinks it's a good idea, I can drop it. As you said, it will distort VMThread for sure.
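
A minimal sketch of the invocation described above, written as a dry run that only assembles and prints the command line rather than starting a JVM. The dump path and the main class `MyApp` are placeholders, not taken from the PR. The sketch also assumes that `AbortVMOnException` is a diagnostic flag on the build in use and therefore needs `-XX:+UnlockDiagnosticVMOptions`, and that `GC.heap_dump` is given an output file name.

```shell
#!/bin/sh
# Dry run: builds and prints the command line; it does not start a JVM.

DUMP_FILE="/tmp/onerror-heap.hprof"   # hypothetical dump path
MAIN_CLASS="MyApp"                    # placeholder main class

# %p is replaced by the JVM with its own pid when the OnError command runs,
# so jcmd attaches back to the failing VM itself.
ONERROR_CMD="jcmd %p GC.heap_dump $DUMP_FILE"

# Assumption: AbortVMOnException is diagnostic here and must be unlocked.
JAVA_CMD="java -XX:+UnlockDiagnosticVMOptions -XX:AbortVMOnException=java.lang.OutOfMemoryError -XX:OnError=\"$ONERROR_CMD\" $MAIN_CLASS"

echo "$JAVA_CMD"
```

Without the change proposed in this PR, the `jcmd` step in such a command would be expected to hit the same safepoint timeout shown in the log above, since `GC.heap_dump` also needs a safepoint.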
-------------
PR: https://git.openjdk.java.net/jdk/pull/5590