RFR: 8279124: VM does not handle SIGQUIT during initialization [v3]
Kevin Walls
kevinw at openjdk.java.net
Wed Jan 19 12:42:28 UTC 2022
On Wed, 19 Jan 2022 09:10:56 GMT, Xin Liu <xliu at openjdk.org> wrote:
>> In the early stage of initialization, HotSpot doesn't handle SIGQUIT. The default signal disposition on Linux is to terminate the process and generate a core dump.
>>
>> There are 2 applications for this signal.
>> 1. There's a handshake protocol between sun.tools.attach and HotSpot. VirtualMachineImpl sends SIGQUIT(3) if the AttachListener has not been initialized. It expects "Signal Dispatcher" to handle SIGQUIT and create AttachListener.
>> 2. POSIX systems use SIGQUIT as SIGBREAK. After AttachListener is up, HotSpot will reinterpret the signal for thread dump.
>>
>> It is possible that HotSpot is still initializing in Threads::create_vm() when SIGQUIT arrives. We should change JVM_HANDLE_XXX_SIGNAL to catch SIGQUIT and ignore it. It is installed in os::init_2() and should cover the early stage of initialization. Later on, os::initialize_jdk_signal_support() still overwrites the signal handler of SIGQUIT if ReduceSignalUsage is false (the default).
>>
>> Testing
>>
>> Before this patch, if initialization takes a long time, jcmd may kill the java process.
>>
>> $java -Xms64g -XX:+AlwaysPreTouch -Xlog:gc+heap=debug:stderr -XX:ParallelGCThreads=1 &
>> [1] 9589
>> [0.028s][debug][gc,heap] Minimum heap 68719476736 Initial heap 68719476736 Maximum heap 68719476736
>> [0.034s][debug][gc,heap] Running G1 PreTouch with 1 workers for 16384 work units pre-touching 68719476736B.
>> $jcmd 9589 VM.flags
>> 9589:
>> [1] + 9589 quit java -Xms64g -XX:+AlwaysPreTouch -Xlog:gc+heap=debug:stderr
>> java.io.IOException: No such process
>> at jdk.attach/sun.tools.attach.VirtualMachineImpl.sendQuitTo(Native Method)
>> at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:100)
>> at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
>> at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
>> at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:113)
>> at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:97)
>>
>>
>> With this patch, neither jcmd nor kill -3 will disrupt java process 45850.
>>
>> $java -Xms64g -XX:+AlwaysPreTouch -XX:ParallelGCThreads=1 -Xlog:os+init=debug:stderr -version &
>> [1] 45850
>> $ kill -3 45850
>> [6.902s][info][os,init] ignore BREAK_SIGNAL in the initialization phase.
>> $jcmd 45850 VM.flags
>> 45850:
>> [19.920s][info][os,init] ignore BREAK_SIGNAL in the initialization phase.
>> [25.422s][info][os,init] ignore BREAK_SIGNAL in the initialization phase.
>> [26.522s][info][os,init] ignore BREAK_SIGNAL in the initialization phase.
>> [27.723s][info][os,init] ignore BREAK_SIGNAL in the initialization phase.
>> [29.023s][info][os,init] ignore BREAK_SIGNAL in the initialization phase.
>> [30.423s][info][os,init] ignore BREAK_SIGNAL in the initialization phase.
>> com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file /proc/45850/root/tmp/.java_pid45850: target process 45850 doesn't respond within 10500ms or HotSpot VM not loaded
>> at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:105)
>> at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
>> at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
>> at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:113)
>> at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:97)
>> $openjdk version "19-internal" 2022-09-20
>> OpenJDK Runtime Environment (fastdebug build 19-internal+0-adhoc.xxinliu.jdk)
>> OpenJDK 64-Bit Server VM (fastdebug build 19-internal+0-adhoc.xxinliu.jdk, mixed mode, sharing)
>> [1] + 45850 done java -Xms64g -XX:+AlwaysPreTouch -XX:ParallelGCThreads=1 -version
>
> Xin Liu has updated the pull request incrementally with one additional commit since the last revision:
>
> Only install JVM_HANDLE_XXX_SIGNAL when ReduceSignalUsage is
> false(default value).
>
> This patch also adds a log message with 'os+init=info'.
Hi -- thanks for updating the bug title and text. Yes, it's much better to start with a concise problem description. I'm in favour of the signal handler change. I'm not personally concerned about printing: silently handling SIGQUIT seems fine for a VM at this stage, and perhaps printing just adds risk.
I'm still curious that I can't reproduce the problem by making heap initialization slow with options like -Xms100g -XX:+AlwaysPreTouch, as you could. Startup can be so slow that I can attach gdb and see it's in:
Threads::create_vm / init_globals / universe_init / G1CollectedHeap::initialize / ...etc...
...but jcmd or kill -QUIT don't hurt my JVM. 8-)
That process' /proc/PID/status contains:
SigIgn: 0000000000000006
SigCgt: 0000000180000000
...so I think that process has signals 2 (SIGINT) and 3 (SIGQUIT) ignored? (Ubuntu)
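
For reference, here is a minimal Python sketch of how these hex masks can be read. It assumes the usual Linux /proc convention that bit N-1 of the mask stands for signal number N. The function name decode_sigmask is made up for illustration and is not part of any tool mentioned here.

```python
def decode_sigmask(hex_mask):
    """Return the signal numbers whose bits are set in a /proc/PID/status mask.

    Assumption: Linux convention, bit (N - 1) stands for signal N.
    """
    mask = int(hex_mask, 16)
    return [n + 1 for n in range(mask.bit_length()) if mask & (1 << n)]


if __name__ == "__main__":
    print(decode_sigmask("0000000000000006"))  # SigIgn value quoted above
    print(decode_sigmask("0000000180000000"))  # SigCgt value quoted above
```

Under that assumption, the SigIgn value decodes to signals 2 and 3, and the SigCgt value to signals 32 and 33.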
Elsewhere I used Oracle Linux under Windows Subsystem for Linux, where the SigXXX fields in /proc/PID/status are all zeroes; I'm not sure if they are meaningful there. That is possibly another reason to handle this with the signal handler change.
On a real Oracle Linux install I do see:
SigIgn: 0000000000000006
SigCgt: 0000000000000000
at startup, which becomes:
SigIgn: 0000000000000006
SigCgt: 0000000181001cc8
...after some seconds. But I still can't trigger the issue; some signals are ignored there also.
So I like the change, but I would like to be clearer about where the problem exists: where (on what platforms?) can we see no signals ignored or caught at startup, and trigger the problem of crashing the VM with SIGQUIT?
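
One way to check a given platform would be to sample /proc/PID/status early in startup and classify SIGQUIT. The Python sketch below does that for Linux only, assuming that bit N-1 of SigIgn/SigCgt stands for signal N and that SIGQUIT is signal 3. The helper names signal_disposition and read_status are hypothetical.

```python
def signal_disposition(status_text, signum=3):
    """Classify one signal as 'ignored', 'caught' or 'default'.

    status_text is the content of /proc/PID/status.
    Assumptions: Linux convention that bit (signum - 1) stands for signum,
    and signum 3 is SIGQUIT.
    """
    masks = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("SigIgn", "SigCgt"):
            masks[key] = int(value.strip(), 16)
    bit = 1 << (signum - 1)
    if masks.get("SigIgn", 0) & bit:
        return "ignored"
    if masks.get("SigCgt", 0) & bit:
        return "caught"
    return "default"


def read_status(pid):
    """Read /proc/PID/status for the given pid (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return f.read()
```

Calling signal_disposition(read_status(pid)) on a starting JVM should return "default" only where the default action would still apply, which is where the problem could plausibly be triggered. Where the fields are all zeroes and may not be meaningful, as in the WSL case above, the result would also be "default" and should not be trusted.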
-------------
PR: https://git.openjdk.java.net/jdk/pull/7003