RFR: 8279124: VirtualMachineImpl should check signal handler has installed before sending SIGQUIT [v2]
Thomas Stuefe
stuefe at openjdk.java.net
Wed Jan 19 06:57:29 UTC 2022
On Wed, 19 Jan 2022 05:21:04 GMT, Xin Liu <xliu at openjdk.org> wrote:
>> There is a handshake protocol between attach and HotSpot. Linux VirtualMachineImpl sends SIGQUIT(3) if the AttachListener has not been initialized. It expects "Signal Dispatcher" to handle SIGBREAK(same as SIGQUIT) and create AttachListener. However, it is possible that attach starts "handshake" before os::initialize_jdk_signal_support() is called. The signal handler which handles SIGQUIT has not been installed. Prior to os::initialize_jdk_signal_support(), universe_init() is called. Its time complexity will be linear to the initial size of heap with 'AlwaysPreTouch'. It takes 20~30 seconds to initialize 128g heap on a server-class host(eg. EC2 m4.10xlarge). Many tools such jcmd, jstack etc may force initializing HotSpot quit prematurely.
>>
>> This patch checks '/proc/$pid/stat' SigCgt bitmask to ensure the signal will be caught by the target process before striking it with SIGQUIT. It will make HotSpot more robust. The fields of procfs are well [documented](https://www.kernel.org/doc/html/latest/filesystems/proc.html#id10) and have supported since Linux 2.6.30. libattach.so will not the only consumer of it. I see that os_perf_linux.cpp supports it in libjvm.so.
>>
>>
>> Testing
>>
>> Before, this patch, once initialization takes long time, jcmd may quit the java process.
>>
>> $java -Xms64g -XX:+AlwaysPreTouch -Xlog:gc+heap=debug:stderr -XX:ParallelGCThreads=1 &
>> [1] 9589
>> [0.028s][debug][gc,heap] Minimum heap 68719476736 Initial heap 68719476736 Maximum heap 68719476736
>> [0.034s][debug][gc,heap] Running G1 PreTouch with 1 workers for 16384 work units pre-touching 68719476736B.
>> $jcmd 9589 VM.flags
>> 9589:
>> [1] + 9589 quit java -Xms64g -XX:+AlwaysPreTouch -Xlog:gc+heap=debug:stderr
>> java.io.IOException: No such process
>> at jdk.attach/sun.tools.attach.VirtualMachineImpl.sendQuitTo(Native Method)
>> at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:100)
>> at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
>> at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
>> at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:113)
>> at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:97)
>>
>> With this patch, jcmd will timeout but won't disrupt 15274.
>>
>> $ java -Xms64g -XX:+AlwaysPreTouch -XX:ParallelGCThreads=1 &
>> [1] 15274
>> $ jcmd 15274 VM.flags
>> 15274:
>> com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file /proc/15274/root/tmp/.java_pid15274: target process 15274 doesn't respond within 10500ms or HotSpot VM not loaded
>> at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:105)
>> at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
>> at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
>> at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:113)
>> at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:97)
>
> Xin Liu has updated the pull request incrementally with one additional commit since the last revision:
>
> Intercept SIGQUIT in the early stage of HotSpot.
Hi Xin,
thanks for taking my suggestion. Remarks inline.
Cheers, Thomas
p.s. may make sense to reformulate the JBS issue. "VM does not handle SIGQUIT during initialization", and extending the error description to make clear this affects anyone sending SIGQUIT, not only jcmd.
src/hotspot/os/posix/signals_posix.cpp line 597:
> 595: #endif
> 596:
> 597: if (!signal_was_handled && sig == BREAK_SIGNAL) {
I would print out a little message, since callers of SIGQUIT may expect something to happen here. Maybe "Thread dumps not yet available"?
Also, just for clarity, assert(!ReduceSignalHandling)? (but we don't do that for the other signals we omit handling either, so maybe not)
src/hotspot/os/posix/signals_posix.cpp line 1291:
> 1289: // that an attach client accidentally forces HotSpot to quit prematurely.
> 1290: set_signal_handler(BREAK_SIGNAL);
> 1291:
I would do this only if ReduceSignalHandling is inactive, since in that case we must not overwrite a SIGQUIT handler the app may have installed before us. We should have parsed arguments already at this point, no?
-------------
Changes requested by stuefe (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7003
More information about the serviceability-dev
mailing list