RFR: 8292695: SIGQUIT and jcmd attaching mechanism does not work with signal chaining library
Thomas Stuefe
stuefe at openjdk.org
Wed Sep 7 16:53:45 UTC 2022
On Wed, 7 Sep 2022 06:53:12 GMT, Man Cao <manc at openjdk.org> wrote:
>> Hi all,
>>
>> Could anyone review this bug fix? See https://bugs.openjdk.org/browse/JDK-8292695 for details.
>>
>> I changed the temporary handler for SIGQUIT to use a dummy function, and use `os::signal()` to set it up, just as `os::initialize_jdk_signal_support()` does.
>> It is possible that just moving the `set_signal_handler(BREAK_SIGNAL, false);` in `install_signal_handlers()` outside of the window bounded by `JVM_{begin|end}_signal_setting()` could also fix this bug. However, `set_signal_handler()` and `JVM_HANDLE_XXX_SIGNAL()` are currently used for signals that support chaining and periodic checking, neither of which applies to SIGQUIT. I think it is cleaner to use different functions for SIGQUIT.
>>
>> I also added a test to check that sending SIGQUIT produces a thread dump on stdout, with and without using libjsig.so.
>>
>> -Man
>
> Thanks for filing the other bug. Responded in that bug.
Hi @caoman, @dholmes-ora, @navyxliu,
I wonder if it has to be that complicated.
The problems (both https://bugs.openjdk.org/browse/JDK-8279124 and https://bugs.openjdk.org/browse/JDK-8292695) are caused by SIGQUIT being in a weird space: it is really a VM signal, but it gets handled by the signal dispatcher thread. The signal dispatcher thread is started long after the "real" hotspot handlers are installed, and only after the dispatcher thread starts do we install the SIGQUIT handler. At that point libjsig is already in "surveillance" mode, where it intercepts handler installation.
But there is nothing that prevents us from moving the SIGQUIT handler installation up in time to the hotspot signal handler initialization. Nobody says the SIGQUIT handler has to be installed *after* the signal dispatcher thread started.
The signal dispatcher thread and the handler are only loosely coupled via `os::signal_notify()`. If the VM receives SIGQUIT in the time window after the SIGQUIT handler is installed but before the signal dispatcher thread is started, the handler will just "ring the bell", but that's fine. The notifications would stack, and once the dispatcher thread runs, it will process all these SIGQUIT signals in a delayed fashion.
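To illustrate the "ring the bell now, process later" idea, here is a minimal standalone sketch. It is not HotSpot code: `pending_sigquit`, `sigquit_handler()`, `drain_pending_sigquits()` and `simulate_startup()` are hypothetical names, the counter only stands in for whatever bookkeeping sits behind `os::signal_notify()`, and the dispatcher is modelled as a plain function call instead of a thread.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Hypothetical stand-in for the VM's pending-signal bookkeeping.
static std::atomic<int> pending_sigquit{0};

// The handler only "rings the bell": it records the signal and returns.
extern "C" void sigquit_handler(int) {
  pending_sigquit.fetch_add(1, std::memory_order_relaxed);
}

// What a dispatcher would do once it runs: consume every notification
// recorded so far. Returns the number of notifications processed.
int drain_pending_sigquits() {
  int handled = 0;
  int n = pending_sigquit.load();
  while (n > 0) {
    // On failure, compare_exchange_weak refreshes n with the current value.
    if (pending_sigquit.compare_exchange_weak(n, n - 1)) {
      ++handled;                   // a real VM would print a thread dump here
      n = pending_sigquit.load();  // re-read in case more signals arrived
    }
  }
  return handled;
}

// Assumed startup order: install the handler early, receive some signals
// while no dispatcher exists yet, then let the "dispatcher" catch up.
int simulate_startup(int early_signals) {
  assert(early_signals >= 0);
  std::signal(SIGQUIT, sigquit_handler);
  for (int i = 0; i < early_signals; ++i) {
    std::raise(SIGQUIT);           // delivered synchronously to this thread
  }
  return drain_pending_sigquits();
}
```

In this sketch, `simulate_startup(3)` records three notifications before anything consumes them, and the later drain handles all three. The handler itself never depends on the dispatcher existing, which is the loose coupling described above.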
To test my theory, I built: https://github.com/openjdk/jdk/compare/master...tstuefe:jdk:test-move-sigquit-init-up-in-time .
This moves the SIGQUIT handler installation up in time. It also optionally adds a delay between hotspot signal handler installation and dispatcher thread startup.
And this works as expected: if we send SIGQUIT to the VM, we get thread dumps. SIGQUITs sent before the dispatcher thread starts are "stored" and all executed once the dispatcher thread starts (whether or not one wants that is a different question, but it is easy to modify). SIGQUITs sent after dispatcher startup work fine, as expected.
As expected, this works with or without libjsig, so it solves JDK-8292695. It also closes the initial time window and therefore does not regress JDK-8279124. And it is a lot simpler to reason about, especially with libjsig present. Now SIGQUIT is handled like any other ordinary hotspot signal.
(And I wonder whether we could handle the shutdown signals the same way, but I have not tested that.)
What do you think?
-------------
PR: https://git.openjdk.org/jdk/pull/9955