RFR: 8279124: VirtualMachineImpl should check signal handler has installed before sending SIGQUIT
David Holmes
dholmes at openjdk.java.net
Tue Jan 18 08:02:24 UTC 2022
On Mon, 17 Jan 2022 09:00:25 GMT, Kevin Walls <kevinw at openjdk.org> wrote:
>> Hi @navyxliu,
>>
>> nice catch. I can see how this can be annoying.
>>
>> I propose a simpler and more robust way to fix it though. We (1) set up general hotspot signal handling very early, then (2) proceed to initialize the heap - which you have shown can take some time - then (3) set up SIGQUIT handling. We core if we get a quit signal before (3).
>>
>> I would add SIGQUIT handling to the general signal handler too, just to cover the time frame between (1) and (3). At (3), it would be overwritten, but that would be fine. Before (3), we could just ignore SIGQUIT, or print out some generic information (I assume thread dumps are not yet possible at this stage).
>>
>> Since the documented behavior for the JVM is to threaddump on SIGQUIT unless we run with -Xrs, I think this is acceptable behavior. Not printing thread dump or only printing partial information is preferable to quitting with core.
>>
>> Then, this would work for any client that sends sigquit to a JVM, not just those using the attach framework. And it would work on all Posix platforms, not just Linux. And we'd would not have to rely on parsing the proc fs.
>>
>> Als note that a solution implemented in the client as you did has the disadvantage that I need a modern jcmd for it to work. However, I often just use whatever jcmd is in the path. Better to handle this in the receiving hotspot.
>>
>> I sketched out a simple patch to test if what I propose can work:
>>
>>
>> diff --git a/src/hotspot/os/posix/signals_posix.cpp b/src/hotspot/os/posix/signals_posix.cpp
>> index 2c020a79408..3f0dd91e03b 100644
>> --- a/src/hotspot/os/posix/signals_posix.cpp
>> +++ b/src/hotspot/os/posix/signals_posix.cpp
>> @@ -615,6 +615,11 @@ int JVM_HANDLE_XXX_SIGNAL(int sig, siginfo_t* info,
>> #endif // ZERO
>> }
>>
>> +if (sig == BREAK_SIGNAL) {
>> + printf("too early for thread dumps...\n");
>> + signal_was_handled = true;
>> +}
>> +
>> // Ignore SIGPIPE and SIGXFSZ (4229104, 6499219).
>> if (!signal_was_handled &&
>> (sig == SIGPIPE || sig == SIGXFSZ)) {
>> @@ -1279,6 +1284,7 @@ void install_signal_handlers() {
>> set_signal_handler(SIGFPE);
>> PPC64_ONLY(set_signal_handler(SIGTRAP);)
>> set_signal_handler(SIGXFSZ);
>> + set_signal_handler(BREAK_SIGNAL); // just for initialization phase
>>
>> It still misses a number of things (I did not check signal mask setup and ReduceSignalUsage must be handled too), but it shows the general direction and worked as expected.
>>
>> Cheers, Thomas
>
>> I propose a simpler and more robust way to fix it though
>
> Great, this is the kind of thing I was heading towards with the conversation in the bug text. Although not sure why I could not reproduce the problem, with various different JDK versions.
Apologies @kevinjwalls as I also missed the discussion in the JBS issue.
Ideally we would know if the target VM is ready before we send the SIGQUIT to attach - e.g. writing a well-known file that the attach mechanism looks for before trying to attach. That seems feasible but perhaps costly and not always possible(?) ...
What has been presented here is a side-channel way of knowing. In that respect I like this. It is a pity it is Linux only.
The alternative suggestions of just making the window during which SIGQUIT terminates the VM process small enough to be un-hittable, also has merit. I don't think we have to try and accommodate extreme code that just looks up a process in the process table and throws a signal at it. Ignoring SIGQUIT during the early VM startup seems a reasonable solutions (we can't produce a thread dump at that time anyway). It seems to me that we can simply install UserHandler for SIGQUIT very early in the VM initialization process and it will be a no-op until the JDK signal handling is properly initialized (just need to fix one assert).
Cheers,
David
-------------
PR: https://git.openjdk.java.net/jdk/pull/7003
More information about the serviceability-dev
mailing list