RFR: JDK-8255711: Fix and unify hotspot signal handlers

David Holmes david.holmes at oracle.com
Thu Nov 5 23:57:30 UTC 2020


Hi Thomas,

On 5/11/2020 5:32 pm, Thomas Stüfe wrote:
> Hi David,
> 
> <trimming and combining your mails>
> 
> 
>      >      > 10) When invoking the fatal error handler, we extract the
>     pc from
>      >     the context and hand it over as "faulting pc". For SIGILL and
>      >     SIGFPE, this is not totally correct. According to POSIX [3], for
>      >     those signals the address of the faulting instruction is
>     handed over
>      >     in `si_info.si_addr`.
>      >      >
>      >      > On most platforms this does not matter, they are the same.
>     But on
>      >     some architectures the pc in the signal context actually points
>      >     somewhere else, e.g. beyond the faulting instruction. Therefore
>      >     `si_info.si_addr` is the better choice.
>      >
>      >     The Posix spec also states "For some implementations, the
>     value of
>      >     si_addr may be inaccurate." - so I'm not at all sure which
>     "pc" we
>      >     should be trusting here? I thought the ucontext was the detailed
>      >     platform specific "context" object that we should extract
>     information
>      >     from. Which architectures give different values in the two
>     and is there
>      >     some documentation stating what happens for any given os/cpu?
>      >
>      >
>      > I overlooked this. Well, this is a helpful standard :)
>      >
>      > I saw this happen on s390 and on pa-risc. Before this patch, I did
>      > correct this in the platform handler. Out of caution I could
>     #ifdef this
>      > section to s390.
> 
>     I prefer not see any change in behaviour for platforms where no problem
>     has been observed. Also see what seems to be related comment in
>     ./cpu/arm/vm_version_arm_32.cpp
> 
> 
> Okay, made this section s390 only.
> 
> Arm may or may not have the same problem. Their signal handler is in 
> parts using si_addr, partly the context pc. But since you want to keep 
> old behavior, let's keep it for now.
> 
> I still think si_addr would be more correct and simpler, but we can 
> leave this for a follow up fix.

Okay. Thanks.

>     // JVM_handle_linux_signal moves PC here if SIGILL happens
> 
>      >      > ----
>      >      >
>      >      > The changes in this patch:
>      >      >
>      >      > a) hotspot signal handling is now done by the following
>     functions:
>      >      >
>      >      > <OS>                 <foreign code, probably signal
>     handler too>
>      >      >    |                           |
>      >      >    v                           v
>      >      >   javaSignalHandler       JVM_handle_linux_signal()
>      >      >         |                   /
>      >      >         v                  v
>      >      >       javaSignalHandler_inner
>      >
>      >     Not clear why we need the _inner version. Why can't we just have
>      >     javaSignalHandler which is installed as the handler and which
>     is called
>      >     by JVM_handle_XXX_signal?
>      >
>      >
>      > Because JVM_handle_XXX_signal has one more argument than the
>     standard
>      > signal handler (abort_if_unrecognized).
> 
>     Okay so why introduce this shape instead of keeping the existing form:
> 
>     javaSignalHandler -> JVM_handle_xxx_signal(..., true) ->
>     javaSignalHandler_inner
> 
>     ? With the new arrangement the equivalence between javaSignalHandler
>     and
>     JVM_handle_xxx_signal can only be seen by inspecting the code of both.
> 
> 
> I see now what you mean.
> 
> To my mind, JVM_handle_linux_signal() is an external API with a contract 
> which does not necessarily correspond with 
> what javaSignalHandler_inner() is about to do. Its existence 
> demonstrates that (as an explicit side door into the signal handler 
> hierarchy), and that door can be guarded and adorned with pre- and 
> post-processing if needed.

In an abstract sense yes, but today javaSignalHandler does call 
JVM_handle_xxx_signal and it is that implementation that we have 
factored out into javaSignalHandler_inner.

> For example, the contract says that the user shall not pass anything 
> other than the given list of signals. Well, we can assert it here more 
> clearly. So users could use a debug VM to test their application. It 
> also means we can, for the release case, safely shortcut signals it 
> should ignore. That isolates this interface a bit from any future 
> changes to the hotspot signal handlers (because no-one ever thinks about 
> this stuff when working with the handlers).

I think you may be making this "contract" a bit more concrete than it 
actually is. The large comments block states:

// The user-defined signal handler must pass unrecognized signals to this
// routine,

which doesn't suggest only a specific sub-set. It also states:

// This routine may recognize any of the following kinds of signals:
//    SIGBUS, SIGSEGV, SIGILL, SIGFPE, SIGQUIT, SIGPIPE, SIGXFSZ, SIGUSR1.

but not that only says "may" - it is not IMO intended to be definitive, 
only indicative.

So to me the application handler would be free to pass all signals 
through (with abort_if_unrecognized==false) and use the return value to 
determine if this was a VM related signal.

> I know these are not hard arguments, it is a matter of taste. If you 
> insist, I change it in the way you prefer.
> 
>> > So I think JVM_handle_xx_signal() should test for the list of allowed
>> > signals, and just return false right away in case sig is none of the
>> > hotspot signals.
> 
>  > I don't think we should change existing behaviour here.
> 
> But is that not an error worth fixing? Before, someone could have passed 
> in 177 as a signal number and this would have crashed the VM if he also 
> passed in abort_if_unrecognized=true.

And this is exactly how it is supposed to work. There is a 
responsibility on the application code to either pass known good signals 
with abort_if_unrecognized, or else not set abort_if_unrecognized.

> This is a bit like my argument from above. I believe we have a clear 
> contract with this API. If that contract is broken, what should we do? I 
> prefer to assert, but in release case to be tolerant and ignore it. 
> Otherwise, what good is this contract if we cannot rely on it and don't 
> check it?

The "contract" includes the abort_if_unrecognized flag. If it is true 
then we enforce things, otherwise we don't.

>  > > // This routine may recognize any of the following kinds of signals:
>  > > //    SIGBUS, SIGSEGV, SIGILL, SIGFPE, SIGQUIT, SIGPIPE, SIGXFSZ, 
> SIGUSR1.
>  > > // It should be consulted by handlers for any of those signals.
>  > >
>  > > Note that this list is not really correct, as it includes SIGUSR1
>  > > and SIGQUIT. None of which are handled by the hotspot signal 
> handler. As
>  > > you wrote before, this mechanism is only to handle signals the hotspot
>  > > commandeers.
>  >
>  > SIGUSR1 is probably a leftover from when we did use SIGUSR1 for 
> something.
>  >
>  > SIGQUIT is interesting because if the app is in charge of signals then
>  > it will install a SIGQUIT handler and we would not want the "VM" to do
>  > it's normal SIGQUIT handler.
> 
> Hm, I can see it either way. SIGQUIT is for thread dumping. Whether or 
> not the application would want us to honor it we cannot say.

I'm always forgetting which of SIGQUIT and SIGINT do which (to me QUIT 
== exit the VM). But in both cases these are signals for which the JDK 
installs a handler.

> The current behavior would be to treat SIGQUIT as unknown signal (crash 
> or ignore). I would keep that behavior for now.
> 
>  > I have a big question mark over how the use
>  > of the signal handler thread can/should interact with
>  > AllowUserSignalHandlers.
> 
> What is the signal handler thread?

The signal handler thread is a Java thread that synchronously processes 
the signals that are managed by the os::user_handler(). But I realize 
now these signals can be controlled by -Xrs so that the application can 
manage them instead. (Though with no programmatic way to pass them on to 
the JVM AFAICS).

Thanks,
David
-----

> Thanks, Thomas


More information about the hotspot-dev mailing list