Misbehaving exit status from Hotspot

Charles Oliver Nutter headius at headius.com
Wed Jun 27 15:33:28 UTC 2018


Oops, in my editing of the post I lost the link to sources. Perhaps this
will illustrate the problem I'm talking about a bit better!

https://gist.github.com/headius/b87bc50b488fd73e753cbc518550ae5f

- Charlie

On Tue, Jun 26, 2018, 21:45 David Holmes <david.holmes at oracle.com> wrote:

> Hi Charlie,
>
> I don't know if you tried to attach your test programs but attachments
> get stripped. So just based on the descriptions ...
>
> On 27/06/2018 5:48 AM, Charles Oliver Nutter wrote:
> > A bit more background and info...
> >
> > This investigation was spawned by a JRuby bug:
> > https://github.com/jruby/jruby/issues/5224
> >
> > Zhengyu Gu pointed out that -XX:+ReduceSignalUsage allows my gisted
> > example to work as expected.
> >
> > $ ./sigtest `which java`  -XX:+ReduceSignalUsage Loop
> > pid: 28705
> > status: 15
> > exited: 0, stop signal: 0, term signal: 15, exit status: 0
> >
> > That isn't too surprising to me, but it's also undesirable to have to
> > pass this flag. Shouldn't Hotspot's shutdown handler be propagating
> > SIGTERM to the system-default handler as its final step? It seems like
> > that's the missing piece here, if I'm reading the tea leaves correctly.
>
> We (the VM) only chain user-handlers for signals. We never call the
> default handler to effect an abort - we can't as signal handling is
> asynchronous: we just notify the signal thread that a signal was raised
> and it then dispatches it to the Java level. If the signal is
> unexpected/unhandled and leading to a crash then we generate the hs_err
> file and explicitly call either abort() or exit(1) depending on the
> desire for a core file.
>
> SIGTERM is a termination signal for the JVM (SHUTDOWN3_SIGNAL - unless
> using -Xrs). It performs an orderly shutdown of the VM and exits. This
> is set up in:
>
> src/java.base/unix/classes/java/lang/Terminator.java
>
> It will cause execution of:
>
>   Shutdown.exit(sig.getNumber() + 0200);
>
> which performs the orderly shutdown (i.e. it causes shutdown hooks to run
> to completion) and should then exit with that exit code. And the exit
> code is the expected SIG+128.
>
> > Interestingly, just setting this flag and running JRuby is not enough to
> > "fix" our bug...I also need to add a downcall to raise(3) as part of
> > termination.
> >
> > Bottom line for me: Hotspot is not being a good actor wrt exit statuses
> > and signal handling, and it should at *least* produce valid exit
> > conditions for the process when terminated prematurely by a signal.
>
> Well it's not hotspot at fault if there is a "fault" here - unless
> there's some bug with the eventual process exit logic regarding the exit
> code. The signal has a handler installed and we invoke that handler
> which delegates to the Java shutdown logic as described. Hotspot doesn't
> try to second-guess what should happen after that.
>
> From your Ruby bug the issue seems to be that child processes are not
> being informed about the parent termination correctly. I can't really
> speak to that. Your quote from the libc manual is interesting:
>
>    /* Now reraise the signal.  We reactivate the signal’s
>       default handling, which is to terminate the process.
>       We could just call exit or abort,
>       but reraising the signal sets the return status
>       from the process correctly. */
>
> as it implies that exit/abort are broken if they don't set the process
> return status correctly! But it's not relevant to the regular SIGTERM
> case as we are not exiting from within a signal handler.
>
> AFAICS things work as expected:
>
>   > java Sleep &
> [1] 18175
>   > kill -TERM 18175
> [1]+  Exit 143                java Sleep
>
> Cheers,
> David
> -----
>
> > - Charlie
> >
> > On Tue, Jun 26, 2018 at 2:05 PM, Charles Oliver Nutter
> > <headius at headius.com> wrote:
> >
> >> (mods: previous version of this was sent without subscription complete;
> >> disregard)
> >>
> >> Hello all!
> >>
> >> I've been struggling to fix some signal-handling issues in JRuby and
> >> I've come to the determination that Hotspot is not being a good actor
> >> as far as signals and exit statuses go.
> >>
> >> I've put together some C and Java code to demonstrate the problem. I
> >> could have flaws in my understanding of signal handling and exit
> >> statuses.
> >>
> >> The test program just spawns a given command, waits for termination, and
> >> uses the wait(2) W macros to parse out the process exit states.
> >>
> >> "loop.c" just loops.
> >> "loop2.c" has the loop but also installs a TERM signal handler, closer
> >> in behavior to shutdown hooks in Ruby and JVM.
> >> "Loop.java" just loops.
> >>
> >> The first two produce the expected results...
> >>
> >> $ ./sigtest `pwd`/loop
> >> pid: 22130
> >> status: 15
> >> exited: 0, stop signal: 0, term signal: 15, exit status: 0
> >>
> >> $ ./sigtest `pwd`/loop2
> >> term received
> >> pid: 22173
> >> status: 15
> >> exited: 0, stop signal: 0, term signal: 15, exit status: 0
> >>
> >> Java produces nonsense results...
> >>
> >> $ ./sigtest `which java` Loop
> >> pid: 22136
> >> status: 36608
> >> exited: 1, stop signal: 143, term signal: 0, exit status: 143
> >>
> >> I have tried various combinations of using the sun.misc.Signal stuff,
> >> doing a native downcall to raise(3), and so on...but nothing helps,
> >> probably because Hotspot's own TERM handler is swallowing or otherwise
> >> mutilating the exit status.
> >>
> >> We do get bug reports about JRuby's exit statuses not making any sense.
> >> I assumed it was our fault until this week.
> >>
> >> Help?
> >>
> >> - Charlie
> >>
> >>
>


More information about the hotspot-runtime-dev mailing list