Misbehaving exit status from Hotspot

David Holmes david.holmes at oracle.com
Wed Jun 27 02:45:25 UTC 2018


Hi Charlie,

I don't know if you tried to attach your test programs but attachments 
get stripped. So just based on the descriptions ...

On 27/06/2018 5:48 AM, Charles Oliver Nutter wrote:
> A bit more background and info...
> 
> This investigation was spawned by a JRuby bug:
> https://github.com/jruby/jruby/issues/5224
> 
> Zhengyu Gu pointed out that -XX:+ReduceSignalUsage allows my gisted example
> to work as expected.
> 
> $ ./sigtest `which java`  -XX:+ReduceSignalUsage Loop
> pid: 28705
> status: 15
> exited: 0, stop signal: 0, term signal: 15, exit status: 0
> 
> That isn't too surprising to me, but it's also undesirable to have to pass
> this flag. Shouldn't Hotspot's shutdown handler be propagating SIGTERM to
> the system-default handler as its final step? It seems like that's the
> missing piece here, if I'm reading the tea leaves correctly.

We (the VM) only chain user-handlers for signals. We never call the 
default handler to effect an abort - we can't as signal handling is 
asynchronous: we just notify the signal thread that a signal was raised 
and it then dispatches it to the Java level. If the signal is 
unexpected/unhandled and leading to a crash then we generate the hs_err 
file and explicitly call either abort() or exit(1) depending on the 
desire for a core file.

SIGTERM is a termination signal for the JVM (SHUTDOWN3_SIGNAL - unless
using -Xrs). It performs an orderly shutdown of the VM and exits. This
is set up in:

src/java.base/unix/classes/java/lang/Terminator.java

It will cause execution of:

  Shutdown.exit(sig.getNumber() + 0200);

which performs the orderly shutdown (i.e. it causes shutdown hooks to run
to completion) and should then exit with that exit code. Since 0200
octal is 128, the exit code is the expected SIG+128, i.e. 143 for
SIGTERM (15).

> Interestingly, just setting this flag and running JRuby is not enough to
> "fix" our bug...I also need to add a downcall to raise(3) as part of
> termination.
> 
> Bottom line for me: Hotspot is not being a good actor wrt exit statuses and
> signal handling, and it should at *least* produce valid exit conditions for
> the process when terminated prematurely by a signal.

Well it's not hotspot at fault if there is a "fault" here - unless 
there's some bug with the eventual process exit logic regarding the exit 
code. The signal has a handler installed and we invoke that handler 
which delegates to the Java shutdown logic as described. Hotspot doesn't 
try to second-guess what should happen after that.

From your Ruby bug the issue seems to be that child processes are not
being informed about the parent termination correctly. I can't really 
speak to that. Your quote from the libc manual is interesting:

   /* Now reraise the signal.  We reactivate the signal’s
      default handling, which is to terminate the process.
      We could just call exit or abort,
      but reraising the signal sets the return status
      from the process correctly. */

as it implies that exit/abort are broken if they don't set the process 
return status correctly! But it's not relevant to the regular SIGTERM 
case as we are not exiting from within a signal handler.

AFAICS things work as expected:

  > java Sleep &
[1] 18175
  > kill -TERM 18175
[1]+  Exit 143                java Sleep

Cheers,
David
-----

> - Charlie
> 
> On Tue, Jun 26, 2018 at 2:05 PM, Charles Oliver Nutter <headius at headius.com>
> wrote:
> 
>> (mods: previous version of this was sent without subscription complete;
>> disregard)
>>
>> Hello all!
>>
>> I've been struggling to fix some signal-handling issues in JRuby and I've
>> come to the determination that Hotspot is not being a good actor as far as
>> signals and exit statuses go.
>>
>> I've put together some C and Java code to demonstrate the problem. I could
>> have flaws in my understanding of signal handling and exit statuses.
>>
>> The test program just spawns a given command, waits for termination, and
>> uses the wait(2) W macros to parse out the process exit states.
>>
>> "loop.c" just loops.
>> "loop2.c" has the loop but also installs a TERM signal handler, closer in
>> behavior to shutdown hooks in Ruby and JVM.
>> "Loop.java" just loops.
>>
>> The first two produce the expected results...
>>
>> $ ./sigtest `pwd`/loop
>> pid: 22130
>> status: 15
>> exited: 0, stop signal: 0, term signal: 15, exit status: 0
>>
>> $ ./sigtest `pwd`/loop2
>> term received
>> pid: 22173
>> status: 15
>> exited: 0, stop signal: 0, term signal: 15, exit status: 0
>>
>> Java produces nonsense results...
>>
>> $ ./sigtest `which java` Loop
>> pid: 22136
>> status: 36608
>> exited: 1, stop signal: 143, term signal: 0, exit status: 143
>>
>> I have tried various combinations of using the sun.misc.Signal stuff,
>> doing a native downcall to raise(3), and so on...but nothing helps,
>> probably because the Hotspot's own TERM handler is swallowing or otherwise
>> mutilating the exit status.
>>
>> We do get bug reports about JRuby's exit statuses not making any sense. I
>> assumed it was our fault until this week.
>>
>> Help?
>>
>> - Charlie
>>
>>


More information about the hotspot-runtime-dev mailing list