Misbehaving exit status from Hotspot

Sat Jun 30 06:20:57 UTC 2018

I'm sympathetic with what you want to do, but I would be scared to just change
the behavior of the VM in response to a signal, even in this relatively innocuous
way.  Surely something out there will break.

My next thought is to throw in a -XX:+DoSignalsCharliesWay flag (not its real
name).  The objection to that is that it adds an obscure corner to our testing
matrix.  Not insuperable, but it's not something our test harnesses are well
designed for.

BTW, why doesn't -Xrs work for you?  That's certainly closer to the mark.
Is it that you want to run some JRuby shutdown hooks and then trap out
(rather than exit)?  If so, I suppose you want some sort of -Xrs0.5.

There's a bunch of programmable random logic having to do with signals
already in HotSpot.  Perhaps you could build a plug-in of some sort to do
the signal-hacking you want on the OS's you care about?  Then it would
be your test matrixes that would deal with it.  :-)

This might not do what you want, but the doc on the signal chaining feature
is an introduction to the world of HotSpot signals:
  https://docs.oracle.com/javase/8/docs/technotes/guides/vm/signal-chaining.html

Underneath this stuff is a very simple API called JVM_handle_linux_signal.
This is the 'secret identity' of all the JVM's signal handlers.  If you know this
identity, then perhaps you can set your own signal handler in its place,
and delegate everything to JVM_handle_linux_signal.  Here's the signal
handler HotSpot uses for *all signals*:

http://hg.openjdk.java.net/jdk/jdk/file/9f62267e79df/src/hotspot/os/linux/os_linux.cpp#l4474
static void signalHandler(int sig, siginfo_t* info, void* uc) {
  assert(info != NULL && uc != NULL, "it must be old kernel");
  int orig_errno = errno;  // Preserve errno value over signal handler.
  JVM_handle_linux_signal(sig, info, uc, true);
  errno = orig_errno;
}

You'll have to cut-n-paste the errno-swapping bits of code, which by rights
should have been placed inside JVM_handle_linux_signal.  And you'll
have to throw a switch like -XX:+AllowUserSignalHandlers to make the
JVM stop looking over your shoulder.  Then if you can smuggle some
native code into the JVM startup sequence, you can install your own
signal handler over the top of the JVM's that watches for SIGTERM
and does what you want instead of what the JVM does.

Sorry it's not a simpler answer…

— John

On Jun 29, 2018, at 9:41 PM, Charles Oliver Nutter <headius at headius.com> wrote:
> 
> On 06/29/2018 01:51 AM, David Holmes wrote:
>> "Such a handler should end by specifying the default action for the signal that happened and then reraising it; this will cause the program to terminate with that signal, as if it had not had a handler."
> 
> Yes, this is really what set me down the path of wishing Hotspot would
> do the same thing. This and the fact that CRuby does it, and I can't
> fit into certain CRuby deployments because JRuby can't emulate the
> signal results.
> 
> On Fri, Jun 29, 2018 at 4:50 AM, Florian Weimer <fweimer at redhat.com> wrote:
>> The advice seems appropriate to me for handlers that lead to termination, as generally intended for these signals.  SIGQUIT doesn't do that for the JVM, so the advice doesn't apply.  SIGTERM appears to do so.  So why not preserve the information that the process was shut down by SIGTERM by reraising the signal?  This might confer useful information to the caller.
> 
> You've got my vote!
> 
> I think it's worth enumerating the pros and cons, eh?
> 
> Con:
> 
> * waitpid-related macros would now show termination due to a signal
> rather than a normal exit.
> 
> Pro:
> 
> * waitpid-related macros would now show termination due to a signal
> rather than a normal exit.
> * They'd also show the actual signal value in termsig rather than the 128+N.
> * The actual command line exit code would remain unchanged.
> 
> So...taking this another step...
> 
> Currently, you can *only* rely on the command line exit code, because
> the waitpid macros just say it was a normal exit, and the rest of
> their values are nonsense. So anyone writing process-management stuff
> for Hotspot subprocesses can only use the exit code (128+N) to detect
> that the exit was due to TERM.
> 
> And if we changed it? Well, the above would continue to work exactly
> as it does now, but folks expecting GNU-like TERM handling
> (propagation to default handler) would suddenly start to work with
> Hotspot.
> 
> Obviously I'm in favor of this, so I'd like to understand what this
> change would break. It seems like a net positive.
> 
> - Charlie