Signal chaining, JVM_handle_(linux|bsd|aix)_signal, and backward compatibility
Thomas Stüfe
thomas.stuefe at gmail.com
Tue Nov 3 09:05:37 UTC 2020
On Tue, Nov 3, 2020 at 9:56 AM David Holmes <david.holmes at oracle.com> wrote:
> On 3/11/2020 3:42 pm, Thomas Stüfe wrote:
> > Hi David,
> >
> > On Tue, Nov 3, 2020 at 4:27 AM David Holmes <david.holmes at oracle.com> wrote:
> >
> > Hi Thomas,
> >
> > On 2/11/2020 5:40 pm, Thomas Stüfe wrote:
> > > Hi,
> > >
> > > While working on some signal handler fixes and cleanups ([1]), I
> > noticed
> > > that we export JVM_handle_(linux|bsd)_signal() on all POSIX
> > platforms.
> > >
> > > I wondered why we do this. See also my first question at
> > runtime-dev [2] -
> > > the initial reaction was "probably no reason anymore".
> > >
> > > But then, I see a carefully crafted comment in [3]:
> > >
> > > <quote>
> > > // This routine may be used by user applications as a "hook" to catch
> > > // signals.
> > > // The user-defined signal handler must pass unrecognized signals to this
> > > // routine
> > > ...
> > > </quote>
> > >
> > > and I also found some bug reports [4], [5], from 2001 and 2004:
> > > JDK-4864136 : "JVM_handle_linux_signal is private in 1.4.2-beta"
> > > JDK-4408646 : "JVM_handle_solaris_signal must be a global
> function"
> > >
> > > So, to me this looks like JVM_handle_xxx_signal() was an
> > official, or at
> > > least deliberate, interface for foreign code to interact with our
> > signal
> > > handling. Specifically, this reads like a third party could
> > install its
> > > signal handlers over ours, and as long as it queried our signal
> > handler
> > > first by calling JVM_handle_xxx_signal(), we still could coexist
> > with it.
> >
> > Did you see this comment:
> >
> >
> https://bugs.openjdk.java.net/browse/JDK-4361067?focusedCommentId=12425956&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12425956
> >
> > This is tied to AllowUserSignalHandlers so we'd have to go through
> the
> > deprecation process to make any changes in this area.
> >
> >
> > Excellent point. Thanks for digging that up. Would love to know who
> > wrote this code + comment.
> >
> > (Note that I believe this comment uses the term "signal chaining" to
> > describe a general concept, not the solution implemented in hotspot
> > today via the UseSignalChaining + libjsig).
>
> Right.
>
>
> > Looked at this and a cursory glance through github sources shows that
> > this flag is used quite a lot, and I think it is useful too (I remember
> > Florian Weimer's question about it here:
> > https://mail.openjdk.java.net/pipermail/discuss/2019-April/005046.html
> -
> > and this is exactly the answer, this would be the alternative).
> >
> > But man, this raises so many questions. I see that
> > AllowUserSignalHandlers had been implemented by just refusing to install
> > the central hotspot signal handler. But it's not honored for the
> > installation of the SR handler, nor for anything which gets installed
> > via os::signal(). So we have bitrot right there already. Then, it can
> > never have worked on AIX since we have a day-zero bug in our port. And so
> > on...
>
> I wouldn't classify it like that. Hotspot commandeers very specific
> signals for internal purposes eg SEGV. If an application has its own
> custom SEGV handling then you need to coordinate between them. In
> contrast SR_signum is by default a user-defined signal and if that use
> conflicts with an app then it can redefine SR_signum. os::signal is for
> application defined signal handlers so again not specifically related to
> internal hotspot uses. There are a number of different aspects to signal
> handling. I think AllowUserSignalHandlers is fundamentally for things
> like SEGV handling.
>
Good point.
>
> > > But now we have signal chaining [6], which achieves the same by
> > preloading
> > > the libjsig library. Which some consider to be ewww [7] :-), but
> > still
> > > seems to me like the official solution to that problem. Using
> signal
> > > chaining there would be no need to manually call
> > JVM_handle_xxx_signal()
> > > from outside since the interposed sigaction() in libjsig would
> > take care of
> > > all this stuff automatically.
> >
> > Aren't the two mechanisms complementary in that they work in opposite
> > ways? Signal chaining keeps the VM in control and allows it to call
> > application handlers. AllowUserSignalHandlers keeps the app in
> control
> > and requires it to call VM handlers.
> >
> >
> > I don't believe so. I do not have the means to check it, but would not
> > be surprised if signal chaining was actually the later invention.
>
> Yes later but the two still operate in opposite ways.
>
> > We have three mechanisms:
> > A) AllowUserSignalHandlers, which just refuses to install signal
> > handlers (thereby effectively disabling UseSignalChaining), but user
> > handler needs to call our handler
>
> Yes this puts the application in charge and they have to know what
> signals the VM wants to know about and call our handlers.
>
> > B) UseSignalChaining, which handles the case where we come after the
> > user handler - we install our handlers, but remember the user handler
> > and later invoke it
> > C) libjsig interposition, which handles the case where the user handler
> > gets installed over our handler - libjsig prevents this and we end up
> > with the same signal handler chain as in (B)
> >
> > (B) and (C) (together termed "signal chaining" now) are a complete
> > replacement for (A). (A) is somewhat simpler, especially with the
> > interposition stuff, and gives the user handler more control, since it
> > can decide to handle signals before we get to see them. Whether or not
> > that's wise is another thing.
> >
> > Very probably (A) should not be used together with (C) or (B).
>
> (A) puts the host app in charge.
>
> (B) and (C) should be used together to let the application handlers take
> a subservient role to the hotspot handlers. The JVM is in charge.
>
> If you had an existing application that performed signal management and
> then wanted to also embed a JVM you want to use (A), not have to rewrite
> your application so that (B)+(C) work. But in addition if you also allow
> native code that might have its own signal handling requirements then
> that part may require signal-chaining to work as well - so these
> mechanisms are not mutually exclusive.
>
>
Yes, well explained. I realize that we actually mean the same thing. So, I
will definitely not remove that flag nor the exports.
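
To make the "application in charge" arrangement of (A) concrete, here is a
minimal C sketch. It assumes the hook keeps the shape of
JVM_handle_linux_signal(): signal number, siginfo, ucontext and
abort_if_unrecognized, with a non-zero return meaning the VM handled the
signal. vm_handle_signal_stub() is a hypothetical stand-in so the sketch runs
without a JVM, and SIGUSR1 is used only because it is safe to raise. None of
this is taken from the hotspot sources.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Counters so the caller can see who ended up handling a signal. */
static volatile sig_atomic_t vm_claims_signal = 0;
static volatile sig_atomic_t handled_by_vm = 0;
static volatile sig_atomic_t handled_by_app = 0;

/* Hypothetical stand-in for the exported VM hook (assumed shape only).
 * Returns non-zero if the "VM" recognized and handled the signal. */
static int vm_handle_signal_stub(int sig, siginfo_t *info, void *ucontext,
                                 int abort_if_unrecognized) {
    (void)sig; (void)info; (void)ucontext; (void)abort_if_unrecognized;
    if (vm_claims_signal) {
        handled_by_vm++;
        return 1;
    }
    return 0;
}

/* The application's own handler: it owns the signal, but asks the VM first
 * and only does its own handling if the VM did not recognize the signal. */
static void app_handler(int sig, siginfo_t *info, void *ucontext) {
    if (vm_handle_signal_stub(sig, info, ucontext, 0)) {
        return; /* the VM handled it, so return immediately */
    }
    handled_by_app++;
}

/* Install the application handler with SA_SIGINFO so it receives the
 * siginfo and ucontext that the VM hook expects. */
static void install_app_handler(int sig) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = app_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(sig, &sa, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Raise the signal synchronously; vm_claims selects whether the stub VM
 * pretends to recognize it. */
static int deliver(int sig, int vm_claims) {
    vm_claims_signal = vm_claims;
    return raise(sig);
}
```

Calling install_app_handler(SIGUSR1) and then deliver(SIGUSR1, 1) bumps only
handled_by_vm, while deliver(SIGUSR1, 0) falls through to the application.
With (B) and (C) the order is reversed: the VM handler runs first and passes
unrecognized signals on to the saved application handler.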
> >
> > > My question is:
> > > - if JVM_handle_xxx_signal() is (had been?) an official
> > interface, should
> > > there not be some official documentation and regression tests?
> > Was there? I
> > > could not find anything.
> > > - if it is not an official interface, would it be okay to lay it
> > to rest?
> > > - Pro: it is made obsolete by signal chaining, and removing it
> > would
> > > reduce complexity of signal handling somewhat
> > > - Con: applications may exist out there which use that
> > interface, and I'd
> > > hate to break them. We give customers enough reasons to cling to
> > old JDK
> > > versions as it is.
> > > Also, arguably, this interface is useful. Not everyone likes
> > > preloading libjsig, and the signal chaining mechanism is also a
> > whole lot
> > > more fragile.
> > >
> > > However, if we let it live on, should we not document and test it?
> >
> > It is not well documented that is for sure.
> >
> > Not sure about testing ... I tend to see this more a "best effort"
> > mechanism. If someone reports it stops working, or doesn't work, then
> > we'll make a best effort to fix it.
> >
> >
> > If it is "best effort", can I just stub them out and let them return 0
> > always then?
> > :-) Just kidding. This is the same argument as in "int random() {
> > return 0; } is technically valid".
> >
> > But seriously, tests would give me some measure of safety while working
> > with this code, as well as preventing bitrot or accidental removal. Just
> > look at the people queuing up to change or remove those handlers - see
> > e.g. https://github.com/openjdk/jdk/pull/636.
>
> 8253742 is supposed to be about cleanup/consolidation not removal!
>
>
He does not really remove them, just renames them, but the effect would have
been the same.
But there was a lot of confusion surrounding JVM_handle_xxx_signal. Since
documentation is almost non-existent, and there are no tests, how could we
know?
> Yes tests would do what you say. But writing applications that exercise
> these different signal management aspects would be rather complicated
> and always potentially incomplete. How would you write such a test? A
> custom launcher that acts as a hosting app which sets
> -XX:+AllowUserSignalHandlers and then installs its own handler that
> calls the JVM handler would not exactly prove anything. To see if it was
> working correctly you'd need to run all the regular tests under this
> hosting application.
>
It is difficult, yes. But some basic tests could be to
- test that this symbol is exported (could be a simple nm call)
- maybe, just call that thing with abort_if_unrecognized = false and some
fake info which looks like an unrelated SEGV that hotspot should ignore.
That could be done in a gtest in-vm.
I'll see if I can cook up something in the course of JDK-8255711.
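
As a rough illustration of the second idea, the C sketch below builds a fake
siginfo for a SIGSEGV at an address the VM does not care about and checks that
the hook declines it. Everything here is an assumption for illustration:
stub_vm_hook() and vm_owned_region are hypothetical stand-ins for the real
hook and for whatever memory the VM would recognize, and the hook is assumed
to take the four arguments discussed above. The stub ignores the ucontext, so
passing NULL is fine here; a real in-VM test would probably need a plausible
ucontext as well.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed shape of the hook: non-zero means "the VM handled it". */
typedef int (*vm_signal_hook_t)(int sig, siginfo_t *info, void *ucontext,
                                int abort_if_unrecognized);

/* Hypothetical memory range that the stub VM treats as its own. */
static char vm_owned_region[64];

/* Hypothetical stand-in for the VM hook: it only recognizes SIGSEGV
 * faults whose address lies inside vm_owned_region. */
static int stub_vm_hook(int sig, siginfo_t *info, void *ucontext,
                        int abort_if_unrecognized) {
    (void)ucontext; (void)abort_if_unrecognized;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t lo = (uintptr_t)vm_owned_region;
    uintptr_t hi = lo + sizeof vm_owned_region;
    return sig == SIGSEGV && addr >= lo && addr < hi;
}

/* Call the hook with a fabricated SIGSEGV at fault_addr and
 * abort_if_unrecognized = 0, returning whatever the hook returns. */
static int probe_with_fake_segv(vm_signal_hook_t hook, void *fault_addr) {
    assert(hook != NULL);
    siginfo_t info;
    memset(&info, 0, sizeof info);
    info.si_signo = SIGSEGV;
    info.si_code = SEGV_MAPERR;
    info.si_addr = fault_addr;
    return hook(SIGSEGV, &info, NULL, 0);
}
```

A test would then expect probe_with_fake_segv() to return 0 for an unrelated
address and non-zero for an address inside the region the hook owns.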
Cheers, Thomas
> Cheers,
> David
> -----
>
> >
> > > I am really interested in the history of this. Was this just a
> > solution for
> > > a single customer?
> >
> > Not sure of specific details but it was for hosting the VM in native
> > applications that already did their own signal management.
> >
> > Cheers,
> > David
> >
> >
> > Thanks for the infos!
> >
> > ..Thomas
> >
> > >
> > > Thanks, Thomas
> > >
> > >
> > > [1] https://bugs.openjdk.java.net/browse/JDK-8255711
> > > [2]
> > >
> >
> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-October/043145.html
> > > [3]
> > >
> >
> https://github.com/openjdk/jdk/blob/64feeab70af61a52ffe4c64df87a33c16754de18/src/hotspot/os/posix/signals_posix.cpp#L411
> > > [4] https://bugs.openjdk.java.net/browse/JDK-4864136
> > > [5] https://bugs.openjdk.java.net/browse/JDK-4408646
> > > [6]
> > >
> >
> https://docs.oracle.com/javase/8/docs/technotes/guides/vm/signal-chaining.html
> > > [7]
> >
> https://mail.openjdk.java.net/pipermail/discuss/2019-April/005042.html
> > >
> >
>