RFR(s): 8144219: [posix] Remove redundant code around os::print_siginfo()

David Holmes david.holmes at oracle.com
Fri Dec 11 10:37:42 UTC 2015


On 11/12/2015 6:34 PM, Thomas Stüfe wrote:
> Hi David,
>
> New webrev.03:
> http://cr.openjdk.java.net/~stuefe/webrevs/8144219-posix-os-print_siginfo/webrev.03/webrev/
> And the delta to webrev.02:
> http://cr.openjdk.java.net/~stuefe/webrevs/8144219-posix-os-print_siginfo/webrev.03-base-02/webrev/index.html
>
> Just worked in your suggestions and nothing else.

Thanks - looks good!

> --
>
> On Solaris I found a curious behaviour, btw: there, when sending
> synchronous error signals from the outside via kill (e.g. kill -fpe
> <pid>), the vm immediately crashes with "Segmentation fault" as if no
> signal handler were installed. On Linux (and afaik other Unices), there
> is no difference between an internally generated and an externally sent
> synchronous signal - in both cases our signal handler jumps in an
> produces an hs-err pid. The only difference is that the si_code is
> different.
>
> I tested this with and without my change, both sides behave the same, so
> I think this is a Solaris speciality. Curious, nonetheless.

Does it depend on which thread actually has the signal delivered to it?

David

> Thanks, Thomas
>
>
> On Fri, Dec 11, 2015 at 2:49 AM, David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
>     Hi Thomas,
>
>     This all looks fine to me - just some minor nits below :). I'm
>     running this through JPRT to see if everything works okay. Still
>     need a second reviewer of course.
>
>     Nit: I find the notation:
>
>     const /* real-type */ void* siginfo
>
>     somewhat distracting - especially when we see siginfo being cast to
>     the correct type within the next two lines.
>
>     ---
>
>     src/os/posix/vm/os_posix.cpp
>
>     Nit: os->print(" (me)"); me -> current process ?
>
>     ---
>
>     src/share/vm/utilities/vmError.cpp
>
>     Nit: CDS. -> CDS archive.
>
>     ---
>
>     src/share/vm/utilities/vmError.hpp
>
>     Nit: CDS store -> CDS archive
>
>
>     Thanks,
>     David
>
>     On 10/12/2015 12:22 AM, Thomas Stüfe wrote:
>
>         Hi David,
>
>         thanks for the review!
>
>         my new webrev:
>         http://cr.openjdk.java.net/~stuefe/webrevs/8144219-posix-os-print_siginfo/webrev.02/webrev/
>
>         On Tue, Dec 8, 2015 at 7:57 AM, David Holmes
>         <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>         <mailto:david.holmes at oracle.com
>         <mailto:david.holmes at oracle.com>>> wrote:
>
>              Hi Thomas,
>
>              Sorry for the delay.
>
>              On 30/11/2015 8:03 PM, Thomas Stüfe wrote:
>
>                  Hi all,
>
>                  please review this small change:
>
>
>              Maybe not so small :)
>
>
>         Well. I did name it (s), not (xxs) :)
>
>
>
>                  Webrev:
>         http://cr.openjdk.java.net/~stuefe/webrevs/8144219-posix-os-print_siginfo/webrev.00/webrev/
>                  Bug Report:
>         https://bugs.openjdk.java.net/browse/JDK-8144219
>
>                  This is mainly a cleanup change for posix platforms. It
>         gets rid of
>                  multiple copies of os::print_siginfo() in favour of the
>         central
>                  os::Posix::print_siginfo() in os_posix.cpp.
>
>
>              Something seems not right. Currently vmError calls
>              os::print_siginfo, and the per-posix-platform
>         os::print_siginfo then
>              call os::Posix::print_siginfo_brief. And now you have
>              os::Posix::print_siginfo in os_posix.cpp, but that should be
>              os::print_siginfo - no Posix namespace ??
>
>
>         There is os::print_siginfo(outputStream*, void*), for which
>         exist two
>         implementations, one for Windows, one for Posix. In my first
>         webrev I
>         also kept os::Posix::print_siginfo(outputStream*, siginfo_t*),
>         because I
>         wanted a version which takes a real siginfo_t*.
>         os::print_siginfo(outputStream*, void*) for posix just called
>         os::Posix::print_siginfo(outputStream*, siginfo_t*).
>
>         To simplify things, I removed the os::Posix::print_siginfo(). Now we
>         just have os::print_siginfo(outputStream*, const void*) in two
>         variants,
>         one for windows, one for posix.
>
>         (I also made the pointer const for const correctness)
>
>                  It also extends os::Posix::print_siginfo() to write
>         more details
>                  about the
>                  received signal which stems from our experiences with
>         various
>                  Unices over
>                  the time. This will improve hs-err file printout.
>
>
>              The old code was deliberately intended as a simple
>         one-liner for the
>              error report. But okay ....
>
>              987   os->print_cr("siginfo @ " PTR_FORMAT ":", p2i(si));
>
>              What is the value in printing the pointer to the siginfo_t ??
>
>              I personally prefer to see si_signo, si_code and si_errno to be
>              clear that what is printed are the fields of the siginfo_t.
>
>              1030 #ifdef _AIX
>              1031   } else if (si->si_code == SI_EMPTY) {
>              1032     os->print_cr("  Signal may have been sent by
>         another thread.");
>              1033 #endif
>              1034 #ifdef __linux
>              1035   } else if (si->si_code == SI_TKILL) {
>              1036     os->print_cr("  Signal has been sent by another
>         thread.");
>              1037 #endif
>              1038 #ifdef __sun
>              1039   } else if (si->si_code == SI_LWP) {
>              1040     os->print_cr("  Signal has been sent by another
>         thread.");
>              1041 #endif
>              1042   } else if (sig == SIGSEGV || sig == SIGBUS) {
>              1043     os->print_cr("  Address of faulting memory
>         reference: "
>              PTR_FORMAT, p2i(si->si_addr));
>              1044   } else if (sig == SIGILL || sig == SIGFPE) {
>              1045     os->print_cr("  Address of faulting instruction: "
>              PTR_FORMAT, p2i(si->si_addr));
>              1046   } else if (sig == SIGTRAP) {
>              1047     os->print_cr("  Address of trap instruction: "
>         PTR_FORMAT,
>              p2i(si->si_addr));
>              1048 #ifndef __APPLE__
>              1049   } else if (sig == SIGPOLL) {
>              1050     os->print_cr("  Band event: " PTR64_FORMAT,
>              (uint64_t)si->si_band);
>              1051 #endif
>
>              These should be conditional on the specific signal or value
>         being
>              defined, not platform specific. So the __APPLE__  goes back to
>              SIGPOLL (even if it is always defined you'll never receive
>         it on
>              platforms that don't use it). The others should check
>         SI_EMPTY etc.
>
>              Note that there are a number of other possibilities just on
>         Solaris:
>
>         https://docs.oracle.com/cd/E23823_01/html/816-5173/siginfo-3head.html
>
>              That said we don't need to give this much detail in my
>         opinion. If
>              you get a si_code of SI_LWP for example then you can go an
>         look at
>              the manpage to find out exactly what it means - we don't
>         want to be
>              maintaining that information here. So I'm not sure we really
>              need/want this expanded level of detail.
>
>
>         Okay. I think we (SAP) do not care so much about the size of the
>         hs-err
>         file and trade a bit more verbosity in return for giving more
>         detailled
>         information to our support people. But I can see your point, and
>         it is a
>         matter of preferences.
>
>         In the new version, I kept the output of os::print_siginfo very
>         compact
>         and on one line. We print si_signo, si_code unconditionally,
>         si_errno if
>         set, and then decide depending on the code whether to print
>         si_addr or
>         si_pid/si_uid or si_band.
>
>
>                  Note that there was on all platforms code which
>         examines the
>                  crash address
>                  for a possible CDS fault and prints a specialized hint.
>         In my
>                  opinion that
>                  makes no sense in a general os::print_siginfo()
>         function and I
>                  moved it to
>                  a separate error reporting step into vmError.cpp. This
>         may be a
>                  point of
>                  contention because the way to extract the fault address
>         from a
>                  siginfo/EXCEPTION_RECORD structure is platform
>         dependend, but
>                  imho this is
>                  still the least ugly solution.
>
>
>              Sorry I don't like seeing the _WIN32 condition in the
>         vmError.cpp
>              code. By keeping this as part of the print_siginfo the platform
>              specific aspect is handled transparently. As something that
>         is only
>              used as part of error reporting, and which is directly
>         related to a
>              signal being received the current placement doesn't seem
>              unreasonable. If you really want to factor this out then I
>         would add
>              os::check_for_failing_cds_access() or something like that.
>
>
>         I really did not like this in os::print_siginfo(). One does not
>         expect
>         os::print_siginfo() to know about cds specialities, and to know
>         that it
>         only gets called in the context of error handling (who says this is
>         always true?)
>
>         But I agree the platform dependend code in vmError.cpp is ugly.
>         I moved
>         it to vmError_posix.cpp and vmError_windows.cpp respectively, and
>         renamed it per your suggestion. I did not want to add this to
>         the os::
>         namespace because it is really a highly specialized function
>         which only
>         makes sense in error handling.
>
>         Thanks & Regards, Thomas
>
>              Thanks,
>              David
>              -----
>
>                  Kind regards, Thomas
>
>
>


More information about the hotspot-runtime-dev mailing list