RFR(s): 8144219: [posix] Remove redundant code around os::print_siginfo()
David Holmes
david.holmes at oracle.com
Fri Dec 11 10:37:42 UTC 2015
On 11/12/2015 6:34 PM, Thomas Stüfe wrote:
> Hi David,
>
> New webrev.03:
> http://cr.openjdk.java.net/~stuefe/webrevs/8144219-posix-os-print_siginfo/webrev.03/webrev/
> And the delta to webrev.02:
> http://cr.openjdk.java.net/~stuefe/webrevs/8144219-posix-os-print_siginfo/webrev.03-base-02/webrev/index.html
>
> Just worked in your suggestions and nothing else.
Thanks - looks good!
> --
>
> On Solaris I found a curious behaviour, btw: there, when sending
> synchronous error signals from the outside via kill (e.g. kill -fpe
> <pid>), the vm immediately crashes with "Segmentation fault" as if no
> signal handler were installed. On Linux (and afaik other Unices), there
> is no difference between an internally generated and an externally sent
> synchronous signal - in both cases our signal handler jumps in an
> produces an hs-err pid. The only difference is that the si_code is
> different.
>
> I tested this with and without my change, both sides behave the same, so
> I think this is a Solaris speciality. Curious, nonetheless.
Does it depend on which thread actually has the signal delivered to it?
David
> Thanks, Thomas
>
>
> On Fri, Dec 11, 2015 at 2:49 AM, David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
> Hi Thomas,
>
> This all looks fine to me - just some minor nits below :). I'm
> running this through JPRT to see if everything works okay. Still
> need a second reviewer of course.
>
> Nit: I find the notation:
>
> const /* real-type */ void* siginfo
>
> somewhat distracting - especially when we see siginfo being cast to
> the correct type within the next two lines.
>
> ---
>
> src/os/posix/vm/os_posix.cpp
>
> Nit: os->print(" (me)"); me -> current process ?
>
> ---
>
> src/share/vm/utilities/vmError.cpp
>
> Nit: CDS. -> CDS archive.
>
> ---
>
> src/share/vm/utilities/vmError.hpp
>
> Nit: CDS store -> CDS archive
>
>
> Thanks,
> David
>
> On 10/12/2015 12:22 AM, Thomas Stüfe wrote:
>
> Hi David,
>
> thanks for the review!
>
> my new webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8144219-posix-os-print_siginfo/webrev.02/webrev/
>
> On Tue, Dec 8, 2015 at 7:57 AM, David Holmes
> <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
> <mailto:david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>> wrote:
>
> Hi Thomas,
>
> Sorry for the delay.
>
> On 30/11/2015 8:03 PM, Thomas Stüfe wrote:
>
> Hi all,
>
> please review this small change:
>
>
> Maybe not so small :)
>
>
> Well. I did name it (s), not (xxs) :)
>
>
>
> Webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8144219-posix-os-print_siginfo/webrev.00/webrev/
> Bug Report:
> https://bugs.openjdk.java.net/browse/JDK-8144219
>
> This is mainly a cleanup change for posix platforms. It
> gets rid of
> multiple copies of os::print_siginfo() in favour of the
> central
> os::Posix::print_siginfo() in os_posix.cpp.
>
>
> Something seems not right. Currently vmError calls
> os::print_siginfo, and the per-posix-platform
> os::print_siginfo then
> call os::Posix::print_siginfo_brief. And now you have
> os::Posix::print_siginfo in os_posix.cpp, but that should be
> os::print_siginfo - no Posix namespace ??
>
>
> There is os::print_siginfo(outputStream*, void*), for which
> exist two
> implementations, one for Windows, one for Posix. In my first
> webrev I
> also kept os::Posix::print_siginfo(outputStream*, siginfo_t*),
> because I
> wanted a version which takes a real siginfo_t*.
> os::print_siginfo(outputStream*, void*) for posix just called
> os::Posix::print_siginfo(outputStream*, siginfo_t*).
>
> To simplify things, I removed the os::Posix::print_siginfo(). Now we
> just have os::print_siginfo(outputStream*, const void*) in two
> variants,
> one for windows, one for posix.
>
> (I also made the pointer const for const correctness)
>
> It also extends os::Posix::print_siginfo() to write
> more details
> about the
> received signal which stems from our experiences with
> various
> Unices over
> the time. This will improve hs-err file printout.
>
>
> The old code was deliberately intended as a simple
> one-liner for the
> error report. But okay ....
>
> 987 os->print_cr("siginfo @ " PTR_FORMAT ":", p2i(si));
>
> What is the value in printing the pointer to the siginfo_t ??
>
> I personally prefer to see si_signo, si_code and si_errno to be
> clear that what is printed are the fields of the siginfo_t.
>
> 1030 #ifdef _AIX
> 1031 } else if (si->si_code == SI_EMPTY) {
> 1032 os->print_cr(" Signal may have been sent by
> another thread.");
> 1033 #endif
> 1034 #ifdef __linux
> 1035 } else if (si->si_code == SI_TKILL) {
> 1036 os->print_cr(" Signal has been sent by another
> thread.");
> 1037 #endif
> 1038 #ifdef __sun
> 1039 } else if (si->si_code == SI_LWP) {
> 1040 os->print_cr(" Signal has been sent by another
> thread.");
> 1041 #endif
> 1042 } else if (sig == SIGSEGV || sig == SIGBUS) {
> 1043 os->print_cr(" Address of faulting memory
> reference: "
> PTR_FORMAT, p2i(si->si_addr));
> 1044 } else if (sig == SIGILL || sig == SIGFPE) {
> 1045 os->print_cr(" Address of faulting instruction: "
> PTR_FORMAT, p2i(si->si_addr));
> 1046 } else if (sig == SIGTRAP) {
> 1047 os->print_cr(" Address of trap instruction: "
> PTR_FORMAT,
> p2i(si->si_addr));
> 1048 #ifndef __APPLE__
> 1049 } else if (sig == SIGPOLL) {
> 1050 os->print_cr(" Band event: " PTR64_FORMAT,
> (uint64_t)si->si_band);
> 1051 #endif
>
> These should be conditional on the specific signal or value
> being
> defined, not platform specific. So the __APPLE__ goes back to
> SIGPOLL (even if it is always defined you'll never receive
> it on
> platforms that don't use it). The others should check
> SI_EMPTY etc.
>
> Note that there are a number of other possibilities just on
> Solaris:
>
> https://docs.oracle.com/cd/E23823_01/html/816-5173/siginfo-3head.html
>
> That said we don't need to give this much detail in my
> opinion. If
> you get a si_code of SI_LWP for example then you can go an
> look at
> the manpage to find out exactly what it means - we don't
> want to be
> maintaining that information here. So I'm not sure we really
> need/want this expanded level of detail.
>
>
> Okay. I think we (SAP) do not care so much about the size of the
> hs-err
> file and trade a bit more verbosity in return for giving more
> detailled
> information to our support people. But I can see your point, and
> it is a
> matter of preferences.
>
> In the new version, I kept the output of os::print_siginfo very
> compact
> and on one line. We print si_signo, si_code unconditionally,
> si_errno if
> set, and then decide depending on the code whether to print
> si_addr or
> si_pid/si_uid or si_band.
>
>
> Note that there was on all platforms code which
> examines the
> crash address
> for a possible CDS fault and prints a specialized hint.
> In my
> opinion that
> makes no sense in a general os::print_siginfo()
> function and I
> moved it to
> a separate error reporting step into vmError.cpp. This
> may be a
> point of
> contention because the way to extract the fault address
> from a
> siginfo/EXCEPTION_RECORD structure is platform
> dependend, but
> imho this is
> still the least ugly solution.
>
>
> Sorry I don't like seeing the _WIN32 condition in the
> vmError.cpp
> code. By keeping this as part of the print_siginfo the platform
> specific aspect is handled transparently. As something that
> is only
> used as part of error reporting, and which is directly
> related to a
> signal being received the current placement doesn't seem
> unreasonable. If you really want to factor this out then I
> would add
> os::check_for_failing_cds_access() or something like that.
>
>
> I really did not like this in os::print_siginfo(). One does not
> expect
> os::print_siginfo() to know about cds specialities, and to know
> that it
> only gets called in the context of error handling (who says this is
> always true?)
>
> But I agree the platform dependend code in vmError.cpp is ugly.
> I moved
> it to vmError_posix.cpp and vmError_windows.cpp respectively, and
> renamed it per your suggestion. I did not want to add this to
> the os::
> namespace because it is really a highly specialized function
> which only
> makes sense in error handling.
>
> Thanks & Regards, Thomas
>
> Thanks,
> David
> -----
>
> Kind regards, Thomas
>
>
>
More information about the hotspot-runtime-dev
mailing list