RFR: 8259392: Zero error reporting is broken after JDK-8255711
David Holmes
dholmes at openjdk.java.net
Fri Jan 8 07:16:56 UTC 2021
On Thu, 7 Jan 2021 18:22:46 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
> This manifests on the following `tier1` tests with Linux x86_64 Zero:
>
> runtime/ErrorHandling/ErrorFileRedirectTest.java
> runtime/ErrorHandling/SecondaryErrorTest.java
> runtime/memory/ReadFromNoaccessArea.java
> runtime/Unsafe/InternalErrorTest.java
> runtime/Safepoint/TestAbortVMOnSafepointTimeout.java
>
> 00:17:25 # Internal Error (/home/shade/trunks/jdk/src/hotspot/os_cpu/linux_zero/os_linux_zero.cpp:94), pid=739632, tid=739633
> 00:17:25 # Error: ShouldNotCall()
>
> address os::Posix::ucontext_get_pc(const ucontext_t* uc) {
>   ShouldNotCallThis(); // <---- crash here
>   return NULL; // silence compile warnings
> }
>
> I believe the generification in JDK-8255711 applies to Zero awkwardly.
>
> Zero is awkward in the sense it is too generic for its own good. It does not have any access to crash context decoders, and that is why `ucontext_*` parsers are `ShouldNotCallThis()`-ed. Before JDK-8255711, Zero error reporting code was specially crafted to avoid this, apparently.
>
> There are at least two problems:
> 1. `ucontext_get_pc` is unimplemented, so we can special-case it for Zero. Instead of returning a bogus value from the Zero implementation, I decided to just special-case it at its critical use in error reporting.
> 2. The generic `VMError::report_and_die` circles back to Zero's unimplemented `os::fetch_frame_from_context`. Before JDK-8255711, Zero did `fatal()`, which avoided this trouble. The patch ignores the context to match that behavior.
>
> While the regression starts at JDK 16, it affects the path when VM is already crashing, so should not affect product quality per se. Therefore, I would prefer to get it to JDK 17 for some testing, and then maybe consider it for 16.0.{1,2}.
>
> Also, this changeset kills the cat.
Hi Aleksey,
Approval in principle but it is a bit too verbose for my liking - suggestions below.
Thanks,
David
src/hotspot/os/posix/signals_posix.cpp line 627:
> 625: // reporting code asks e.g. about frames on stack, Zero would experience
> 626: // a secondary ShouldNotCallThis() crash.
> 627: VMError::report_and_die(t, sig, pc, info, NULL);
Can't we just use
`ZERO_ONLY(NULL) NOT_ZERO(ucVoid)`
? Or perhaps even:
`ZERO_ONLY(ucVoid = NULL;)`
?
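For illustration only, here is a minimal standalone sketch of how the two suggestions would expand. It assumes the usual HotSpot convention that `ZERO_ONLY(code)` emits its argument only in Zero builds and `NOT_ZERO(code)` only in non-Zero builds; the macros are redefined locally so the snippet compiles outside the JDK. `report_and_die_stub`, `report_option1`, `report_option2` and `last_context` are hypothetical names standing in for the real call site, not code from the patch.

```cpp
#include <cstddef>
#include <cstdio>

// Local stand-ins for the HotSpot macros (assumed behavior:
// each one emits its argument only in the matching build flavor).
#ifdef ZERO
#define ZERO_ONLY(code) code
#define NOT_ZERO(code)
#else
#define ZERO_ONLY(code)
#define NOT_ZERO(code) code
#endif

// Hypothetical stub for the error reporter: it only records which
// context pointer the caller handed over.
static const void* last_context = NULL;

static void report_and_die_stub(int sig, const void* context) {
  last_context = context;
  std::printf("sig=%d context=%s\n", sig, context == NULL ? "NULL" : "set");
}

// First suggestion: choose the argument inline at the call site.
static void report_option1(int sig, void* ucVoid) {
  report_and_die_stub(sig, ZERO_ONLY(NULL) NOT_ZERO(ucVoid));
}

// Second suggestion: clear the variable up front in Zero builds,
// then make the same call in every build flavor.
static void report_option2(int sig, void* ucVoid) {
  ZERO_ONLY(ucVoid = NULL;)
  report_and_die_stub(sig, ucVoid);
}
```

Built without `-DZERO`, both helpers pass the context through unchanged; built with `-DZERO`, both pass `NULL`, so the reporter never tries to decode a context Zero cannot parse.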
-------------
Changes requested by dholmes (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/1980