RFR: 8259392: Zero error reporting is broken after JDK-8255711

David Holmes dholmes at openjdk.java.net
Fri Jan 8 07:16:56 UTC 2021


On Thu, 7 Jan 2021 18:22:46 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This manifests on the following `tier1` tests with Linux x86_64 Zero:
> 
> runtime/ErrorHandling/ErrorFileRedirectTest.java
> runtime/ErrorHandling/SecondaryErrorTest.java
> runtime/memory/ReadFromNoaccessArea.java
> runtime/Unsafe/InternalErrorTest.java
> runtime/Safepoint/TestAbortVMOnSafepointTimeout.java
> 
> 00:17:25 #  Internal Error (/home/shade/trunks/jdk/src/hotspot/os_cpu/linux_zero/os_linux_zero.cpp:94), pid=739632, tid=739633
> 00:17:25 #  Error: ShouldNotCall()
> 
> address os::Posix::ucontext_get_pc(const ucontext_t* uc) {
>   ShouldNotCallThis(); <---- crash here
>   return NULL; // silence compile warnings
> }
> 
> I believe the generification in JDK-8255711 applies to Zero awkwardly. 
> 
> Zero is awkward in the sense it is too generic for its own good. It does not have any access to crash context decoders, and that is why `ucontext_*` parsers are `ShouldNotCallThis()`-ed. Before JDK-8255711, Zero error reporting code was specially crafted to avoid this, apparently.
> 
> There are at least two problems:
>  1. `ucontext_get_pc` is unimplemented, so we can special-case those for Zero. Instead of returning a bogus value from Zero implementation, I decided to just special-case at its critical use in error reporting.
>  2. generic `VMError::report_and_die` circles back at Zero's unimplemented `os::fetch_frame_from_context`. Before JDK-8255711, Zero did `fatal()` that avoided this trouble. The patch ignores the context to match that behavior.
> 
> While the regression starts at JDK 16, it affects the path when VM is already crashing, so should not affect product quality per se. Therefore, I would prefer to get it to JDK 17 for some testing, and then maybe consider it for 16.0.{1,2}.
> 
> Also, this changeset kills the cat.

Hi Aleksey,

Approval in principle but it is a bit too verbose for my liking - suggestions below.

Thanks,
David

src/hotspot/os/posix/signals_posix.cpp line 627:

> 625:     //     reporting code asks e.g. about frames on stack, Zero would experience
> 626:     //     a secondary ShouldNotCallThis() crash.
> 627:     VMError::report_and_die(t, sig, pc, info, NULL);

Can't we just use

`ZERO_ONLY(NULL) NOT_ZERO(ucVoid)`

? Or perhaps even:

`ZERO_ONLY(ucVoid = NULL;)`

?
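
To make the two alternatives concrete, here is a minimal standalone sketch, not the actual patch. It assumes `ZERO_ONLY`/`NOT_ZERO` expand to their argument or to nothing depending on whether `ZERO` is defined; the macros are redefined locally so the snippet compiles outside HotSpot. `report_context`, `context_inline` and `context_cleared` are hypothetical names, with `report_context` standing in for `VMError::report_and_die` and simply returning the context pointer it receives.

```cpp
#include <cstddef>

// Local stand-ins for the conditional macros (assumption: same shape as
// HotSpot's, redefined here only so this sketch is self-contained).
#ifdef ZERO
#define ZERO_ONLY(code) code
#define NOT_ZERO(code)
#else
#define ZERO_ONLY(code)
#define NOT_ZERO(code) code
#endif

// Hypothetical stand-in for VMError::report_and_die(..., context):
// returns the context it was handed so the effect can be observed.
static const void* report_context(const void* context) {
  return context;
}

// First form: pick the argument at the call site.
// Zero builds pass NULL, all other builds pass ucVoid.
static const void* context_inline(void* ucVoid) {
  (void)ucVoid;  // not referenced in Zero builds
  return report_context(ZERO_ONLY(NULL) NOT_ZERO(ucVoid));
}

// Second form: clear the pointer once, before the call.
// In non-Zero builds the statement expands to nothing.
static const void* context_cleared(void* ucVoid) {
  ZERO_ONLY(ucVoid = NULL;)
  return report_context(ucVoid);
}
```

Compiled with `-DZERO`, both helpers hand `NULL` to the reporting stand-in; without it, both forward `ucVoid` unchanged. The second form keeps the call site itself free of macros, at the cost of mutating the local variable.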

-------------

Changes requested by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/1980

