RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting

Thomas Stuefe stuefe at openjdk.org
Thu Feb 23 08:11:06 UTC 2023


On Mon, 20 Feb 2023 08:28:05 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this change to the implementation of the Windows-specific option
> UseOSErrorReporting, toward allowing crash reporting functions to be declared
> noreturn.  VMError::report_and_die no longer conditionally returns if the
> Windows-only option UseOSErrorReporting is true.
> 
> The core of VM crash reporting has been renamed and privatized, and now takes
> an additional argument indicating whether the caller just wants reporting or
> also wants termination after reporting.  The Windows top-level and unhandled
> exception filters call into that to request reporting, and also termination if
> UseOSErrorReporting is false, otherwise returning from the call and taking
> further action.  All other callers of VM crash reporting include termination.
> 
> So if we get a structured exception on Windows and the option is true then we
> report it and then end up walking up the exception handlers until running out,
> when Windows crash behavior gets a shot.  This is essentially the same as
> before the change, just checking the option in a different place (in the
> Windows-specific caller, rather than in an #ifdef block in the shared code).
> 
> In addition, the core VM crash reporter uses BREAKPOINT before termination if 
> UseOSErrorReporting is true.  So if we have an assertion failure we report it
> and then end up walking up the exception handlers (because of the breakpoint
> exception) until running out, when Windows crash behavior gets a shot.
> 
> So if there is an attached debugger (or JIT debugging is enabled), this leads
> an assertion failure to hit the debugger inside the crash reporter rather than
> after it returned (due to the old behavior for UseOSErrorReporting) from the
> failed assertion (hitting the BREAKPOINT there).
> 
> Note that on Windows, ::breakpoint() calls DebugBreak(), raising a breakpoint
> exception, which ends up being unhandled if UseOSErrorReporting is true.
> 
> There is a pre-existing bug that I'll be reporting separately.  If
> UseOSErrorReporting and CreateCoredumpOnCrash are both true, we create an
> empty .mdmp file.  We shouldn't create that file when UseOSErrorReporting.
> 
> Testing:
> mach5 tier1-3
> 
> Manual testing with the following, to verify expected behavior that is the
> same as before the change.
> 
> # -XX:ErrorHandlerTest=N
> # 1: assertion failure
> # 2: guarantee failure
> # 14: SIGSEGV
> # 15: divide by zero
> path/to/bin/java \
>   -XX:+UnlockDiagnosticVMOptions \
>   -XX:+ErrorLogSecondaryErrorDetails \
>   -XX:+UseOSErrorReporting \
>   -XX:ErrorHandlerTest=1 \
>   TestDebug.java
> 
> --- TestDebug.java ---
> import java.lang.String;
> public class TestDebug {
>     static private volatile String dummy;
>     public static void main(String[] args) throws Exception {
>         while (true) {
>             dummy = new String("foo bar");
>         }
>     }
> }
> --- end TestDebug.java ---
> 
> I need someone with a better understanding of Windows to test the interaction
> with Windows Error Reporting (WER).

Looking through this further.

Side note: I've disliked this thicket of report() functions and their wall of arguments for a long time. I think a very simple way to make this easier to read would be to capture argument groups in structures.

For example, anything crash related could go into


struct crash_info_t { signal, siginfo, context, crash_address }

maybe even as a union with Posix/Windows flavour and native types (DWORD vs int for signal, EXCEPTION_INFO vs siginfo, CONTEXT vs ucontext).

Infos for oom-errors could go into a similar structure like


struct oom_info_t { size, failing_api, filename, lineno }

etc.

src/hotspot/share/utilities/vmError.cpp line 1374:

> 1372:                                    Thread* thread, unsigned int sig, address pc,
> 1373:                                    void* siginfo, void* context,
> 1374:                                    const char* detail_fmt, ...) {

Do we really need this variant? AFAICS this is only used in one place, in vmError_windows.cpp, from the secondary crash handler. And there, we don't really need the detail format; all you feed it is an empty string.

-------------

PR: https://git.openjdk.org/jdk/pull/12651


More information about the hotspot-dev mailing list