RFR: 8302798: Refactor -XX:+UseOSErrorReporting for noreturn crash reporting
Thomas Stuefe
stuefe at openjdk.org
Thu Feb 23 08:11:06 UTC 2023
On Thu, 23 Feb 2023 03:57:18 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:
> > Okay, I see. But does it have to come with hs-err file too? Because that is the only reason for this complication. And hs-err files probably spoil the context reported to Watson, see below.
>
> The question of whether to create an hs_err log when +UseOSErrorReporting is out of scope for this change. The decision to do so is very old, going back to these:
>
> https://bugs.openjdk.org/browse/JDK-4997835 RFE: crash dump will only be created when running w/ -XX:+ShowMessageBoxOnError
>
> https://bugs.openjdk.org/browse/JDK-6227246 Improve Windows unhandled structured exception reporting
>
> https://bugs.openjdk.org/browse/JDK-6794885 Improve Windows unhandled structured exception reporting - UseOSErrorReporting
>
> Feel free to raise that as a separate issue.
>
Thanks for digging out this history, it has been interesting reading.
> > I am a bit unhappy with this patch because it "sanctifies" returning from hs-err reporting. We had long discussions in the recent past where we needed to convince developers that returning from hs-err reporting is a very bad idea. I'm afraid with your patch, it won't be long before follow-up patches build upon your ARA and re-introduce this question.
> > I'd prefer if UseOSErrorReporting, and the code surrounding the windows exceptionhandler and CONTINUE_SEARCH, remained Windows only, and restricted to this one use case: jumping back to OS error reporting in case of an error.
>
> I don't think it santifies anything of the sort. I agree that returning from hs-err reporting is problematic. The current +UseOSErrorReporting was only one of several mechanisms doing exactly that. I've already removed SuppressErrorAt. I'm working on another change that modifies how the existing Debugging mechanism works.
>
> I've included comments about how the ARA stuff and xxx_maybe_die_xxx exist only to support +UseOSErrorReporting. I could go further and add something like the following at the beginning of VMError::report_and_maybe_die_impl:
>
> ```
> assert(after_report_action == AfterReportAction::Die
> WINDOWS_ONLY(|| UseOSErrorReporting),
> "must terminate");
> ```
>
> The reason for the new argument is so that the caller, which is where the knowlege of whether return or termination is desired, can request what it wants. I could have further uglified the code by making ARA::Return Windows-only, with more #ifdefs. I'd kind of prefer not to do that.
>
Okay, I think above assert would be a reasonable compromise; thanks for addressing this concern.
> > Your patch opens up other questions too, e.g. how is this supposed to work in the face of secondary errors? IIUC the last invocation of the exception handler would be the winning one and return its error state back to the top-level exception filter, so the error reported by Watson would be the secondary crash, not the real crash. I think this would happen today too, or?
>
> I think that's a correct summary, with or without this change.
>
> > This may be an argument for skipping hs-err reporting with UseOSErrorReporting - skipping everything, actually: the context that gets reported to Watson would be guaranteed to be the one from the real crash point. And the crash would be "fresh", as close as possible to the real crash site. I'd think that this is how UseOSErrorReporting is supposed to work, leaving crash reporting up to the OS.
>
> That would be an appropriate argument to make as part of a proposal to remove the hs-err logging when +UseOSErrorReporting.
Okay, we can address it separately.
-------------
PR: https://git.openjdk.org/jdk/pull/12651
More information about the hotspot-dev
mailing list