RFR: JDK-8185734: [Windows] Structured Exception Catcher missing around gtest execution
Magnus Ihse Bursie
ihse at openjdk.java.net
Tue Dec 15 08:23:55 UTC 2020
On Sun, 13 Dec 2020 11:07:42 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
> Hi,
>
> May I have reviews please for the following patch.
>
> At the moment, if a crash happens on Windows in gtests, the gtest SEH handler may be invoked instead of our error handler, and we just see a one-line-warning "SEH happened blabala". No hs-err file.
>
> Even worse, if a crash happens inside the VM as part of the gtests, if our SEH handler does not get involved, this may interfere with VM functionality - e.g. SafeFetch.
>
> Whether or not our SEH handler gets involved currently depends on arbitrary factors:
> - whether the fault happens in a VM or Java thread - which have a __try/__except around their start function - or whether the fault happens directly in the thread running the test
> - Faults in generated code are not handled on x86 but are okay x64 (where a SEH handler is registered for the code cache region) or on aarch64 (which uses VEH).
>
> This patch consists of two parts
>
> A) It surrounds the gtestlauncher main function with a SEH catcher. For that to work I also need to export the SEH handler from the hotspot. Note: It is difficult to place the SEH catcher: SEH is mutually exclusive with C++ exceptions, and since googletest uses C++ exceptions to communicate test conditions, the only place to put those __try/__except is really up here, at the entry of the gtestlauncher main() function.
>
> B) This is unfortunately not sufficient since googletest uses its own SEH catcher to wrap each test (see gtest.cc). Since that catcher sits below our catcher on the stack, it superimposes ours.
>
> In JBS, @kimbarrett suggested to build gtests with GTEST_HAS_SEH switched off to prevent gtest from using SEH. Unfortunately that won't work since the use of death tests means we need SEH. If we switch GTEST_HAS_SEH off, the death tests don't build. I also do not like this suggestion since this configuration may have a higher chance of bitrotting upstream.
>
> The solution I found is to switch off exception catching from user code as described in [3] using `--gtest_catch_exceptions=0` or the environment variable `GTEST_CATCH_EXCEPTIONS=0`. Since we do not use C++ exceptions in the hotspot, this is fine.
>
> The only drawback is that this cannot be done from within the gtestlauncher itself. Setting the environment variable in the main function - or even during dynamic initialization - does not work because the gtestlauncher itself parses all arguments as part of dynamic initialization. So I did the next best thing and specified `--gtest_catch_exceptions=0` at the places where we run the gtests. This is not perfect, but better than nothing.
>
> Testing: manually on Windows x64, x86, GH actions (Linux errors seem unrelated to this patch).
>
> ----
>
> Notes:
> - If we owned the googletest code - forked it off like we did before - (B) would be very simple to solve by changing the default for `gtest_catch_exceptions` to 1. I still believe maintaining a fork of googletest would have many benefits.
>
> - Using VEH would have saved us from using __try/__except here, so we would have not needed (A). ATM we use VEH on aarch, SEH on x64+x32. Uniformly switching to VEH has been discussed several times in the past, the last attempt has been by @luhenry (see [1], [2]), but this has not yet materialized.
>
> [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-June/040228.html
> [2] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-August/040967.html
> [3] https://github.com/google/googletest/blob/master/googletest/docs/advanced.md#disabling-catching-test-thrown-exceptions
Build changes look good.
I left a note on the hotspot code; feel free to ignore it if you want. :)
I agree with your assessment that restoring gtest to the code base would make life simpler. I'm not sure what needs to be done to do that; I suppose it needs someone to write and implement a JBS feature, and to publicly acknowledge that they intent to spend the time needed to keep it up to date with upstream going forward.
(I also agree with your assessment on VEH replacing SEH, as my recollection of working with the runtime system on JRockit was that VEH was superior for the needs of a VM, and much closer to the posix signal model.)
-------------
Marked as reviewed by ihse (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/1757
More information about the build-dev
mailing list