RFR: 8252148: vmError::controlled_crash should be #ifdef ASSERT

Thomas Stuefe stuefe at openjdk.java.net
Fri Dec 11 08:32:00 UTC 2020


On Wed, 9 Dec 2020 21:47:31 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

> Moved some unit tests from vmError into a gtest and moved controlled_crash code under #ifdef ASSERT rather than #ifndef PRODUCT since the tests that use this are @requires = vm.debug.  This change keeps the ErrorHandlerTest=n option and other handler test options, because there are some tests that use this more but moves them to develop options (they were notproduct).  There are some ErrorHandlerTest=n tests that do more thorough hs_err_pid.log file verification that remain.
> 
> Tested with tier1-3 with product and fastdebug builds.  Built with optimized.
> 
> This is for JDK 17.

Hi Coleen,

this is a good cleanup, I like this a lot.

About moving ErrorHandlerTest.java to gtest: this is a slick way of doing these tests. However, some things to consider:
- Strictly speaking we now do not test the real error handler output but the artificial output `ExecutingUnitTests` gives us. The real printing code is now untested. This may have bitten you right here, since I believe in `print_error_for_unit_test` you miss a `va_copy` which should have messed up detail_args and garbled the real output later on.
- moving these tests out of jtreg means we give up flexibility (excluding them or moving them to another tier). Do they impact the gtest run time noticeably?

There are also remaining problems with gtest death tests, which I hope we can fix at some point. I don't hold my breath though since we decided to not own the gtest code and cannot make modifications:
- Death tests are easily confused by stderr output. See: https://bugs.openjdk.java.net/browse/JDK-8257229. So whenever we have unconditional stderr output in the VM (eg if  some large page alloc failed) these death tests fail too.
- Death tests don't tell you the assert message which happened if it did not match their expectations. Thats plain annoying. See https://bugs.openjdk.java.net/browse/JDK-8257227.

src/hotspot/share/utilities/vmError.cpp line 1800:

> 1798:     case 2: guarantee(how == 0, "test guarantee"); break;
> 1799: 
> 1800:     // The other cases are unused.

I like that you cut down on the cases. Most of them were obsolete or not working correctly. 

How about compacting the cases? And maybe use a real enum. But this could also be done in a separate RFE.

src/hotspot/share/utilities/vmError.hpp line 187:

> 185: 
> 186:   // Test vmassert(), fatal(), guarantee(), etc.
> 187:   DEBUG_ONLY(static void test_error_handler();)

May I suggest:
- remove the almost empty `test_error_handler` completely
- remove the "case 0" in `controlled_crash` 
- leave `controlled_crash` public
- In create_vm, doing this instead:
if (ErrorHandlerTest != 0) {
  VMError::controlled_crash(ErrorHandlerTest);
}

This would have two advantages, it would be clearer and leave us with a "controlled_crash" API to use in quick tests (I do that sometimes). 

Both are not hard reasons and a matter of taste, so feel free to ignore my proposal.

test/hotspot/gtest/unittest.hpp line 138:

> 136:     ASSERT_EXIT(child_ ## category ## _ ## name ## _(),             \
> 137:                 ::testing::ExitedWithCode(1),                       \
> 138:                 "^assert failed: .*" msg);                          \

Good!

src/hotspot/share/utilities/debug.cpp line 246:

> 244: #ifdef ASSERT
> 245:   if (ExecutingUnitTests) {
> 246:     if (detail_fmt != NULL) {

You miss va_copy for detail_args here.

src/hotspot/share/utilities/debug.cpp line 243:

> 241: static char _detail_msg[1024];
> 242: 
> 243: static void print_error_for_unit_test(const char* message, const char* detail_fmt, va_list detail_args) {

Trying to understand why you changed this code (beside factoring it out). We have:

- report_vm_out_of_memory -> message=NULL, detail set
- report_and_die from real crash -> message=NULL, detail=""
- report_fatal -> message="fatal error", detail set
- report_vm_error (assert, guarantee) -> message="assert (bla) failed", detail is set

so sometimes message is NULL, sometimes details are empty. What a mess :( Since you do gtests for all these cases now, you need to print the assert message correctly for them. You use _detail_msg as scratch buffer, since its already there. Later we print the detail message again into it. But I think that is fine.

I have a patch on the back burner somewhere simplifying these functions. Maybe we can improve matters. Then this function becomes easier too.

-------------

Changes requested by stuefe (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/1723


More information about the hotspot-dev mailing list