RFR: JDK-8296907: VMError: add optional callstacks, siginfo for secondary errors [v3]
Axel Boldt-Christmas
aboldtch at openjdk.org
Wed Nov 16 17:44:07 UTC 2022
On Mon, 14 Nov 2022 07:29:25 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> This was motivated by discussions we had in https://github.com/openjdk/jdk/pull/11017.
>>
>> To aid in analyzing secondary errors during error reporting, it would be useful to see their callstacks for secondary errors. But printing callstacks during error reporting is unsafe - if we get a second crash or assert, it will cause infinite recursion and interrupt error reporting. Also, the hs-err file would be quite verbose. Therefore this feature is optional and limited to debug builds.
>>
>> ---
>>
>> Patch
>>
>> - adds optional callstack/siginfo printing via debug-only switch `-XX:+ErrorLogSecondaryErrorDetails`.
>> - fixes a bug in secondary error handling where we would use the global scratch buffer recursively (via stringStream); that could lead to confusing output since it is used by the error log stream already. We can print directly to that one instead.
>> - Removed a stray newline from print_native_stack to clean output.
>> - added regression testing for this feature. I removed my name from the test since we don't do this anymore.
>> - added clarifying comments to the test and code
>> - added SAP copyright to the regression test (we introduced it years ago for JDK-8065895)
>>
>> Output looks like this:
>>
>>
>> $ java ... -XX:+ErrorLogSecondaryErrorDetails
>>
>>
>> will produce, for secondary errors, siginfo and call stack.
>>
>>
>> [error occurred during error reporting (test secondary crash 1), id 0xb, SIGSEGV (0xb) at pc=0x00007fddfe8a0a61]
>> [siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr: 0x0000000000000000]
>> [stack: Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>> V [libjvm.so+0x1ceea61] VMError::controlled_crash(int)+0x241 (vmError.cpp:1946)
>> V [libjvm.so+0x1cf413f] VMError::report(outputStream*, bool)+0x46bf (vmError.cpp:564)
>> V [libjvm.so+0x1cf516b] VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x19b (vmError.cpp:1709)
>> V [libjvm.so+0x1cf5e8f] VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*, char const*, ...)+0x8f (vmError.cpp:1467)
>> V [libjvm.so+0x1cf5ec2] VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*)+0x22 (vmError.cpp:1473)
>> V [libjvm.so+0x1a549e7] JVM_handle_linux_signal+0x1f7 (signals_posix.cpp:656)
>> C [libc.so.6+0x43090]
>> V [libjvm.so+0x11d6965] JNI_CreateJavaVM+0x5b5 (jni.cpp:3662)
>> C [libjli.so+0x4013] JavaMain+0x93 (java.c:1457)
>> C [libjli.so+0x800d] ThreadJavaMain+0xd (java_md.c:650)
>> ]
>
> Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision:
>
> - Merge branch 'JDK-8296907-VMError-add-optional-callstacks-siginfo-for-secondary-errors' of github.com:tstuefe/jdk into JDK-8296907-VMError-add-optional-callstacks-siginfo-for-secondary-errors
> - Feedback David
Always nice with more, simple and unobtrusive development tooling.
Just have a couple of comments/questions?
src/hotspot/share/utilities/vmError.cpp line 1635:
> 1633: // Any information (signal, context, siginfo etc) printed here should use the function
> 1634: // arguments, not the information stored in *this, since those describe the primary crash.
> 1635: char tmp[256]; // cannot use global scratch buffer
Is there any problems with making this static? Given that we care about the stack depth when we have repeated crashes.
src/hotspot/share/utilities/vmError.cpp line 1641:
> 1639: _current_step_info, id);
> 1640: if (os::exception_name(id, tmp, sizeof(tmp))) {
> 1641: st->print(", %s (0x%x) at pc=" PTR_FORMAT, tmp, id, p2i(pc));
Not really relevant for this PR. But do not like the inconsistency that the id is printed as hex here, decimal in os::print_siginfo for posix, and hex in os::print_siginfo for windows.
src/hotspot/share/utilities/vmError.cpp line 1653:
> 1651: st->print_cr("]");
> 1652: #ifdef ASSERT
> 1653: if (ErrorLogSecondaryErrorDetails) {
> But printing callstacks during error reporting is unsafe - if we get a second crash or assert, it will cause infinite recursion and interrupt error reporting.
Should there be logic here that at least guarantee progress?
```c++
if (ErrorLogSecondaryErrorDetails && !_some_bool) {
_some_bool = true;
[...]
}
_some_bool = false;
-------------
PR: https://git.openjdk.org/jdk/pull/11118
More information about the hotspot-dev
mailing list