RFR: JDK-8293402: hs-err file printer should reattempt stack trace printing if it fails
Christian Hagedorn
chagedorn at openjdk.org
Tue Sep 6 17:24:02 UTC 2022
On Tue, 6 Sep 2022 08:05:23 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
> Hi,
>
> may I have reviews for this small improvement?
>
> The call stack may be the most important part of an hs-err file. We recently introduced printing of source information (https://bugs.openjdk.org/browse/JDK-8242181) which is nice but makes stack printing more vulnerable for two reasons:
> - we may crash due to a programmer error (e.g. https://bugs.openjdk.org/browse/JDK-8293344)
> - we may time out on very slow machines/file systems when the source information is parsed from the debug info (we have seen those problems in the past)
>
> Therefore, VMError should retry stack printing without source information if the first attempt to print fails.
>
> Examples:
>
> Step times out while retrieving source info:
>
>
> --------------- T H R E A D ---------------
>
> Current thread (0x00007f70ac028bd0): JavaThread "main" [_thread_in_vm, id=565259, stack(0x00007f70b0587000,0x00007f70b0688000)]
>
> Stack: [0x00007f70b0587000,0x00007f70b0688000], sp=0x00007f70b0686cf0, free space=1023k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V [libjvm.so+0x1cd41c1] VMError::controlled_crash(int)+0x241
> [timeout occurred during error reporting in step "printing native stack (with source info)"] after 30 s.
>
> Retrying call stack printing without source information...
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V [libjvm.so+0x1cd41c1] VMError::controlled_crash(int)+0x241
> V [libjvm.so+0x11cbe45] JNI_CreateJavaVM+0x5b5
> C [libjli.so+0x4013] JavaMain+0x93
> C [libjli.so+0x800d] ThreadJavaMain+0xd
>
>
>
>
> Step crashes while retrieving source info:
>
>
> --------------- T H R E A D ---------------
>
> Current thread (0x00007fc000028bd0): JavaThread "main" [_thread_in_vm, id=569254, stack(0x00007fc00573c000,0x00007fc00583d000)]
>
> Stack: [0x00007fc00573c000,0x00007fc00583d000], sp=0x00007fc00583bcf0, free space=1023k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V [libjvm.so+0x1cd41e1] VMError::controlled_crash(int)+0x241
> [error occurred during error reporting (printing native stack (with source info)), id 0xb, SIGSEGV (0xb) at pc=0x00007fc006694d78]
>
>
> Retrying call stack printing without source information...
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V [libjvm.so+0x1cd41e1] VMError::controlled_crash(int)+0x241
> V [libjvm.so+0x11cbe65] JNI_CreateJavaVM+0x5b5
> C [libjli.so+0x4013] JavaMain+0x93
> C [libjli.so+0x800d] ThreadJavaMain+0xd
>
>
>
> Thanks, Thomas
That looks good to me! Thanks for following up with this RFE after proposing it in the PR of JDK-8242181. I think it is very beneficial to have this safety net for the reasons you've stated - also in regard to extending the parser to support DWARF 5 (or older versions) at some point in the future. I fully agree that a missing stack trace due to a crash/timeout is one of the worst things that could happen when reporting an error.
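
For readers skimming the thread, the retry idea can be sketched roughly as below. This is only an illustration and not the code from the PR: the names StackPrinter, print_stack_with_retry and demo_printer are hypothetical, and a C++ exception stands in for the crash or step timeout that VMError actually has to deal with during error reporting.

```cpp
#include <functional>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a stack printer; this is not HotSpot's real API.
// A thrown exception simulates a crash or timeout while resolving source info.
using StackPrinter = std::function<void(std::ostream&, bool with_source_info)>;

// Try the detailed printout first. If it fails, note the error and retry
// without source information so the report still contains a call stack.
// Returns true if either attempt ran to completion.
bool print_stack_with_retry(std::ostream& out, const StackPrinter& printer) {
  try {
    printer(out, /*with_source_info=*/true);
    return true;
  } catch (const std::exception& e) {
    out << "[error occurred during error reporting: " << e.what() << "]\n";
  }
  out << "Retrying call stack printing without source information...\n";
  try {
    printer(out, /*with_source_info=*/false);
    return true;
  } catch (const std::exception& e) {
    out << "[error occurred during error reporting: " << e.what() << "]\n";
    return false;
  }
}

// Example printer: fails after the first frame when source info is requested,
// and prints both frames otherwise.
void demo_printer(std::ostream& out, bool with_source_info) {
  out << "V frame_a\n";
  if (with_source_info) {
    throw std::runtime_error("source lookup failed");
  }
  out << "V frame_b\n";
}
```

Calling print_stack_with_retry(std::cout, demo_printer) prints the partial first attempt, the error note, the retry message, and then the full stack from the second attempt, which mirrors the shape of the hs-err excerpts quoted above.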
-------------
Marked as reviewed by chagedorn (Reviewer).
PR: https://git.openjdk.org/jdk/pull/10179