RFR: JDK-8293402: hs-err file printer should reattempt stack trace printing if it fails

Christian Hagedorn chagedorn at openjdk.org
Tue Sep 6 17:24:02 UTC 2022


On Tue, 6 Sep 2022 08:05:23 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> Hi,
> 
> May I have reviews for this small improvement?
> 
> The call stack may be the most important part of an hs-err file. We recently introduced printing of source information (https://bugs.openjdk.org/browse/JDK-8242181) which is nice but makes stack printing more vulnerable for two reasons:
> - we may crash due to a programmer error (e.g. https://bugs.openjdk.org/browse/JDK-8293344)
> - we may time out on very slow machines/file systems when the source information is parsed from the debug info (we have seen those problems in the past)
> 
> Therefore, VMError should retry stack printing without source information if the first attempt to print failed.
> 
> Examples:
> 
> Step timeouts while retrieving source info:
> 
> 
> ---------------  T H R E A D  ---------------
> 
> Current thread (0x00007f70ac028bd0):  JavaThread "main" [_thread_in_vm, id=565259, stack(0x00007f70b0587000,0x00007f70b0688000)]
> 
> Stack: [0x00007f70b0587000,0x00007f70b0688000],  sp=0x00007f70b0686cf0,  free space=1023k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x1cd41c1]  VMError::controlled_crash(int)+0x241
> [timeout occurred during error reporting in step "printing native stack (with source info)"] after 30 s.
> 
> Retrying call stack printing without source information...
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x1cd41c1]  VMError::controlled_crash(int)+0x241
> V  [libjvm.so+0x11cbe45]  JNI_CreateJavaVM+0x5b5
> C  [libjli.so+0x4013]  JavaMain+0x93
> C  [libjli.so+0x800d]  ThreadJavaMain+0xd
> 
> 
> 
> 
> Step crashes while retrieving source info:
> 
> 
> ---------------  T H R E A D  ---------------
> 
> Current thread (0x00007fc000028bd0):  JavaThread "main" [_thread_in_vm, id=569254, stack(0x00007fc00573c000,0x00007fc00583d000)]
> 
> Stack: [0x00007fc00573c000,0x00007fc00583d000],  sp=0x00007fc00583bcf0,  free space=1023k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x1cd41e1]  VMError::controlled_crash(int)+0x241
> [error occurred during error reporting (printing native stack (with source info)), id 0xb, SIGSEGV (0xb) at pc=0x00007fc006694d78]
> 
> 
> Retrying call stack printing without source information...
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x1cd41e1]  VMError::controlled_crash(int)+0x241
> V  [libjvm.so+0x11cbe65]  JNI_CreateJavaVM+0x5b5
> C  [libjli.so+0x4013]  JavaMain+0x93
> C  [libjli.so+0x800d]  ThreadJavaMain+0xd
> 
> 
> 
> Thanks, Thomas

That looks good to me! Thanks for following up with this RFE after proposing it in the PR for JDK-8242181. I think this safety net is very beneficial for the reasons you've stated, and also with regard to extending the parser to support DWARF 5 (or older versions) at some point in the future. I fully agree that a missing stack trace due to a crash/timeout is one of the worst things that could happen when reporting an error.
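
For readers skimming the thread, the fallback described in the quoted mail can be sketched roughly as below. This is only an illustration and not the actual VMError code: the real error reporter recovers from crashes and step timeouts via signal handling and its step mechanism, whereas this sketch models a failed first attempt as a C++ exception. The names `StackPrinter`, `print_stack_with_retry` and `flaky_printer` are hypothetical and do not exist in HotSpot.

```cpp
#include <functional>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a stack printer: writes frames to `out`,
// optionally with source information. It may throw to simulate a failure
// (the real reporter deals with signals and timeouts, not exceptions).
using StackPrinter = std::function<void(std::ostream& out, bool with_source_info)>;

// First attempt prints with source info. If that attempt fails, report the
// failure and retry once without source info.
// Returns true if one of the two attempts ran to completion.
bool print_stack_with_retry(std::ostream& out, const StackPrinter& printer) {
  try {
    printer(out, /*with_source_info=*/true);
    return true;
  } catch (const std::exception& e) {
    out << "[error occurred during stack printing (with source info): "
        << e.what() << "]\n\n"
        << "Retrying call stack printing without source information...\n";
  }
  try {
    printer(out, /*with_source_info=*/false);
    return true;
  } catch (const std::exception& e) {
    out << "[error occurred during stack printing (without source info): "
        << e.what() << "]\n";
    return false;
  }
}

// Example printer that fails after the first frame whenever source info is
// requested, loosely mimicking the excerpts in the quoted mail.
void flaky_printer(std::ostream& out, bool with_source_info) {
  out << "V  [libjvm.so+0x1cd41c1]  VMError::controlled_crash(int)+0x241\n";
  if (with_source_info) {
    throw std::runtime_error("simulated source lookup failure");
  }
  out << "V  [libjvm.so+0x11cbe45]  JNI_CreateJavaVM+0x5b5\n";
}
```

Calling `print_stack_with_retry(stream, flaky_printer)` on a `std::ostringstream` yields a truncated first attempt, the retry notice, and then the complete stack without source information, which is the shape of the output in the examples above.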

-------------

Marked as reviewed by chagedorn (Reviewer).

PR: https://git.openjdk.org/jdk/pull/10179

