RFR: 8296469: Instrument VMError::report with reentrant iteration step for register and stack printing [v11]

Thomas Stuefe stuefe at openjdk.org
Mon May 15 11:25:53 UTC 2023


On Mon, 15 May 2023 07:22:03 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:

>> Add reentrant step logic to VMError::report with an inner loop which enable the logic to recover at every step of the iteration.
>> 
>> Before this change, if printing one register/stack position crashes then no more registers/stack positions will be printed.
>> 
>> After this change even if the VM is unstable and some registers print_location crashes the hs_err printing will recover and keep attempting to print the rest of the registers or stack values.
>> 
>> Enables the following
>> ```C++
>> REENTRANT_STEP_IF("printing register info", _verbose && _context && _thread && Universe::is_fully_initialized())
>>   os::print_register_info_header(st, _context);
>> 
>>   REENTRANT_LOOP_START(os::print_nth_register_info_max_index())
>>     // decode register contents if possible
>>     ResourceMark rm(_thread);
>>     os::print_nth_register_info(st, REENTRANT_ITERATION_STEP, _context);
>>   REENTRANT_LOOP_END
>> 
>>   st->cr();
>> 
>> 
>> Testing: tier 1 and compiled Linux-x64/aarch64, MacOS-x64/aarch64, Windows x64 and cross-compiled Linux-x86/riscv/arm/ppc/s390x (GHA and some local)
>
> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Rename and invert should_stop_reattempt_step

Generally good, small nits remain.

src/hotspot/os_cpu/aix_ppc/os_aix_ppc.cpp line 480:

> 478:   int n = continuation;
> 479:   if (context == nullptr || n < 0 || n >= register_count) {
> 480:     return;

Here and the other variants: Under which circumstances would n<0 be acceptable? I would assert. Arguably also for context=nullptr, but up to you.

src/hotspot/os_cpu/linux_riscv/os_linux_riscv.cpp line 385:

> 383:     // Update continuation with next index before printing location
> 384:     continuation = n + 1;
> 385:     st->print("%-*.*s=", 8, 8, reg_abi_names[n]);

Nitpicking, preexisting: while you are here, you could probably simplify to "%-8.8s"

src/hotspot/os_cpu/linux_s390/os_linux_s390.cpp line 478:

> 476:     } else {
> 477:       st->print("r%-2d=", n-1);
> 478:       print_location(st, uc->uc_mcontext.gregs[n-1]);

The "-1" here and the "-3" for ppc makes me a bit nervous, since if this bitrots, we probably never notice that we print the wrong registers unless someone debugs. Pragmatic proposal: Print the general regs first, then the special ones, then you don't need to deal with hard-coded offsets. I don't think anyone would object to that.

src/hotspot/os_cpu/windows_aarch64/os_windows_aarch64.cpp line 239:

> 237:     continuation = n + 1;
> 238: # define CASE_PRINT_REG(n, str, id) case n: st->print(str); print_location(st, uc->id);
> 239:     switch (n) {

here and other places: if you wanted, you could get rid of the first argument to CASE_PRINT_REG by using a second int variable running alongside n.

Up to you though, this is also fine as it is.

src/hotspot/share/utilities/vmError.cpp line 510:

> 508:       st->print("stack at sp + %d slots: ", i);
> 509:       os::print_location(st, *(slot));
> 510:     }

Can you please print a note for unreadable slots too?

src/hotspot/share/utilities/vmError.cpp line 637:

> 635:     _current_step = __LINE__;                              \
> 636:     _current_step_info = s;                                \
> 637:     record_step_start_time();                              \

Preexisting: mover record_step_start_time and _step_did_timeout into condition?

-------------

PR Review: https://git.openjdk.org/jdk/pull/11017#pullrequestreview-1426248331
PR Review Comment: https://git.openjdk.org/jdk/pull/11017#discussion_r1193641996
PR Review Comment: https://git.openjdk.org/jdk/pull/11017#discussion_r1193651650
PR Review Comment: https://git.openjdk.org/jdk/pull/11017#discussion_r1193680360
PR Review Comment: https://git.openjdk.org/jdk/pull/11017#discussion_r1193682704
PR Review Comment: https://git.openjdk.org/jdk/pull/11017#discussion_r1193690992
PR Review Comment: https://git.openjdk.org/jdk/pull/11017#discussion_r1193695717


More information about the hotspot-dev mailing list