RFR: 8296469: Instrument VMError::report with reentrant iteration step for register and stack printing [v9]

Axel Boldt-Christmas aboldtch at openjdk.org
Mon May 15 07:15:54 UTC 2023


On Fri, 5 May 2023 14:26:37 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits:
>> 
>>  - Merge remote-tracking branch 'upstream_jdk/master' into vmerror_report_register_stack_reentrant
>>  - Add test
>>  - Fix and strengthen print_stack_location
>>  - Missed variable rename
>>  - Copyright
>>  - Rework logic and use continuation state for reattempts
>>  - Merge remote-tracking branch 'upstream_jdk/master' into vmerror_report_register_stack_reentrant
>>  - Restructure os::print_register_info interface
>>  - Code syle and line length
>>  - Merge Fix
>>  - ... and 5 more: https://git.openjdk.org/jdk/compare/2009dc2b...2e12b4a5
>
> src/hotspot/share/utilities/vmError.cpp line 643:
> 
>> 641: # define REATTEMPT_STEP_WITH_NEW_TIMEOUT_IF(s, cond)       \
>> 642:   REATTEMPT_STEP_IF_IMPL(s, cond, true)
>> 643: 
> 
> I'm doubtful about the reset-timeout feature. If something times out, the chance is very high that it will time out again, either because we have a deadlock or because what we are doing is simply very slow. One example of very slow is printing callstacks: decoding debug info can be very slow if it is loaded, e.g., from a network share, and it will not get any faster by repeating the attempt.
> 
> With crashes related to printing registers and stack slots, I can see the sense and usefulness of reattempts. But timeouts are both more "sticky" (high chance of happening again) and worse than crashes. Customers want the crashing VM to be down quickly, to release all locks and files, so that the replacement VM can start up.
> 
> So maybe we should scrap the new timeout feature. Would also simplify coding a bit.

I removed it.

I think I originally added it in the rework so as not to change the behaviour of the stack trace printing. But if, as you say, a timeout while printing with source information makes it likely that printing without source information will also time out, then maybe the two attempts should share a timeout.
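
To illustrate the idea of two attempts sharing one timeout, here is a minimal, self-contained sketch. It is not the vmError.cpp code: `StepBudget`, `Attempt`, `Outcome` and `run_step` are hypothetical names, and elapsed time is modelled as an explicit cost so the behaviour is deterministic. The assumption is that a crash-like failure leaves budget behind and is worth a reattempt, while a timeout exhausts the budget and is not.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical time budget shared by every attempt of one reporting step.
struct StepBudget {
  int64_t remaining_ms;

  // Charges cost_ms against the budget. Returns false (and empties the
  // budget) if the work would not fit, which models a timeout.
  bool consume(int64_t cost_ms) {
    if (cost_ms > remaining_ms) {
      remaining_ms = 0;
      return false;
    }
    remaining_ms -= cost_ms;
    return true;
  }
};

enum class Outcome { PrimaryOk, FallbackOk, Failed, TimedOut };

// An attempt draws from the shared budget and returns true on success.
using Attempt = std::function<bool(StepBudget&)>;

// Runs primary; on a non-timeout failure, reattempts with fallback using
// whatever budget is left. The timeout is never reset between attempts.
Outcome run_step(StepBudget budget, const Attempt& primary, const Attempt& fallback) {
  if (primary(budget)) {
    return Outcome::PrimaryOk;
  }
  if (budget.remaining_ms == 0) {
    return Outcome::TimedOut;  // timed out: do not reattempt
  }
  if (fallback(budget)) {
    return Outcome::FallbackOk;
  }
  return budget.remaining_ms == 0 ? Outcome::TimedOut : Outcome::Failed;
}
```

In this sketch a slow primary attempt (for example, printing with source information) eats into the time available to the fallback, so a step as a whole can never take longer than its single budget.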

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/11017#discussion_r1193414286

