RFR: 8336587: failure_handler lldb command times out on macosx-aarch64 core file

Doug Simon dnsimon at openjdk.org
Thu Jul 18 07:54:37 UTC 2024


On Tue, 16 Jul 2024 23:59:09 GMT, Chris Plummer <cjplummer at openjdk.org> wrote:

> I was looking at the failure_handler output for the lldb command on a macosx-aarch64 core file (it is trying to use lldb to get a back trace of all threads), and noticed it timed out:
> 
> 
> ----------------------------------------
> [2024-07-15 05:15:47] [<snip>/usr/bin/lldb, --core, <snip>/core.92643, <snip>/bin/java, -o, thread backtrace all, -o, quit] timeout=20000 in <snip>
> ----------------------------------------
> (lldb) target create "<snip>/bin/java" --core "<snip>/core.92643"
> WARNING: tool timed out: killed process after 20000 ms
> ----------------------------------------
> [2024-07-15 05:16:07] exit code: -2 time: 20163 ms
> ----------------------------------------
> 
> 
> 20 seconds is the failure_handler default timeout for all commands. Core files on macosx-aarch64 tend to be very large. This one was over 13 GB. On my MacBook Pro the lldb command took 30 seconds. I bumped the timeout up to 60 seconds and reproduced the same crash in mach5 (more than once). The lldb command usually took about 55 seconds, but it did succeed with the longer timeout. I think we should raise the timeout to well above 60 seconds to make sure we won't see timeouts. 120 seconds is probably a reasonable value.

Thanks for this change - thread dumps are often crucial for investigating timeouts.
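
For readers unfamiliar with the mechanism, the limit discussed above is a wall-clock timeout applied to each external command. The following is only a minimal sketch of that pattern, not the failure_handler's actual code: the class name `TimedCommand`, the method `run`, and the `TIMED_OUT` constant are hypothetical, and reusing -2 as the sentinel is an assumption taken from the "exit code: -2" line in the log.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class TimedCommand {
    // Hypothetical sentinel for "killed after timeout"; -2 mirrors the log above (assumption).
    static final int TIMED_OUT = -2;

    // Runs the command and waits at most timeoutMillis for it to finish.
    // Returns the process exit code, or TIMED_OUT if the process had to be killed.
    static int run(List<String> command, long timeoutMillis)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command)
                .redirectErrorStream(true)
                // Discard output so a chatty tool cannot block on a full pipe.
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();
        if (!p.waitFor(timeoutMillis, TimeUnit.MILLISECONDS)) {
            p.destroyForcibly(); // the tool exceeded its budget: kill it
            p.waitFor();         // reap the killed process
            return TIMED_OUT;
        }
        return p.exitValue();
    }

    public static void main(String[] args) throws Exception {
        // A 2 second command with a 500 ms budget is killed ...
        System.out.println(run(List.of("sleep", "2"), 500));
        // ... while the same command with a larger budget completes normally.
        System.out.println(run(List.of("sleep", "2"), 10_000));
    }
}
```

With this shape, raising the budget from 20000 to 120000 ms for a slow tool such as lldb on a large core file only changes the `timeoutMillis` argument; a command that finishes early is not delayed by the larger value.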

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20206#issuecomment-2235856819


More information about the hotspot-dev mailing list