RFR: JDK-8303861: Error handling step timeouts should never be blocked by OnError and others [v2]

Thomas Stuefe stuefe at openjdk.org
Fri Mar 10 07:36:17 UTC 2023


On Fri, 10 Mar 2023 07:20:00 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> src/hotspot/share/runtime/nonJavaThread.cpp line 274:
>> 
>>> 272: 
>>> 273:         // Wait a second, then recheck for timeout.
>>> 274:         os::naked_short_sleep(999);
>> 
>> Harmless change but I don't see why we need sub-second resolution when the ErrorLogTimeout is in seconds. ??
>
> This matters because it applies to each step timeout too. If we check only once per second, we overshoot each timeout by up to one second, and this happens for every step timeout. If we are in a situation like here, where we ignore the global step timeout and keep running into deadlocks (e.g. in malloc), we will encounter a lot of step timeouts, and these overshoots would add up.

Note that this matters even more in the context of https://github.com/openjdk/jdk/pull/11017, since that will make the error reporting steps more fine-grained, potentially exposing us to a lot more individual step timeouts.
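
To make the arithmetic concrete, here is a rough sketch of how the overshoot accumulates. This is a simplified model, not HotSpot code: the function names, the polling intervals, and the assumption that each hanging step starts shortly after a watcher wakeup are all hypothetical and only serve to illustrate the point.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model, not HotSpot code: a watcher wakes up every poll_ms
// milliseconds (at multiples of poll_ms) and only then notices that a step
// has exceeded its timeout. Returns the time at which the timeout is detected.
int64_t detection_time_ms(int64_t step_start_ms, int64_t timeout_ms, int64_t poll_ms) {
  assert(poll_ms > 0);
  const int64_t deadline = step_start_ms + timeout_ms;
  // First wakeup at or after the deadline (round up to a multiple of poll_ms).
  return ((deadline + poll_ms - 1) / poll_ms) * poll_ms;
}

// Total time spent beyond the nominal timeouts when `steps` consecutive steps
// all hang. Assumption: each step starts start_offset_ms after the wakeup that
// detected the previous timeout.
int64_t total_overshoot_ms(int steps, int64_t timeout_ms, int64_t poll_ms,
                           int64_t start_offset_ms) {
  int64_t now = start_offset_ms;
  int64_t overshoot = 0;
  for (int i = 0; i < steps; i++) {
    const int64_t detected = detection_time_ms(now, timeout_ms, poll_ms);
    overshoot += detected - (now + timeout_ms);
    now = detected + start_offset_ms;
  }
  return overshoot;
}
```

In this model, ten hanging steps with a 2000 ms step timeout and a 1000 ms polling interval overshoot by 9990 ms in total (`total_overshoot_ms(10, 2000, 1000, 1)`), whereas a 100 ms polling interval brings that down to 990 ms. The more steps we have, the more the coarse interval costs us.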

-------------

PR: https://git.openjdk.org/jdk/pull/12936

