RFR: 8354560: Exponentially delay subsequent native thread creation in case of EAGAIN [v3]

David Holmes dholmes at openjdk.org
Thu May 8 07:23:56 UTC 2025


On Tue, 6 May 2025 22:52:59 GMT, Yannik Stradmann <duke at openjdk.org> wrote:

>> This change introduces an exponential backoff when hitting `EAGAIN` during native thread creation in hotspot.
>> 
>> In contrast to the current solution, where we retry creating a native thread up to three times in a tight loop, hotspot will thereby be kinder to an already depleted resource, reduce stress on the kernel and become more robust on systems under high load.
>> 
>> The proposed modifications to `os_linux.cpp` have substantially improved system stability in a mid-sized Jenkins cluster and have been in production within our systems over the past three years. I have ported these verbatim to the other platforms, which previously relied on identical logic.
>
> Yannik Stradmann has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
> 
>  - Merge remote-tracking branch 'upstream/master' into robust_pthread
>  - Fix build on Windows: Sleep() only accepts milliseconds
>  - Exponentially delay native thread creation retries

Sorry for the delay but I have been on vacation (and no one else picked this up).

Seems okay in principle but a couple of small issues below.

Thanks

src/hotspot/os/bsd/os_bsd.cpp line 647:

> 645:     pthread_t tid;
> 646:     int ret = 0;
> 647:     {

Why the extra block scope?

src/hotspot/os/bsd/os_bsd.cpp line 661:

> 659:         }
> 660: 
> 661:         log_warning(os, thread)("Failed to start native thread (%s), retrying after %dus.", os::errno_name(ret), next_delay);

I don't think we want to issue a warning unless we completely fail to start the native thread. For debugging purposes this may be better as a log_debug.
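
To illustrate the logging suggestion, here is a minimal standalone sketch. It is not the code from the PR and not HotSpot code: `create_with_backoff`, its parameters, the attempt count and the initial delay are all hypothetical, and `fprintf` to stderr stands in for `log_debug` / `log_warning`. It only shows the shape: each retry after `EAGAIN` is reported at debug level, and a warning is issued once, when all attempts have failed.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <cstdio>
#include <functional>
#include <thread>

// Hypothetical helper (illustration only, not part of HotSpot).
// Calls `create` until it returns something other than EAGAIN or until
// `max_attempts` (assumed >= 1) calls have been made. The delay between
// attempts doubles after every retry.
int create_with_backoff(const std::function<int()>& create,
                        int max_attempts = 4,
                        std::chrono::microseconds first_delay =
                            std::chrono::microseconds(1000)) {
  int ret = 0;
  std::chrono::microseconds delay = first_delay;
  for (int attempt = 1; attempt <= max_attempts; attempt++) {
    ret = create();
    if (ret != EAGAIN) {
      return ret;  // success (0) or an error that retrying will not fix
    }
    if (attempt == max_attempts) {
      break;  // out of attempts, do not sleep again
    }
    // Stand-in for log_debug: a retry is not yet a failure.
    std::fprintf(stderr, "[debug] thread creation hit EAGAIN, retrying after %lldus\n",
                 static_cast<long long>(delay.count()));
    std::this_thread::sleep_for(delay);
    delay *= 2;  // exponential backoff
  }
  // Stand-in for log_warning: reported once, only on complete failure.
  std::fprintf(stderr, "[warning] failed to start native thread after %d attempts\n",
               max_attempts);
  return ret;
}
```

With POSIX threads the callable would wrap the real call, for example `create_with_backoff([&] { return pthread_create(&tid, &attr, start_routine, arg); })`, where `tid`, `attr`, `start_routine` and `arg` are whatever the caller already has in scope.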

-------------

Changes requested by dholmes (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24682#pullrequestreview-2824072686
PR Review Comment: https://git.openjdk.org/jdk/pull/24682#discussion_r2079056634
PR Review Comment: https://git.openjdk.org/jdk/pull/24682#discussion_r2079059986

