Integrated: 8319048: Monitor deflation unlink phase prolongs time to safepoint
Aleksey Shipilev
shade at openjdk.org
Tue Nov 28 09:52:31 UTC 2023
On Mon, 30 Oct 2023 08:20:37 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
> See the symptoms, reproducer, and analysis in the bug.
>
> There is a major problem in the current unlinking code: we only check for a safepoint every 1M monitors, which might take a while with a large population of dead monitors, prolonging time to safepoint (TTSP). Even if we spend 1ns per monitor, that's already +1ms in TTSP. In reality, we see double-digit-ms outliers in TTSP. (There is a secondary problem that comes with searching for a new `prev` if a monitor insertion happened while we were preparing the batch for unlinking; this might theoretically take unbounded time.)
>
> This PR fixes the issue by providing a smaller batch size for unlinking. The unlinking batch size basically defines two things: a) how often we check for a safepoint (`ObjectSynchronizer::chk_for_block_req` in the method below); and b) how much overhead we have on mutating the monitor lists. If we unlink monitors one by one, then in the worst case we would do a CAS on `head` and an atomic store to `OM._next` for every monitor, both of which are expensive if done per monitor.
>
> Experiments with the reproducer from the bug show that a threshold of 500 works well: it mitigates TTSP outliers nearly completely, while still providing a large enough batch size to absorb list mutation overheads. See how bad the outliers are in the baseline, how they get lower with smaller batch sizes, and how they almost completely disappear at 500. I believe the difference between the baseline and `MUB=1M` comes from short-cutting the search for the new `prev`.
>
> Additional testing:
> - [x] Linux AArch64 server fastdebug, `tier1 tier2 tier3 tier4`
> - [x] Linux x86_64 server fastdebug, `tier1 tier2 tier3 tier4`
> - [x] Ad-hoc performance tests, see above
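
The batch-size trade-off described in the quoted text can be illustrated with a small single-threaded sketch. This is not the HotSpot implementation: `Node`, `UnlinkStats`, `unlink_dead` and `make_all_dead` are hypothetical names, the list is a plain singly linked list with no concurrent insertion, and the "safepoint check" is only a counter. It shows how a capped batch bounds the work done between checks while keeping the number of list writes far below one per monitor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a monitor list node; not HotSpot's ObjectMonitor.
struct Node {
    bool dead;
    Node* next;
};

struct UnlinkStats {
    std::size_t unlinked = 0;     // dead nodes removed from the list
    std::size_t list_writes = 0;  // pointer stores into the list (head or next)
    std::size_t poll_checks = 0;  // simulated safepoint checks
};

// Unlinks dead nodes in runs of at most batch_size nodes. Each run costs one
// pointer store and is followed by one simulated safepoint check. Assumption:
// no other thread mutates the list, so no CAS or search for a new prev.
UnlinkStats unlink_dead(Node*& head, std::size_t batch_size) {
    if (batch_size == 0) batch_size = 1;  // guard against a non-advancing loop
    UnlinkStats stats;
    Node* prev = nullptr;
    Node* cur = head;
    while (cur != nullptr) {
        if (!cur->dead) {
            prev = cur;
            cur = cur->next;
            continue;
        }
        // Collect a run of consecutive dead nodes, capped at batch_size.
        std::size_t run = 0;
        Node* end = cur;
        while (end != nullptr && end->dead && run < batch_size) {
            end = end->next;
            ++run;
        }
        // A single store bypasses the whole run.
        if (prev == nullptr) {
            head = end;
        } else {
            prev->next = end;
        }
        ++stats.list_writes;
        stats.unlinked += run;
        ++stats.poll_checks;  // a real VM would check for a safepoint request here
        cur = end;
    }
    return stats;
}

// Builds a list of n dead nodes backed by the given vector.
Node* make_all_dead(std::vector<Node>& storage, std::size_t n) {
    storage.assign(n, Node{true, nullptr});
    for (std::size_t i = 0; i + 1 < n; ++i) {
        storage[i].next = &storage[i + 1];
    }
    return n == 0 ? nullptr : &storage[0];
}
```

In this sketch, 10000 dead nodes with a batch size of 500 need 20 list writes and 20 checks, while a batch size of 1 needs 10000 of each. Using the 1ns-per-monitor figure from the quoted text, a batch of 1M monitors puts about 1ms between checks, and a batch of 500 puts about 0.5us between them.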
This pull request has now been integrated.
Changeset: efc39225
Author: Aleksey Shipilev <shade at openjdk.org>
URL: https://git.openjdk.org/jdk/commit/efc392259c64986bbbe880259e95b09058b9076a
Stats: 235 lines in 4 files changed: 221 ins; 0 del; 14 mod
8319048: Monitor deflation unlink phase prolongs time to safepoint
Reviewed-by: ysr, stefank, aboldtch, dcubed
-------------
PR: https://git.openjdk.org/jdk/pull/16412