RFR: 8348402: PerfDataManager stalls shutdown for 1ms [v3]

David Holmes dholmes at openjdk.org
Fri Jan 31 04:41:48 UTC 2025


On Thu, 30 Jan 2025 18:51:08 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> Found this when studying Leyden performance. [JDK-8049304](https://bugs.openjdk.org/browse/JDK-8049304) added 1ms sleep on destruction path to catch up with threads updating the counters. 
>> 
>> I was not able to confidently prove the deletion race is benign. Even though `sleep` is not really a fix for that race either, I think it is safer to use GlobalCounter to coordinate deletions.
>> 
>> This improves startup (roundtrip) tests for the expected 1ms:
>> 
>> 
>> $ hyperfine -w 10 -r 100 ...
>> 
>> # Before
>> Benchmark 1: build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xmx128m Hello
>>   Time (mean ± σ):      20.0 ms ±   0.3 ms    [User: 11.7 ms, System: 16.0 ms]
>>   Range (min … max):    19.0 ms …  21.7 ms    1000 runs
>> 
>> # After
>> Benchmark 1: build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xmx128m Hello
>>   Time (mean ± σ):      19.0 ms ±   0.3 ms    [User: 11.9 ms, System: 15.8 ms]
>>   Range (min … max):    18.2 ms …  20.8 ms    1000 runs
>> 
>> 
>> Additional testing:
>>  - [x] Linux AArch64 server fastdebug, `all`
>>  - [x] Linux x86_64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
> 
>  - Merge branch 'master' into JDK-8348402-perfdatamanager-stall
>  - Also add a blurb in usage doc
>  - Only protect one dodgy place
>  - Merge branch 'master' into JDK-8348402-perfdatamanager-stall
>  - Fix minimal build
>  - Use GlobalCounter to coordinate deletions

I like the introduction of OM_PERFDATA_SAFE_OP. A few suggestions on comments below.

src/hotspot/share/runtime/objectMonitor.cpp line 946:

> 944:     // Note that the counter is not protected by a lock or updated by atomics.
> 945:     // That is by design - we trade "lossy" counters which are exposed to
> 946:     // races during updates for a lower probe effect.

I would suggest restoring the first three deleted lines as the counter update is still racy/lossy.

src/hotspot/share/runtime/objectMonitor.cpp line 1075:

> 1073:     // Note that the counter is not protected by a lock or updated by atomics.
> 1074:     // That is by design - we trade "lossy" counters which are exposed to
> 1075:     // races during updates for a lower probe effect.

I would suggest restoring the first three deleted lines as the counter update is still racy/lossy.

src/hotspot/share/runtime/objectMonitor.hpp line 214:

> 212:   // Only perform a PerfData operation if the PerfData object has been
> 213:   // allocated and if the PerfDataManager has not freed the PerfData
> 214:   // objects which can happen at normal VM shutdown. Additionally, we

This commentary also applies to the OM_PERFDATA_OP case. But we can also say to use that form when not safepoint-safe, and the below when safepoint-safe.

-------------

Changes requested by dholmes (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/23293#pullrequestreview-2585648557
PR Review Comment: https://git.openjdk.org/jdk/pull/23293#discussion_r1936669272
PR Review Comment: https://git.openjdk.org/jdk/pull/23293#discussion_r1936666464
PR Review Comment: https://git.openjdk.org/jdk/pull/23293#discussion_r1936668679


More information about the hotspot-runtime-dev mailing list