RFR: 8318986: Improve GenericWaitBarrier performance [v7]
Aleksey Shipilev
shade at openjdk.org
Tue Nov 21 09:36:31 UTC 2023
> See the symptoms, reproducer and analysis in the bug.
>
> Current code waits on `disarm()`, which effectively stalls leaving the safepoint if some threads lag behind. Having more runnable threads than CPUs nearly guarantees that we would wait for quite some time, but it also reproduces well if you have enough threads near the CPU count. Just waiting at `arm()` is insufficient, but we can have several `Semaphores` to do what we want.
>
> This PR implements a more efficient `GenericWaitBarrier` to recover the performance. Most of the implementation discussion is in the code comments. The key observation that drives this work is that we want to reuse `Semaphore` and related counters without being stuck waiting for threads to leave.
>
> (AFAICS, futex-based `LinuxWaitBarrier` does roughly the same, but handles this reuse on futex side, by assigning the "address" per futex.)
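
A rough sketch of the reuse idea described above, not the code in this PR: each barrier tag maps to its own cell, so `disarm()` only signals and returns, and a later `arm()` can pick a fresh cell while stragglers drain from the old one. The class name `CellWaitBarrier`, the cell count, and the use of `std::mutex`/`std::condition_variable` in place of HotSpot's `Semaphore` and atomic counters are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical illustration only; not HotSpot's GenericWaitBarrier.
class CellWaitBarrier {
  static constexpr std::size_t kCells = 4;  // assumed cell count

  struct Cell {
    std::mutex lock;
    std::condition_variable cv;
    int armed_tag = 0;  // 0 means "disarmed"
    int waiters = 0;    // threads currently blocked in this cell
  };

  std::array<Cell, kCells> cells_;
  int current_tag_ = 0;  // touched only by the arming/disarming thread

  Cell& cell_for(int tag) {
    return cells_[static_cast<std::size_t>(tag) % kCells];
  }

public:
  // Arm the barrier with a non-zero tag; the tag selects the cell.
  void arm(int tag) {
    assert(tag != 0 && current_tag_ == 0);
    Cell& c = cell_for(tag);
    std::lock_guard<std::mutex> g(c.lock);
    c.armed_tag = tag;
    current_tag_ = tag;
  }

  // Disarm and wake all waiters, without waiting for them to leave.
  void disarm() {
    assert(current_tag_ != 0);
    Cell& c = cell_for(current_tag_);
    {
      std::lock_guard<std::mutex> g(c.lock);
      c.armed_tag = 0;
    }
    c.cv.notify_all();
    current_tag_ = 0;
  }

  // Block while the cell is armed with exactly this tag; a stale tag
  // (already disarmed, or cell re-armed with another tag) returns at once.
  void wait(int tag) {
    Cell& c = cell_for(tag);
    std::unique_lock<std::mutex> g(c.lock);
    ++c.waiters;
    c.cv.wait(g, [&] { return c.armed_tag != tag; });
    --c.waiters;
  }

  // Number of threads currently blocked in the cell for this tag.
  int waiters(int tag) {
    Cell& c = cell_for(tag);
    std::lock_guard<std::mutex> g(c.lock);
    return c.waiters;
  }
};
```

A condition variable sidesteps the wakeup accounting that a semaphore-based barrier has to do by hand, so this sketch shows only the tag-to-cell mapping, not the counter handling the PR discusses in its code comments.
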
>
> This issue affects every platform except Linux. I initially found this on my M1 Mac, but I am pretty sure it is easy to reproduce on Windows as well. The safepoints from the reproducer in the bug improved dramatically on a Mac. Note not only the orders-of-magnitude better safepoint times, but also the several times more GC safepoints in the time-bound allocation test, which means the attainable GC throughput is similarly better, since we don't waste time at this wait barrier.
>
> Additional testing:
> - [x] MacOS AArch64 server fastdebug, `tier1`
> - [x] Linux x86_64 server fastdebug, `tier1 tier2 tier3` (generic wait barrier enabled explicitly)
> - [x] Linux AArch64 server fastdebug, `tier1 tier2 tier3` (generic wait barrier enabled explicitly)
> - [x] MacOS AArch64 server fastdebug, `tier2 tier3`
> - [x] Linux x86_64 server fastdebug, `tier4` (generic wait barrier enabled explicitly)
> - [x] Linux AArch64 server fastdebug, `tier4` (generic wait barrier enabled explicitly)
Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision:
- Do not SpinYield at disarm loop
- Merge branch 'master' into JDK-8318986-generic-wait-barrier
- Drop the Linux check in preparation for integration
- Merge branch 'master' into JDK-8318986-generic-wait-barrier
- Merge branch 'master' into JDK-8318986-generic-wait-barrier
- Rework paddings
- Encode barrier tag into state, resolving another race condition
- Simple review feedback fixes: tracking wakeup numbers, reflowing some methods
- Merge branch 'master' into JDK-8318986-generic-wait-barrier
- Touchups
- ... and 5 more: https://git.openjdk.org/jdk/compare/0e39d942...32b0a9c6
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/16404/files
- new: https://git.openjdk.org/jdk/pull/16404/files/191c0dbb..32b0a9c6
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=16404&range=06
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=16404&range=05-06
Stats: 12996 lines in 291 files changed: 6015 ins; 4707 del; 2274 mod
Patch: https://git.openjdk.org/jdk/pull/16404.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/16404/head:pull/16404
PR: https://git.openjdk.org/jdk/pull/16404