RFR: 8318986: Improve GenericWaitBarrier performance

Aleksey Shipilev shade at openjdk.org
Wed Nov 1 23:11:08 UTC 2023


See the symptoms, reproducer and analysis in the bug.

The current code waits in `disarm()`, which effectively stalls leaving the safepoint if some threads lag behind. Having more runnable threads than CPUs nearly guarantees that we would wait for quite some time. Just waiting at `arm()` is insufficient, but we can use several `Semaphore`s to do what we want.

This PR implements a more efficient `GenericWaitBarrier` to recover the performance. Most of the implementation discussion is in the code comments. The key observation that drives this work is that we want to reuse `Semaphore` and related counters without being stuck waiting for threads to leave.
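To make the reuse idea concrete, here is a minimal, self-contained sketch. It is not the code from this PR: `MultiCellBarrier`, `SimpleSemaphore`, `kCells` and `run_rounds` are hypothetical names, the semaphore is a hand-rolled stand-in for HotSpot's `Semaphore`, and the sketch assumes a single arming thread and positive tags that do not wrap. The point it illustrates is that `disarm()` only signals and returns, while the wait for lagging threads moves to the rare case where `arm()` reuses a cell that still has unconsumed wakeups.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hand-rolled counting semaphore (stand-in, not HotSpot's Semaphore).
class SimpleSemaphore {
  std::mutex m_;
  std::condition_variable cv_;
  int permits_ = 0;
public:
  void signal(int n) {
    { std::lock_guard<std::mutex> g(m_); permits_ += n; }
    cv_.notify_all();
  }
  void wait() {
    std::unique_lock<std::mutex> l(m_);
    cv_.wait(l, [this] { return permits_ > 0; });
    --permits_;
  }
};

// Hypothetical multi-cell barrier: each tag maps to one of kCells cells.
class MultiCellBarrier {
  static constexpr int kCells = 4;
  struct Cell {
    SimpleSemaphore sem;
    std::mutex lock;
    bool armed = false;
    int waiters = 0;                  // registered waiters, guarded by lock
    std::atomic<int> outstanding{0};  // wakeups signalled but not yet consumed
  };
  Cell cells_[kCells];
  std::atomic<int> tag_{0};           // 0 means "disarmed"
  Cell& cell_for(int tag) { return cells_[tag % kCells]; }
public:
  void arm(int tag) {
    assert(tag > 0);
    Cell& c = cell_for(tag);
    // Only stalls when this cell is being reused and old waiters still lag.
    while (c.outstanding.load() != 0) std::this_thread::yield();
    { std::lock_guard<std::mutex> g(c.lock); c.armed = true; }
    tag_.store(tag);
  }
  void disarm() {
    Cell& c = cell_for(tag_.exchange(0));
    std::lock_guard<std::mutex> g(c.lock);
    c.armed = false;
    c.outstanding.fetch_add(c.waiters);
    c.sem.signal(c.waiters);          // wake waiters, do not wait for them to leave
    c.waiters = 0;
  }
  void wait(int tag) {
    if (tag_.load() != tag) return;   // fast path: not armed for this tag
    Cell& c = cell_for(tag);
    {
      std::lock_guard<std::mutex> g(c.lock);
      if (!c.armed || tag_.load() != tag) return;  // disarmed or re-armed meanwhile
      ++c.waiters;
    }
    c.sem.wait();
    c.outstanding.fetch_sub(1);       // this wakeup has been consumed
  }
};

// Arms and disarms `rounds` times; returns how many threads passed the barrier.
int run_rounds(int rounds, int threads_per_round) {
  MultiCellBarrier barrier;
  std::atomic<int> released{0};
  std::vector<std::thread> all;
  for (int tag = 1; tag <= rounds; ++tag) {
    barrier.arm(tag);
    for (int i = 0; i < threads_per_round; ++i)
      all.emplace_back([&barrier, &released, tag] { barrier.wait(tag); ++released; });
    barrier.disarm();                 // returns even if the threads above have not run yet
  }
  for (auto& t : all) t.join();
  return released.load();
}
```

With four cells, `run_rounds(8, 4)` reuses every cell once, so it exercises both the fast `disarm()` and the drain check in `arm()`. The actual patch keeps its counters differently; see the code comments in the PR for the real protocol.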

(AFAICS, the futex-based `LinuxWaitBarrier` does roughly the same, but handles this reuse on the futex side, by assigning the "address" per futex.)

This issue affects everything except Linux. I initially found this on my M1 Mac, but I am pretty sure it is easy to reproduce on Windows as well. The safepoints from the reproducer in the bug improved dramatically on a Mac. Note not only the two orders of magnitude better safepoint times, but also the >2x more GC safepoints in the time-bound allocation test, which means the attainable GC throughput is at least 2x higher, since we don't waste time at this wait barrier.

![plot-generic-wait-barrier-macos](https://github.com/openjdk/jdk/assets/1858943/28cf22d3-b5ca-44fb-bde7-47189d14b47b)

Additional testing:
  - [x] MacOS AArch64 server fastdebug, `tier1`
  - [x] Linux x86_64 server fastdebug, `tier1 tier2 tier3` (generic wait barrier enabled explicitly)
  - [x] Linux AArch64 server fastdebug, `tier1 tier2 tier3` (generic wait barrier enabled explicitly)
  - [x] MacOS AArch64 server fastdebug, `tier2 tier3`
  - [ ] Linux x86_64 server fastdebug, `tier4` (generic wait barrier enabled explicitly)
  - [ ] Linux AArch64 server fastdebug, `tier4` (generic wait barrier enabled explicitly)

-------------

Commit messages:
 - Fix

Changes: https://git.openjdk.org/jdk/pull/16404/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16404&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8318986
  Stats: 186 lines in 3 files changed: 126 ins; 17 del; 43 mod
  Patch: https://git.openjdk.org/jdk/pull/16404.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16404/head:pull/16404

PR: https://git.openjdk.org/jdk/pull/16404


More information about the hotspot-dev mailing list