RFR: 8334482: Shenandoah: Deadlock when safepoint is pending during nmethods iteration
Aleksey Shipilev
shade at openjdk.org
Wed Jul 24 20:04:03 UTC 2024
There is a problem in the interaction between the STS taken in `ShenandoahConcurrentWeakRootsEvacUpdateTask::work` and nmethod list iteration. If we start the nmethod iteration in the `ShCWREUT` constructor, then proceed to the STS and block there, any waiter that waits for the end of iteration would block until the safepoint is over. If that waiter waits without a safepoint check, like anyone waiting on `CodeCache_lock`, e.g. the Sweeper (no longer in mainline, but we have seen a case in JDK 17) or a JIT compiler patching code, then we have a deadlock.
A solution is to initialize the iteration only immediately before the iterator is used. The worker then gets past any pending STS block before the iteration starts, which greatly reduces the risk of deadlock, as long as nothing in the nmethod handling itself reaches a safepoint.
Unfortunately, it is really hard to reproduce in mainline, since the absence of Sweeper makes this deadlock exceedingly rare.
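To illustrate the ordering change described above, here is a minimal single-threaded sketch. It is not the actual patch: `ToyNMethodTable`, `EagerIterationTask`, `LazyIterationTask` and the `iterating_at_sts_join` flag are hypothetical names made up for this example, and the STS join is modeled only as a point in `work()` where we record whether the iteration is already open. The real change also has to count workers, which this sketch ignores.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the nmethod table. While an iteration is open,
// waiters for the end of iteration (e.g. under CodeCache_lock) would block.
class ToyNMethodTable {
  std::vector<int> _entries{1, 2, 3};
  int _active_iterations = 0;
public:
  void iteration_begin() { ++_active_iterations; }
  void iteration_end() {
    assert(_active_iterations > 0);
    --_active_iterations;
  }
  bool iteration_in_progress() const { return _active_iterations > 0; }
  const std::vector<int>& entries() const { return _entries; }
};

// Problematic shape: the iteration is opened in the constructor, so it is
// already open when the worker reaches the (modeled) STS join in work().
class EagerIterationTask {
  ToyNMethodTable& _table;
public:
  bool iterating_at_sts_join = false;
  int visited = 0;

  explicit EagerIterationTask(ToyNMethodTable& table) : _table(table) {
    _table.iteration_begin();
  }
  ~EagerIterationTask() { _table.iteration_end(); }

  void work() {
    // Modeled STS join: a real worker could block here for a safepoint.
    iterating_at_sts_join = _table.iteration_in_progress();
    for (int e : _table.entries()) { (void)e; ++visited; }
  }
};

// Fixed shape: the iteration is opened only after the modeled STS join,
// immediately before the entries are walked, and closed right after.
class LazyIterationTask {
  ToyNMethodTable& _table;
public:
  bool iterating_at_sts_join = false;
  int visited = 0;

  explicit LazyIterationTask(ToyNMethodTable& table) : _table(table) {}

  void work() {
    // Modeled STS join: nothing is held open if the worker blocks here.
    iterating_at_sts_join = _table.iteration_in_progress();

    _table.iteration_begin();
    for (int e : _table.entries()) { (void)e; ++visited; }
    _table.iteration_end();
  }
};
```

In the eager variant the iteration is open at the point where the worker could block for a safepoint, which is the window a non-safepoint-checking waiter can get stuck in. In the lazy variant that window is closed, at the cost of having to coordinate begin/end among workers in real code.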
Additional testing:
- [x] Reproducer on JDK 17u with this patch applied now does not deadlock
- [x] Linux AArch64 server fastdebug, `all` with `-XX:+UseShenandoahGC`
- [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC`
-------------
Commit messages:
- Better worker counting, since some workers can already be parked at STS
- Counting threads better
- Touchups
- Fix
Changes: https://git.openjdk.org/jdk/pull/20309/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20309&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8334482
Stats: 72 lines in 4 files changed: 25 ins; 33 del; 14 mod
Patch: https://git.openjdk.org/jdk/pull/20309.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/20309/head:pull/20309
PR: https://git.openjdk.org/jdk/pull/20309