RFR: 8334594: Generational ZGC: Deadlock after OopMap rewrites in 8331572

Erik Österlund eosterlund at openjdk.org
Thu Jun 20 10:00:11 UTC 2024


On Thu, 20 Jun 2024 09:11:04 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> As shown in the bug, there are cases when acquiring the `ServiceLock` for opportunistic notification leads to deadlock. We can untie the deadlock by checking if `ServiceLock` can be acquired on the triggering path, and never blocking otherwise.
> 
> Additional testing:
>  - [ ] Linux x86_64 service fastdebug, `all`
>  - [ ] Linux AArch64 service fastdebug, `all`

This looks good as a direct fix to the bug. I agree though with the assessment that we should use a different lock for the queue going forward.
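
The "notify only if the lock is free" idea can be sketched roughly as below. This is a minimal illustration in standard C++ and not the HotSpot code: `CleanupTrigger`, `try_notify` and `cleanup_requested` are hypothetical names, and `std::mutex` merely stands in for the `ServiceLock` monitor.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the state guarded by the service lock.
// Illustrative only; this is not the HotSpot Monitor API.
class CleanupTrigger {
 public:
  // Opportunistic notification: returns true if the request was recorded
  // and waiters were notified, false if the lock was busy and the attempt
  // was skipped. The caller never blocks on this path.
  bool try_notify() {
    std::unique_lock<std::mutex> guard(_lock, std::try_to_lock);
    if (!guard.owns_lock()) {
      return false;  // lock is held elsewhere: give up instead of waiting
    }
    _cleanup_requested = true;
    _cv.notify_all();
    return true;
  }

  // Reads the flag under the lock (blocking is fine off the trigger path).
  bool cleanup_requested() {
    std::lock_guard<std::mutex> guard(_lock);
    return _cleanup_requested;
  }

 private:
  std::mutex _lock;
  std::condition_variable _cv;
  bool _cleanup_requested = false;
};
```

A skipped notification is acceptable in this sketch only because the trigger is opportunistic: a later trigger, or the waiter's own periodic wakeup, is assumed to pick up the pending work.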

It's also interesting to read the placement of these hooks. For the GC VM operation there is a comment saying that we probably just used the oop map cache so now is a good time to trigger cleanup.

The same comment is present for ZGC and Shenandoah where we perform safepoints. But there the situation is typically the exact opposite of what the comment suggests. Since we perform concurrent root scanning, we are *just about to* use the oop map cache, so the placement is probably the most unfortunate one. It seems like we clean the caches *just before* using them instead of just after. But again, that seems like a follow-up thing.

-------------

Marked as reviewed by eosterlund (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/19800#pullrequestreview-2129994072

