RFR: 8273695: Safepoint deadlock on VMOperation_lock

David Holmes dholmes at openjdk.java.net
Wed Sep 22 01:38:01 UTC 2021


On Tue, 21 Sep 2021 13:17:43 GMT, Robbin Ehn <rehn at openjdk.org> wrote:

> We should not do any processing in SM::should_process().
> The query is used to determine if we need to process safepoints/handshakes and, with this change, StackWatermark processing.
> When locking a Mutex which may be acquired in such processing, we must release that Mutex before we can start processing, otherwise we can deadlock.
> 
> This change adds a method to determine if StackWatermarkSet::on_safepoint() will do any processing.
> If the poll is armed, suspend handshakes are not allowed, and there is no safepoint and no non-suspend handshake pending, we still return true if StackWatermarkSet needs processing.
> Thus the querying code should release any such Mutex and call SM::process_if_requested().
> 
> cross_modify_fence() does not have any such state, so we still need to emit it before returning false if the poll is armed.
> 
> Passes t1-t4 and local stressing.

Hi Robbin,

Not knowing anything about stack watermarks and their use, I can't really review this in any detail. Avoiding the call to `StackWatermarkSet::on_safepoint` so that `should_process` is just a query does seem like a reasonable thing to do to avoid the problem. That said, I'm more generally concerned that something hooked into the lowest levels of the safepoint/handshake code can itself have a dependency on execution of a safepoint operation!
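
To make the query/release/process ordering concrete, here is a minimal, self-contained sketch of the pattern as I understand it from the description. It is not HotSpot code: `PollState`, `lock_with_poll_check` and `demo` are hypothetical names, and `std::mutex` merely stands in for a HotSpot Mutex that the processing step may itself need.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a thread's poll state; not the real SafepointMechanism.
struct PollState {
    std::atomic<bool> armed{false};
    int processed = 0;

    // Pure query: reports whether work is pending but performs none of it.
    bool should_process() const { return armed.load(std::memory_order_acquire); }

    // Performs the pending work. In this sketch the work needs op_lock,
    // so the caller must not hold op_lock when calling this.
    void process_if_requested(std::mutex& op_lock) {
        if (!should_process()) return;
        std::lock_guard<std::mutex> guard(op_lock);
        ++processed;
        armed.store(false, std::memory_order_release);
    }
};

// Acquire op_lock, but back off and process whenever the poll is armed.
// On return the caller holds op_lock.
void lock_with_poll_check(std::mutex& op_lock, PollState& poll) {
    for (;;) {
        op_lock.lock();
        if (!poll.should_process()) return;  // nothing pending: keep the lock
        op_lock.unlock();                    // release before doing any processing
        poll.process_if_requested(op_lock);  // safe now: the lock is free
    }
}

// Usage: arm the poll, take the lock, and report how often the work ran.
int demo() {
    std::mutex op_lock;
    PollState poll;
    poll.armed.store(true);
    lock_with_poll_check(op_lock, poll);
    assert(!poll.should_process());  // pending work was handled before we kept the lock
    int n = poll.processed;
    op_lock.unlock();
    return n;
}
```

If `should_process()` did the processing itself while `op_lock` was held, the non-recursive lock inside the processing step would be requested a second time by the same thread, which is the self-deadlock shape the change is meant to avoid.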

Cheers,
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/5613


More information about the hotspot-runtime-dev mailing list