RFR: 8255678: Add Mutex::try_lock version without rank checks
Daniel D. Daugherty
dcubed at openjdk.java.net
Tue Nov 17 23:56:09 UTC 2020
On Mon, 16 Nov 2020 22:58:35 GMT, Patricio Chilano Mateo <pchilanomate at openjdk.org> wrote:
> Please review the following patch to add method Mutex::try_lock_without_rank_check().
>
> The ranking system is used to prevent deadlocks while trying to acquire ownership of a Mutex/Monitor. It does so by preventing circular waits: we assign a rank to each Mutex/Monitor and enforce that ownership is acquired in order of decreasing rank. Today we validate this order every time we acquire ownership of a Mutex/Monitor. We can relax this validation for try_lock() in cases where failure to acquire ownership will not block the thread's progress, i.e. cases where the action can simply be skipped (e.g. see JDK-8254877). In those cases, even if the Mutex/Monitor is acquired out of rank order, the thread will not block when the lock already has an owner, so circular waits are not possible.
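As I read this, the intended usage pattern is roughly the sketch below; SomeOptionalWork_lock and do_optional_work() are made-up names, the point being that a failed acquire is simply skipped rather than waited on:

    // Hypothetical caller: the work is optional, so a failed try_lock is
    // skipped and the thread never blocks on an out-of-rank-order lock.
    if (SomeOptionalWork_lock->try_lock_without_rank_check()) {
      do_optional_work();               // only while we own the lock
      SomeOptionalWork_lock->unlock();  // also clears _skip_rank_check
    } // else: skip the work entirely -- no waiting, hence no circular wait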
>
> The patch adds method Mutex::try_lock_without_rank_check(), which behaves just like the normal try_lock() but skips rank checks. If the call is successful the method sets field _skip_rank_check for that Mutex/Monitor, so that future rank validations (in case the thread acquires another Mutex/Monitor) will not fail when traversing the list of already acquired locks and checking that they are in increasing rank order. The field is reset on unlock(). For Monitors we still validate that wait() is done on the lowest-ranked of all the owned locks.
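A rough sketch of how I picture the new method, modeled on the existing try_lock() fast path (the actual patch may well factor this differently, e.g. through a shared helper, so treat it as illustration only):

    bool Mutex::try_lock_without_rank_check() {
      Thread* const self = Thread::current();
      check_block_state(self);
      // Deliberately no rank check against self's already-owned locks here.
      if (_lock.try_lock()) {                  // underlying platform lock
        assert_owner(NULL);
        set_owner(self);
        DEBUG_ONLY(_skip_rank_check = true;)   // cleared again in unlock()
        return true;
      }
      return false;
    }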
>
> Today rank checks are executed after locking the underlying PlatformMutex/PlatformMonitor. This means that we might end up in an actual deadlock before asserting it. The wait() case is similar: we check that a thread waits on the lowest-ranked Monitor of all the locks it owns only after the underlying PlatformMonitor wait() call returns. Since doing those checks at that point might already be too late, I moved them to the beginning of the respective lock/wait methods.
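In other words (check_rank_order() below is just a placeholder name for the rank validation, to show the reordering):

    // Before: acquire first, then assert -- if we really do deadlock on the
    // platform lock, the assert never gets a chance to fire.
    _lock.lock();
    check_rank_order(self);

    // After: assert first, while the thread can still make progress, then
    // block on the underlying PlatformMutex.
    check_rank_order(self);
    _lock.lock();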
>
> Also, all these rank checks are skipped under certain conditions (a Mutex/Monitor of rank native or suspend_resume, or at a safepoint) and I didn't change any of that. I did notice an inconsistency, though: the rank check is skipped when acquiring a Mutex/Monitor of rank suspend_resume, but not when traversing the list of owned locks and checking that those are in increasing rank order.
>
> I noticed the rank-related fields are declared within an "#ifndef PRODUCT" block (optimized and debug) while the code that uses them is guarded by "#ifdef ASSERT" (debug only), so I did a little cleanup there too. Basically I moved everything from that non-PRODUCT block, except for _allow_vm_block, to an ASSERT block and consolidated it with the ASSERT block defined at the end of the class definition.
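If I follow the cleanup correctly, the declarations go roughly from the first sketch below to the second (field list abbreviated; only _allow_vm_block stays non-PRODUCT):

    // Before (sketch): fields compiled into optimized builds even though
    // only ASSERT-only (debug) code ever reads most of them.
    #ifndef PRODUCT
      bool    _allow_vm_block;
      int     _rank;
      Mutex*  _next;
      Thread* _last_owner;
    #endif

    // After (sketch): the rank bookkeeping joins the other ASSERT-only
    // declarations at the end of the class definition.
    #ifndef PRODUCT
      bool    _allow_vm_block;
    #endif
    #ifdef ASSERT
      int     _rank;
      Mutex*  _next;
      Thread* _last_owner;
      bool    _skip_rank_check;
    #endif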
>
> In summary the patch contains the following changes:
> - Add method Mutex::try_lock_without_rank_check()
> - Move rank checks before trying to lock/wait on the underlying PlatformMonitor
> - Clean up definitions for optimized/debug builds
> - Add a new gtest to test for different rank violations scenarios
>
> Tested it in mach5 tier1-6. I also checked that it builds in the release/optimized/fastdebug cases.
>
> Thanks,
> Patricio
Your call on whether to fix the nits. I like the new test!
Most of the longer stress test runs happen in Tier7 and
Tier8, so you should also run those.
src/hotspot/share/runtime/mutex.cpp line 421:
> 419: thread->print_owned_locks();
> 420: assert(false, "Attempting to acquire lock %s/%d out of order with lock %s/%d -- "
> 421: "possible deadlock", this->name(), this->rank(),least->name(), least->rank());
nit - missing blank - s/this->rank(),/this->rank(), /
src/hotspot/share/runtime/mutex.hpp line 91:
> 89: Mutex* _next; // Used by a Thread to link up owned locks
> 90: Thread* _last_owner; // the last thread to own the lock
> 91: bool _skip_rank_check; // read only by owner whend doing rank checks
nit - typo: s/whend/when/
-------------
Marked as reviewed by dcubed (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/1242