RFR: 8255678: Add Mutex::try_lock version without rank checks [v5]
Coleen Phillimore
coleenp at openjdk.java.net
Thu Nov 19 13:24:06 UTC 2020
On Thu, 19 Nov 2020 04:06:18 GMT, Patricio Chilano Mateo <pchilanomate at openjdk.org> wrote:
>> Please review the following patch to add method Mutex::try_lock_without_rank_check().
>>
>> The ranking system is used to prevent deadlocks while trying to acquire ownership of a Mutex/Monitor. It does so by preventing circular waits. For that we assign a rank to each Mutex/Monitor and we enforce that ownership is acquired in order of decreasing rank. Today we validate this order everytime we acquire ownership of a Mutex/Monitor. We can relax this validation for try_lock() in cases where failure to acquire ownership will not block the thread's progress, i.e. cases where the action can just be skipped (e.g. see JDK-8254877). For those cases, even if the Mutex/Monitor is out of rank order the thread will not block if it already has an owner, so circular waits are not possible.
>>
>> The patch adds method Mutex::try_lock_without_rank_check() which behaves just like the normal try_lock() but skips rank checks. If the call is successful the method sets field _skip_rank_check for that Mutex/Monitor, so that future rank validations (in case the thread acquires another Mutex/Monitor) will not fail when traversing the list of already acquired locks and checking they are in increasing rank order. The field is reset back on unlock(). For Monitors we still validate that wait() is done on the lock with less rank from all the owned locks.
>>
>> Today rank checks are executed after locking the underlying PlatformMutex/PlatforMonitor. This means that we might end up in an actual deadlock before asserting it. For the wait() case there is a similar situation, where we check that a thread waits on the Monitor with less rank from all the locks it owns after the underlying PlatformMonitor wait() call returns. Since doing those checks there might be already too late I moved them to the beginning of the respective lock/wait methods.
>>
>> Also, all these rank checks are skipped under certain conditions (Mutex/Monitor of rank native or suspend_resume, or at safepoint) and I didn't change any of that. I realized there are inconsistencies though in skipping the rank check when acquiring a Mutex/Monitor of rank suspend_resume but not when traversing the list of owned locks and checking that those are in increasing rank order.
>>
>> I noticed the rank related fields are defined within an "#ifndef PRODUCT" macro (optimized and debug) but the code that uses them is defined within "#ifdef ASSERT" macros (debug only), so I did a little clean up there too. Basically from that non PRODUCT block I moved everything to an ASSERT block, except for _allow_vm_block, and consolidated it with the ASSERT block defined at the end of the class definition.
>>
>> In summary the patch contains the following changes:
>> - Add method Mutex::try_lock_without_rank_check()
>> - Move rank checks before trying to lock/wait on the underlying PlatformMonitor
>> - Clean up definitions for optimized/debug builds
>> - Add a new gtest to test for different rank violations scenarios
>>
>> Tested it in mach5 tier1-6. I also checked it builds for release/optimized/fastdebug cases.
>>
>> Thanks,
>> Patricio
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
>
> early test check_can_be_skipped + extra try_lock test
Looks great. Thanks!
-------------
Marked as reviewed by coleenp (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/1242
More information about the hotspot-runtime-dev
mailing list