RFR: JDK-8265298: Hard VM crash when deadlock between "access" and higher ranked lock is detected

Aleksey Shipilev shade at openjdk.java.net
Thu Apr 15 19:44:35 UTC 2021


On Thu, 15 Apr 2021 18:44:06 GMT, Patricio Chilano Mateo <pchilanomate at openjdk.org> wrote:

>> I stumbled upon this when doing some Shenandoah work. The development code tried to lock the `leaf` lock while already holding the `access` lock. Normally the VM would have detected this, but instead we tried to recursively acquire `tty_lock` for `Thread::print_owned_locks`. That `tty` lock is still ranked higher than `access`, so deadlock detection triggers over and over again until we run out of stack and crash hard. `tty` and `access` are ranked this way because of [JDK-8214315](https://bugs.openjdk.java.net/browse/JDK-8214315). Read the rest in the bug.  
>> 
>> I believe the way out is to only enter `Thread::print_owned_locks` when we know deadlock detection code would not run in circles. New test for `access` and `leaf` shows the original failure. Another new test checks `tty` and `special` to verify that the check should be `> tty`, not `>= tty`.
>> 
>> Additional testing:
>>  - [x] Linux x86_64 fastdebug tier1
>>  - [x] New regression tests (fail without the patch, pass with it)
>
> test/hotspot/gtest/runtime/test_mutex_rank.cpp line 210:
> 
>> 208:   monitor_rank_access->lock_without_safepoint_check();
>> 209:   monitor_rank_leaf->lock_without_safepoint_check();
>> 210:   monitor_rank_leaf->wait_without_safepoint_check(1);
> 
> If you want to exercise the wait() error case you should lock the leaf lock first and then the access one. Then you will get the assert at wait_without_safepoint_check(1). You will have to fix the error message that we check for above to be the one used in the wait case.

The intent is to lock `access` first, then `leaf` -- after all, that was the issue I initially found. About `wait`: I copy-pasted the shape of the test from `monitor_wait_rank_special` above. It did seem dubious that we effectively expect the assert to fire at `monitor_rank_leaf->lock_without_safepoint_check();`, before `wait` is ever reached, but I thought having `wait` made the code a bit more readable. I can just remove `wait` here.
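
To make the failure mode concrete, here is a toy model of the recursion and of the `> tty` guard. It is only a sketch and not the HotSpot code: `ToyThread`, `reports_after`, the numeric rank values, and the `kMaxReports` cap (a stand-in for running out of stack) are all made up for illustration. It also assumes that a lock may only be taken when its rank is strictly below every rank already held, and that `special` and `leaf` rank above `tty`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical rank values. Only "tty ranks above access" is stated in the
// discussion; the rest of the ordering is an assumption of this sketch.
enum Rank { kAccess = 1, kTty = 3, kSpecial = 6, kLeaf = 20 };

// Stand-in for "we ran out of stack": stop recursing after this many reports.
constexpr int kMaxReports = 100;

struct ToyThread {
  std::vector<int> held;     // ranks of the locks this thread currently owns
  bool guard_print = false;  // true models the "> tty" guard
  int reports = 0;           // number of rank violations reported so far

  int lowest_held() const {
    assert(!held.empty());
    return *std::min_element(held.begin(), held.end());
  }

  // Assumed rule: a new lock must rank strictly below every lock already held.
  bool lock(int rank) {
    if (!held.empty() && rank >= lowest_held()) {
      report_violation();
      return false;
    }
    held.push_back(rank);
    return true;
  }

  void report_violation() {
    if (++reports >= kMaxReports) return;  // simulated stack exhaustion
    // Printing takes the tty lock, so it is only safe when every held lock
    // ranks strictly above tty.
    if (!guard_print || lowest_held() > kTty) {
      print_owned_locks();
    }
  }

  void print_owned_locks() {
    // Acquire tty for printing; release it again if the acquisition worked.
    if (lock(kTty)) held.pop_back();
  }
};

// Locks `first`, then `second`, and returns how many violations were reported.
int reports_after(bool guard_print, int first, int second) {
  ToyThread t;
  t.guard_print = guard_print;
  t.lock(first);
  t.lock(second);
  return t.reports;
}
```

In this model, `reports_after(false, kAccess, kLeaf)` keeps re-entering the report path until it hits the cap, while `reports_after(true, kAccess, kLeaf)` reports exactly once. `reports_after(true, kTty, kSpecial)` also reports once; a `>= tty` guard would have recursed there, because taking `tty` while already holding `tty` is itself a rank violation under the assumed rule.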

-------------

PR: https://git.openjdk.java.net/jdk/pull/3524


More information about the hotspot-runtime-dev mailing list