RFR: 8196083: Avoid locking in OopStorage::release

Wed Feb 7 21:32:56 UTC 2018


On 2/4/18 9:15 AM, Erik Österlund wrote:
> Hi David,
>
> This is starting to go a bit off-topic for this thread. But here goes...
>
> On 2018-02-04 13:55, David Holmes wrote:
>> On 4/02/2018 10:38 PM, Erik Osterlund wrote:
>>> Hi Kim,
>>>
>>> Looks complicated but good.
>>>
>>> It would be great in the future if the deadlock detection system 
>>> could be improved to not trigger such false positives that make us 
>>> implement tricky lock-free code to dodge the obviously false 
>>> positive deadlock assert. But I suppose that is out of scope for this.
>>
>> It isn't a deadlock-detection system it is a deadlock prevention 
>> system. If you honour the lock rankings then you can't get deadlocks. 
>> If you don't honour the lock rankings then you may get deadlocks. 
>> There isn't sufficient information in the ranking alone to know for 
>> sure whether you will or not.
>
> Okay. I guess I should have called it a potential deadlock situation 
> detection system. But it does not prevent deadlocks - that is up to 
> us. And since the checking is dynamic, we are never guaranteed not to 
> get deadlocks.

?? But the system does help with lock ordering problems that is a cause 
of deadlocks, so it does prevent that class of deadlocks.
>
>> If the deadlock possibility is so obviously not actually possible 
>> then that could be captured somehow for the specific locks involved. 
>> But I'm not aware of any tools we have that actually help us track 
>> what locks may concurrently be acquired - if we did then we would not 
>> need rank-based deadlock prevention checks.

>
> What I had in mind is something along the lines of dynamically 
> constructing a global partial ordering of the locks as they are 
> acquired, and to verify the global partial ordering is consistent and 
> not violated. That would be like a more precise version of the 
> manually constructed ordering we have today, and save us the trouble 
> of doing this manual picking of a number X, giving it a silly name 
> nobody understands like "leaf + 3", where leaf is not actually a leaf 
> at all - that's for "special", oh wait no there are more special lock 
> ranks than special. And then as testing is run, either manually 
> shuffling the ranks around to reflect the actual partial ordering the 
> code adheres to, or rewriting the code to be lock-free in fear of 
> getting intermittent false positive asserts triggered in testing after 
> moving ranks around (despite every failing test run actually strictly 
> conforming to a global partial ordering that was just not reflected 
> accurately by the numbers we picked).

I don't think the assert that Kim got was a false positive.  It may have 
detected a deadlock.  Finding these things through testing though is the 
real problem.  I don't know how to do this statically though, unless 
there are static code analysis tools for lock ordering violations (?)
>
> With such an automatic solution, we could also get a better picture of 
> the interactions between the locks when adding a new lock by printing 
> the actual partial ordering of the locks that was found at runtime, 
> instead of trying to figure out which other relevant locks the "+ 3" 
> in "leaf + 3" referred to.
>
> Of course, this is just an idea for the bright future where a magical 
> system can do lock ordering consistency checks automagically without 
> us resorting to complicated lock-free solutions for code that never 
> violates global lock ordering consistency (which is what the system 
> was designed to detect), because it is found easier to write a 
> lock-free solution to the problem at hand than to figure out how best 
> to shuffle the ranks around to capture the actual partial ordering the 
> locks consistently conform to. But for now, since we do not have such 
> a system, and its future existence is merely hypothetical, I am okay 
> with the proposed lock-free solution instead.
>

I agree with you that the system is a partial ordering rather than a 
real ordering and leaf+3, non_leaf-2 has really no meaning.  I would 
prefer a system that is more declarative and I think we should do 
something about this problem in the next release.   Can we collect yours 
and peoples thoughts in:

https://bugs.openjdk.java.net/browse/JDK-8176393

Thanks,
Coleen

> Thanks,
> /Erik
>
>> David
>> -----
>>
>>> Thanks,
>>> /Erik
>>>
>>>> On 3 Feb 2018, at 01:35, Kim Barrett <kim.barrett at oracle.com> wrote:
>>>>
>>>> Please review this change to the OopStorage::release operations to
>>>> eliminate their use of locks.  Rather than directly performing the
>>>> _allocate_list updates when the block containing the entries being
>>>> released undergoes a state transition (full to not-full, not-full to
>>>> empty), we instead record the occurrence of the transition. This
>>>> recording is performed via a lock-free push of the block onto a list
>>>> of such deferred updates, if the block is not already present in the
>>>> list.  Update requests are processed by later allocate and
>>>> delete_empty_block operations.
>>>>
>>>> Also backed out the JDK-8195979 lock rank changes for the JNI mutexes.
>>>> Those are no longer required to nested lock rank ordering errors.
>>>>
>>>> CR:
>>>> https://bugs.openjdk.java.net/browse/JDK-8196083
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~kbarrett/8196083/open.00/
>>>>
>>>> Testing:
>>>> Reproducer from JDK-8195979.
>>>> Mach5 {hs,jdk}-tier{1,2,3}
>>>>
>>>
>