RFR: 8291555: Implement alternative fast-locking scheme [v56]
Roman Kennke
rkennke at openjdk.org
Tue Apr 11 20:00:23 UTC 2023
On Tue, 11 Apr 2023 19:35:36 GMT, Dean Long <dlong at openjdk.org> wrote:
>> Given that the race with new lightweight locking is virtually the same as the race
>> with legacy stack locking, please do not put back the 'LockingMode == 2' check
>> which would make `jmm_GetThreadInfo()` calls slower with new lightweight locking
>> than with legacy stack locking.
>>
>> Perhaps I'm not understanding the risk of what @stefank means with:
>>
>> It looks to me like the code could read racingly read the element just above _top,
>> which could contain a stale oop. If the address of the stale oop matches the
>> address of o then contains would incorrectly return true.
>
> The `_base` array is only initialized to nullptr in debug builds. I don't see a release barrier in LockStack::push between the update to _base[] and the update to _top, nor a corresponding acquire barrier when reading. Doesn't this mean it is possible to racily read an uninitialized junk oop value from _base[], especially on weak memory models?
Yes. The whole LockStack is not meant to be accessed cross-thread, pretty much like any thread's stack is not meant to be accessed like that (including current stack-locking). So what can go wrong?
With the new locking, we could read junk and compare it to the oop that we're testing against and get a wrong result. We're not going to crash though.
With the current stack-locking, we would fetch the stack-pointer and check if that address is within the foreign thread's stack. Again, because the other thread is not holding still, we might get a wrong result, but we would not crash.
So I guess we need to answer the question whether or not jmm_GetThreadInfo() is ok with returning wrong result and what could be the consequences of this. For example, how important is it that the info about the thread(s) is correct and consistent (e.g. what happens if we report two threads both holding the same lock?), etc. But I don't consider this to be part of this PR.
So my proposal is: leave that code as it is, for now (being racy when inspecting foreign threads, but don't crash). Open a new issue to investigate and possibly fix the problem (maybe by safepointing, maybe by handshaking if that is enough, or maybe we find out we don't need to do anything). Add comments in relevant places to point out the problem like you and David suggested earlier. Would that be ok?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1163263926
More information about the serviceability-dev
mailing list