RFR: 8291555: Implement alternative fast-locking scheme [v56]
David Holmes
dholmes at openjdk.org
Wed Apr 12 02:11:14 UTC 2023
On Tue, 11 Apr 2023 20:40:14 GMT, Dean Long <dlong at openjdk.org> wrote:
>> Yes. The whole LockStack is not meant to be accessed cross-thread, pretty much like any thread's stack is not meant to be accessed like that (including current stack-locking). So what can go wrong?
>> With the new locking, we could read junk and compare it to the oop that we're testing against and get a wrong result. We're not going to crash though.
>> With the current stack-locking, we would fetch the stack-pointer and check if that address is within the foreign thread's stack. Again, because the other thread is not holding still, we might get a wrong result, but we would not crash.
>> So I guess we need to answer the question whether or not jmm_GetThreadInfo() is ok with returning wrong result and what could be the consequences of this. For example, how important is it that the info about the thread(s) is correct and consistent (e.g. what happens if we report two threads both holding the same lock?), etc. But I don't consider this to be part of this PR.
>>
>> So my proposal is: leave that code as it is, for now (being racy when inspecting foreign threads, but don't crash). Open a new issue to investigate and possibly fix the problem (maybe by safepointing, maybe by handshaking if that is enough, or maybe we find out we don't need to do anything). Add comments in relevant places to point out the problem like you and David suggested earlier. Would that be ok?
>
> That seems fine to me, as long as we don't crash. But my understanding is that Generational ZGC will crash if it sees a stale oop. Isn't it possible that the racing read sees junk that looks to Generational ZGC like a stale oop? To avoid this, unused slots may need to be set to nullptr even in product builds. But I'm not a GC expert so maybe there's no problem.
The old code is "racy but safe - it basically answers the question "what thread held the lock at the time I was asking?" and if we get a stack-addr as the owner at the time we ask, and that stack-address belongs to a given thread t then we report t as the owner. The fact t may have released the lock as soon as we read the stack-addr is immaterial.
The new code may be a different matter however. Now the race involves oops, and potentially stale ones IIUC what Stefan is saying. So now the race is not safe, and potentially may crash.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1163491093
More information about the serviceability-dev
mailing list