RFR: 8291555: Implement alternative fast-locking scheme [v56]

Dean Long dlong at openjdk.org
Tue Apr 11 20:42:16 UTC 2023


On Tue, 11 Apr 2023 19:58:19 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> The `_base` array is only initialized to nullptr in debug builds.  I don't see a release barrier in LockStack::push between the update to _base[] and the update to _top, nor a corresponding acquire barrier when reading.  Doesn't this mean it is possible to racily read an uninitialized junk oop value from _base[], especially on weak memory models?
>
> Yes. The whole LockStack is not meant to be accessed cross-thread, pretty much like any thread's stack is not meant to be accessed like that (including current stack-locking). So what can go wrong?
> With the new locking, we could read junk and compare it to the oop that we're testing against and get a wrong result. We're not going to crash though.
> With the current stack-locking, we would fetch the stack-pointer and check if that address is within the foreign thread's stack. Again, because the other thread is not holding still, we might get a wrong result, but we would not crash.
> So I guess we need to answer the question whether or not jmm_GetThreadInfo() is ok with returning wrong result and what could be the consequences of this. For example, how important is it that the info about the thread(s) is correct and consistent (e.g. what happens if we report two threads both holding the same lock?), etc. But I don't consider this to be part of this PR.
> 
> So my proposal is: leave that code as it is, for now (being racy when inspecting foreign threads, but don't crash). Open a new issue to investigate and possibly fix the problem (maybe by safepointing, maybe by handshaking if that is enough, or maybe we find out we don't need to do anything). Add comments in relevant places to point out the problem like you and David suggested earlier. Would that be ok?

That seems fine to me, as long as we don't crash.  But my understanding is that Generational ZGC will crash if it sees a stale oop.  Isn't it possible that the racing read sees junk that looks to Generational ZGC like a stale oop?  To avoid this, unused slots may need to be set to nullptr even in product builds.  But I'm not a GC expert so maybe there's no problem.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1163306288


More information about the serviceability-dev mailing list