RFR: 8291555: Implement alternative fast-locking scheme [v56]

Stefan Karlsson stefank at openjdk.org
Wed Apr 12 05:29:21 UTC 2023


On Wed, 12 Apr 2023 02:08:08 GMT, David Holmes <dholmes at openjdk.org> wrote:

>> That seems fine to me, as long as we don't crash.  But my understanding is that Generational ZGC will crash if it sees a stale oop.  Isn't it possible that the racing read sees junk that looks to Generational ZGC like a stale oop?  To avoid this, unused slots may need to be set to nullptr even in product builds.  But I'm not a GC expert so maybe there's no problem.
>
> The old code is "racy but safe - it basically answers the question "what thread held the lock at the time I was asking?" and if we get a stack-addr as the owner at the time we ask, and that stack-address belongs to a given thread t then we report t as the owner. The fact t may have released the lock as soon as we read the stack-addr is immaterial.
> 
> The new code may be a different matter however. Now the race involves oops, and potentially stale ones IIUC what Stefan is saying. So now the race is not safe, and potentially may crash.

> That seems fine to me, as long as we don't crash. But my understanding is that Generational ZGC will crash if it sees a stale oop. Isn't it possible that the racing read sees junk that looks to Generational ZGC like a stale oop? To avoid this, unused slots may need to be set to nullptr even in product builds. But I'm not a GC expert so maybe there's no problem.

Generational ZGC has verification code in fastdebug builds that try to detect stale oops. However, the current LockStack implementation seems to always clear unused slots when running in debug builds. That minimizes the risk that the verification code would find stale oops in the LockStack.

Regarding release build, given that the LockStack code doesn't dereference any of the contained oops and we don't have oop verification code in release builds, I don't see of ZGC would crash because of this race. Note however that these kind of races are technically undefined behavior, so I wouldn't be too confident that this code is safe.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1163627980


More information about the serviceability-dev mailing list