RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"

David Holmes dholmes at openjdk.org
Tue Sep 26 01:12:23 UTC 2023


On Mon, 25 Sep 2023 20:52:28 GMT, Chris Plummer <cjplummer at openjdk.org> wrote:

>> With the new lightweight locking, when a thread T1 holds a lightweight lock on object O, and another thread T2 comes in and also wants to acquire that lock, it inflates the lock to a monitor. And since it cannot determine which threads currently locks O, it installs a special owner - which we call anonymous owner - in the ObjectMonitor, it basically means 'we don't know'. As soon as T1 unlocks O, it observes that the lock is now inflated and set to anonymous owner, installs itself as the proper owner of the ObjectMonitor and proceeds to unlock the ObjectMonitor - thereby handing over ownership to T2.
>> 
>> The specific race here is that SA sees an anonymously locked ObjectMonitor and tries to find the owning thread, and fails, presumably because that thread has moved on and unlocked the object in the meantime.
>> 
>> However, the race is more general, and pre-(lightweight-locking-)existing: if SA observes a locked object, or a thread which holds a particular lock, by the time it is reporting, this information may (likely) be wrong because the thread has, in the meantime, moved on. This in itself may not be a big problem. What may be worse is that, if SA prints stacks of all threads and also prints locking information, the overall information may be *inconsistent*. For example, it may report that two threads hold the same object.
>> Also, I am not sure how deep SA goes. Telling from the stack-trace in the original report, it looks like there is a DeadlockDetector involved. If threads don't hold still while we try to detect deadlocks, we may report false positives. (We would perhaps not report false negatives, because deadlocked thread by definition don't move ;-) )
>> 
>> I don't know enough about SA and how it deals with locking to judge how much of a problem all that is, but I think the pre-existing code (before LW locking) did not warn about locking raciness, and I don't really see that we should do this for the anonymous case.
>
>> The specific race here is that SA sees an anonymously locked ObjectMonitor and tries to find the owning thread, and fails, presumably because that thread has moved on and unlocked the object in the meantime.
> 
> But you said that when T1 starts the process of unlocking O, it sees the anonymous owner and it sets T1 as the owner before unlocking and handing off to T2. So I don't see how T1 could have "moved on". If T2 is blocked on O and O an anonymous owner, then T1 must still own it. 
> 
> SA always deals with a snapshot of the JVM state. Once SA attaches, no threads are running. So if O has an anonymous owner, you don't have to worry about the owner releasing the monitor while you are looking for the owner.
> 
> The question then becomes is there a short window while releasing the monitor that O still shows up as having an anonymous owner, but T1 has already released it. From your description of the order of things, this doesn't seem possible.

Surely jstack thread dump and deadlock check _has_ to run at a safepoint?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336519288


More information about the serviceability-dev mailing list