RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
David Holmes
dholmes at openjdk.org
Tue Sep 26 02:17:10 UTC 2023
On Tue, 26 Sep 2023 01:41:38 GMT, Chris Plummer <cjplummer at openjdk.org> wrote:
>> Correction to above:
>>
>> threads = VM.getVM().getThreads();
>> heap = VM.getVM().getObjectHeap();
>> createThreadTable(); // calls getThreads() again
>>
>> The VM caches the set of threads, i.e. the snapshot, so three sets are not possible. But AFAICS the thread snapshot and heap snapshot are not atomic, so the set of threads could have changed, and the state of the threads also.
>
>> Surely jstack thread dump and deadlock check _has_ to run at a safepoint?
>
> The reality is that the JVM is rarely at a safepoint (unless perhaps when all threads are blocked), and therefore jstack is rarely done at a safepoint. This is the world SA lives in. The understanding is that SA debugging features may give inaccurate info, or possibly not work at all.
>
>> If the SA is working from a snapshot then it has to create that snapshot atomically. It can't snapshot the threads, then snapshot the heap.
>
> It doesn't snapshot. It does a debugger attach, which suspends the process. It then starts to read in pages from the process to do things like jstack. The process state does not change while it does this. It's not really any different than gdb in this regard. gdb does not let the process state change unless you use a command that allows execution, such as "step" or "continue". As long as you avoid the commands that allow process execution, you can debug without worrying about the process state changing. However, even gdb debugging has the same issues with JVM safepointing (or lack thereof). If the JVM crashes and you start looking at certain data, it might be inconsistent. There's no way to force a safepoint once there is a crash.
>
>> @plummercj the SA code sees T2 is pending on the monitor for object O, which is locked anonymously but actually by T1. The SA code then goes hunting for the owner. But the VM is not standing still...
>
> The VM is standing still. There is no process execution while all of this happens. jstack and the deadlock detection are fully executed while the JVM process is halted. There is no JVM state change during any of this.
Ah! I guess we get used to talking about "at a safepoint" when we really mean "at a fixed point in time". So the VM is not necessarily at a safepoint, but everything is fixed. So invariants may not hold, but the state cannot change. So in the current context the anonymous owner should be found ... I guess the question to be answered is how the code tries to find an anonymous owner? I'm not sure how you can find it.
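To make the question concrete, here is a minimal sketch of one way such a search could work. It assumes that each thread keeps a per-thread list of the objects it currently has locked; that assumption is mine and is not confirmed anywhere in this thread. `ThreadView`, `holdsLockOn` and `findAnonymousOwner` are hypothetical names for illustration only, not the SA's actual classes or methods.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical model types for illustration; these are not the SA's real classes.
final class AnonymousOwnerSketch {

    /** A frozen view of one thread: its name and the objects it has locked (assumed). */
    record ThreadView(String name, List<Object> lockStack) {
        boolean holdsLockOn(Object obj) {
            // Identity comparison: we want the same object, not an equal one.
            for (Object o : lockStack) {
                if (o == obj) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Scans the fixed set of threads for one whose lock stack contains obj. */
    static Optional<ThreadView> findAnonymousOwner(List<ThreadView> threads, Object obj) {
        for (ThreadView t : threads) {
            if (t.holdsLockOn(obj)) {
                return Optional.of(t);
            }
        }
        // No owner found: the state is fixed, but it may not satisfy the invariant.
        return Optional.empty();
    }

    public static void main(String[] args) {
        Object o = new Object();
        ThreadView t1 = new ThreadView("T1", List.of(o)); // T1 actually holds O
        ThreadView t2 = new ThreadView("T2", List.of());  // T2 is pending on O's monitor
        String owner = findAnonymousOwner(List.of(t1, t2), o)
                .map(ThreadView::name)
                .orElse("<none>");
        System.out.println(owner);
    }
}
```

Under that assumption, an empty result on a halted process would not mean the state moved during the scan. It would mean the process was stopped at a point where the "some thread owns this lock" invariant does not hold in the data being read.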
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336549922