RFR: 8253429: Error reporting should report correct state of terminated/aborted threads
Daniel D. Daugherty
dcubed at openjdk.java.net
Wed Sep 30 17:03:10 UTC 2020
On Tue, 29 Sep 2020 15:17:46 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> For some non-JavaThreads, the object instance can outlive the thread itself. For example, we can still query/report
>> a thread's state after the thread has terminated.
>> But the query/report currently returns the wrong state, e.g. a terminated thread appears to be alive and seemingly has a
>> valid thread stack, etc.
>> This patch sets a non-JavaThread's state to ZOMBIE just before it terminates, so that we can distinguish a terminated
>> thread from a live thread.
>> Also, a thread should not report its SMR info if it has terminated or never started (thread->osthread() == NULL).
>>
>> Note: Java threads do not have this issue; the thread object is deleted before the thread terminates.
>
> Hi Zhengyu,
>
> I'm updating my review after reading through your conversation with David. Save for small nits this seems fine.
>
> Cheers, Thomas
I think we're approaching this problem incorrectly. David mentioned this in
the bug report:
> As the reporting is done by the thread closure of the target subsystem
> this is not a runtime issue in this case but a GC issue.
To me, the first part of that sentence is the important part. It is indeed a
thread closure that causes us to reach the terminated thread. It is also a
thread closure that is used by Thread-SMR to determine when a thread's
ThreadsList protects JavaThreads.
In particular:
    src/hotspot/share/runtime/threadSMR.cpp:
    bool ThreadsSMRSupport::is_a_protected_JavaThread(JavaThread *thread) {
uses a ScanHazardPtrGatherProtectedThreadsClosure passed to
ThreadsSMRSupport::threads_do() to gather all the protected
JavaThread*.
This threads_do() function applies the closure to all threads in the
system: JavaThreads on 'list' and all the non-JavaThreads:
    src/hotspot/share/runtime/threadSMR.cpp:
    void ThreadsSMRSupport::threads_do(ThreadClosure *tc, ThreadsList *list) {
      list->threads_do(tc);
      Threads::non_java_threads_do(tc);
    }
So if a particular non-JavaThread is still found via Threads::non_java_threads_do(),
then any ThreadsList that it holds protects JavaThread*'s even if that non-JavaThread
has terminated. That means that calling ThreadsSMRSupport::print_info_on() is a
valid thing to do because the non-JavaThread is still participating in Thread-SMR
related decisions.
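To make that reasoning concrete, here is a minimal toy model of the traversal. It is not HotSpot code: ToyThread, ToyClosure, GatherProtected, toy_threads_do() and toy_is_protected() are all hypothetical names made up for illustration, and the 'hazard' vector is only a stand-in for whatever ThreadsList a thread holds. The sketch assumes the closure does not look at the terminated flag, which is the situation described above.

```cpp
#include <set>
#include <vector>

// Toy model only; every type and function here is hypothetical.
struct ToyThread {
    bool is_java_thread;
    bool terminated;                       // stand-in for a ZOMBIE marker
    std::vector<const ToyThread*> hazard;  // threads this thread's list protects
};

struct ToyClosure {
    virtual void do_thread(const ToyThread* t) = 0;
    virtual ~ToyClosure() = default;
};

// Gathers every thread that appears in a visited thread's hazard list.
struct GatherProtected : ToyClosure {
    std::set<const ToyThread*> protected_threads;
    void do_thread(const ToyThread* t) override {
        // Deliberately ignores t->terminated: as long as the thread is
        // still visited, its list still counts as protecting its entries.
        protected_threads.insert(t->hazard.begin(), t->hazard.end());
    }
};

// Same shape as the threads_do() quoted above: the Java thread list
// first, then every non-Java thread that is still registered.
void toy_threads_do(ToyClosure& tc,
                    const std::vector<ToyThread*>& java_list,
                    const std::vector<ToyThread*>& non_java_list) {
    for (const ToyThread* t : java_list) tc.do_thread(t);
    for (const ToyThread* t : non_java_list) tc.do_thread(t);
}

bool toy_is_protected(const ToyThread* target,
                      const std::vector<ToyThread*>& java_list,
                      const std::vector<ToyThread*>& non_java_list) {
    GatherProtected gather;
    toy_threads_do(gather, java_list, non_java_list);
    return gather.protected_threads.count(target) != 0;
}
```

In this model, a terminated non-Java thread that is still on non_java_list keeps its entries protected; only removing it from that list changes the answer, which is why the state marker alone does not settle the question.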
I have no problem with the part where we set the ZOMBIE state as a marker
for a terminated non-JavaThread, but we need to determine why that
terminated thread is still being found by Threads::non_java_threads_do()
and whether it is safe to remove that non-JavaThread from whatever list
is holding it.
-------------
PR: https://git.openjdk.java.net/jdk/pull/341