RFC: more robust handling of terminated but still attached threads
David Holmes
david.holmes at oracle.com
Tue Jul 3 09:21:56 UTC 2018
Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
We hit asserts or trigger SEGVs when we try to operate on a native
thread ID for a JNI-attached thread that has actually terminated but
which did not detach first. It still appears in the threadsList and we
try to process it during DumpOnExit (but there are probably other
operations that could run into this in the general case).
Fixing the tests is easy. But the more general question is how to make
the VM code more robust in the face of this situation.
At the lowest level we can watch for ESRCH from pthread_* functions and
try to program in alternate logic that gives some "result" for that thread.
At a higher level we may be able to heuristically guess that the native
thread has terminated and so skip it in ALL_JAVA_THREADS and similar
constructs. For example pthread_kill(t,0) can heuristically check if
't' is not alive as it may return ESRCH. But of course if t terminated
then it is entirely possible that the pthread_t value for it has been
reused. And if t is not going to detach we could be racing with its
termination anyway - so the heuristic may pass and we still hit a
low-level assert or SEGV.
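A minimal sketch of that heuristic, assuming a hypothetical helper name native_thread_maybe_alive (nothing like this is claimed to exist in the VM):

```c
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical helper: returns false only when pthread_kill reports ESRCH.
 * Signal 0 delivers nothing; it only asks for error checking.
 * A "true" result is a guess: the ID may have been reused, or the
 * thread may terminate right after this call returns. */
static bool native_thread_maybe_alive(pthread_t t) {
    int rc = pthread_kill(t, 0);
    return rc != ESRCH;
}
```

A caller iterating over the threads list could skip entries for which this returns false, but must still tolerate failures lower down for the reasons given above. Note also that using a pthread_t after its thread's lifetime has ended is not guaranteed to yield ESRCH at all.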
What do people think? Do we try to deal with this at the bottom, or at
the top, or all the way through? (There's obviously a diminishing return
on effort versus benefit here.)
Thanks,
David