RFC: more robust handling of terminated but still attached threads

David Holmes david.holmes at oracle.com
Tue Jul 3 09:21:56 UTC 2018


Bug: https://bugs.openjdk.java.net/browse/JDK-8205878

We hit asserts or trigger SEGVs when we try to operate on a native 
thread ID for a JNI-attached thread that has actually terminated but 
which did not detach first. It still appears in the threadsList and we 
try to process it during DumpOnExit (but there are probably other 
operations that could run into this in the general case).

Fixing the tests is easy. But the more general question is how to make 
the VM code more robust in the face of this situation.

At the lowest level we can watch for ESRCH from pthread_* functions and 
add fallback logic that produces some "result" for that thread.
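
As a minimal sketch of that lowest-level approach (not HotSpot code; the
function name thread_cpu_time_ns and the -1 fallback value are hypothetical
choices for illustration), a wrapper around pthread_getcpuclockid() could
turn ESRCH into a "no result" answer instead of an assert. Note that POSIX
leaves the use of a thread ID whose lifetime has ended undefined, so ESRCH
is only returned where the implementation happens to detect the stale ID.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical fallback "result" reported when the thread cannot be queried. */
#define CPU_TIME_UNAVAILABLE ((int64_t)-1)

/*
 * Query the CPU time of native thread 't' in nanoseconds.
 * Returns true and stores the time in *out_ns on success.
 * Returns false and stores CPU_TIME_UNAVAILABLE if the call fails,
 * including the ESRCH case where the library reports no such thread.
 */
bool thread_cpu_time_ns(pthread_t t, int64_t *out_ns) {
    assert(out_ns != NULL);

    clockid_t clock_id;
    int rc = pthread_getcpuclockid(t, &clock_id);
    if (rc != 0) {
        /* rc == ESRCH: the thread is gone. Other errors are treated the
         * same way here; real code might want to distinguish them. */
        *out_ns = CPU_TIME_UNAVAILABLE;
        return false;
    }

    struct timespec ts;
    if (clock_gettime(clock_id, &ts) != 0) {
        /* The thread may have exited between the two calls. */
        *out_ns = CPU_TIME_UNAVAILABLE;
        return false;
    }

    *out_ns = (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
    return true;
}
```

A caller would check the return value and skip or report the thread as
"unavailable" rather than asserting that the query succeeded.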

At a higher level we may be able to heuristically guess that the native 
thread has terminated and so skip it in ALL_JAVA_THREADS and similar 
constructs. For example, pthread_kill(t, 0) can heuristically check if 
't' is not alive, as it may return ESRCH. But of course if t terminated 
then it is entirely possible that the pthread_t value for it has been 
reused. And if t is not going to detach we could be racing with its 
termination anyway - so the heuristic may pass and we still hit a 
low-level assert or SEGV.
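
To make the heuristic concrete, here is a rough sketch in plain C (the
names native_thread_maybe_alive and count_maybe_alive are hypothetical, and
the array stands in for whatever the VM would iterate over). It has exactly
the weaknesses described above: a reused pthread_t makes it answer "alive"
for the wrong thread, the thread can terminate right after the check, and
POSIX makes no promise that a stale ID yields ESRCH rather than a crash.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Heuristic liveness probe: signal 0 delivers nothing and only performs
 * error checking. Returns false only when the library reports ESRCH.
 * A true result is a guess, not a guarantee.
 */
bool native_thread_maybe_alive(pthread_t t) {
    return pthread_kill(t, 0) != ESRCH;
}

/*
 * Count the threads in 'threads' that pass the probe, i.e. the ones an
 * iteration would still visit if it skipped apparently dead threads.
 */
size_t count_maybe_alive(const pthread_t *threads, size_t n) {
    assert(threads != NULL || n == 0);

    size_t alive = 0;
    for (size_t i = 0; i < n; i++) {
        if (native_thread_maybe_alive(threads[i])) {
            alive++;
        }
    }
    return alive;
}
```

Any code that runs after the probe still needs to tolerate failure from
the underlying pthread_* call, since the probe cannot close the race.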

What do people think? Do we try to deal with this at the bottom, or at 
the top, or all the way through? (There's obviously a diminishing return 
on effort versus benefit here.)

Thanks,
David


More information about the hotspot-runtime-dev mailing list