RFC: more robust handling of terminated but still attached threads
David Holmes
david.holmes at oracle.com
Tue Jul 3 12:09:26 UTC 2018
On 3/07/2018 9:28 PM, Florian Weimer wrote:
> On 07/03/2018 11:21 AM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>
>> We hit asserts or trigger SEGVs when we try to operate on a native
>> thread ID for a JNI-attached thread that has actually terminated but
>> which did not detach first. It still appears in the threadsList and we
>> try to process it during DumpOnExit (but there are probably other
>> operations that could run into this in the general case).
>
> This bug is not public.
Sorry, I'll try to get that changed. It's an issue with some of the newly
opened tests in:
vmTestbase/nsk/jvmti/scenarios/jni_interception/
when run with FlightRecorder set to use DumpOnExit. I will of course fix
the tests.
> The use case isn't entirely clear to me. If you are sufficiently
> unlucky, the memory behind a pthread_t value is simply gone after thread
> exit (and potentially TCB/thread stack reclamation in the thread
> library). On glibc, this includes the internal TID, which is required
> for pthread_kill (thr, 0) actually sending the signal.
IIUC pthread_kill(thr,0) never sends any signal, but may look up the id
to see if it is valid. I understand there's no guarantee and that there
is an inherent race regardless.
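
To make the pthread_kill(thr, 0) idea concrete, here is a minimal sketch of such a probe. The helper name thread_maybe_alive is hypothetical and not taken from HotSpot. The result is only a hint: it can be stale as soon as it is returned, and POSIX leaves the behaviour undefined if the thread ID's lifetime has already ended.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical helper: best-effort liveness hint for a thread ID.
 * Signal number 0 asks for error checking only; nothing is delivered.
 * Assumption: thr still refers to memory the thread library owns.
 * If the ID's lifetime has ended, POSIX gives no guarantees at all. */
static bool thread_maybe_alive(pthread_t thr)
{
    int rc = pthread_kill(thr, 0);
    return rc == 0;
}
```

A caller would treat a false result as "skip this thread" and a true result as "probably still there", never as proof.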
> I'm not familiar with the Hotspot run-time and why it needs to do this.
> Can you deregister the thread from a thread directory once it exits
> (using one of the TLS variants with a destructor)? Or is the concern
> there that the destructor would not run late enough?
The issue is native process threads that attach to the VM through JNI
but then don't detach themselves before terminating. While it may be
possible to create such a mechanism as you describe, it goes way beyond
what I'm trying to do here and violates a basic principle that we try to
interfere as little as possible with threads that attach to the VM
directly (rather than being created by the VM). There was also a rather
complex bug involving native threads that themselves provided such a TLS
destructor (to detach themselves) and the VM's own (fairly recent) use of
TLS.
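
For reference, the kind of TLS-destructor mechanism being discussed could look roughly like the sketch below. This is not what the VM does; attach_key, attached_threads and the function names are all hypothetical, and a plain counter stands in for a real thread list. It only illustrates how pthread_key_create lets an exiting thread deregister itself even when it never detaches explicitly.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static pthread_key_t  attach_key;                      /* hypothetical key */
static pthread_once_t attach_key_once = PTHREAD_ONCE_INIT;
static atomic_int     attached_threads;                /* stand-in for a thread list */

/* Runs on the exiting thread, but only if its key value is non-NULL. */
static void on_thread_exit(void *value)
{
    (void)value;
    atomic_fetch_sub(&attached_threads, 1);
}

static void make_attach_key(void)
{
    pthread_key_create(&attach_key, on_thread_exit);
}

/* Called by a native thread when it "attaches". */
static void register_current_thread(void)
{
    pthread_once(&attach_key_once, make_attach_key);
    /* Any non-NULL value arms the destructor for this thread. */
    pthread_setspecific(attach_key, &attached_threads);
    atomic_fetch_add(&attached_threads, 1);
}

/* Thread body that attaches and then exits without detaching. */
static void *forgetful_thread(void *arg)
{
    (void)arg;
    register_current_thread();
    return NULL;
}
```

The ordering of destructors across different keys is unspecified, which is one reason this approach can interact badly with native code that installs its own TLS destructors.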
All I'm looking at is some basic robustness if the VM encounters such a
thread (for which all the VM data structures remain intact - and
effectively leak) so that we don't assert or crash when we do invoke a
pthread function (pthread_getcpuclockid is the one in question in the
bug report).
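
As a sketch of the sort of defensive handling meant here (not the actual patch; thread_cpu_time is a hypothetical name), the caller checks the return value of pthread_getcpuclockid and backs off instead of asserting. This assumes the pthread_t still points at memory the thread library can inspect; if that memory has been reclaimed, no return-code check can help.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: read a thread's CPU time without asserting.
 * Returns false if either call reports an error, so the caller can
 * simply skip the thread. */
static bool thread_cpu_time(pthread_t thr, struct timespec *out)
{
    clockid_t cid;
    if (pthread_getcpuclockid(thr, &cid) != 0) {
        /* For example ESRCH when the library no longer knows the thread. */
        return false;
    }
    return clock_gettime(cid, out) == 0;
}
```

Whether a terminated but unjoined thread yields an error or a stale clock ID depends on the C library, so a false result is the best case rather than a guarantee.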
It may be that it isn't really worth trying to do this given it can't be
100% reliable anyway.
Thanks,
David
> Thanks,
> Florian