[foreign-memaccess+abi] RFR: 8270851: Logic for attaching/detaching native threads could be improved

Fri Jul 23 05:47:25 UTC 2021

On Fri, 16 Jul 2021 15:59:18 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

> For a complete description of the issue, please refer to:
> 
> https://bugs.openjdk.java.net/browse/JDK-8270851
> 
> This patch makes the logic for detaching native threads more lazy - by only doing the detach when a native thread has completed. This is achieved by using some thread local storage, which is used to keep track of the Java thread associated to a given native thread. If, by the time the thread local storage is destroyed, we see that a Java thread has been attached, we do a detach operation. This trick effectively minimizes the number of Thread instances created when interacting with multi-threaded native code.
> 
> This patch also tweaks the logic for attaching native threads to the VM by using the "daemon" attach variant. That is, native threads registered against the VM (because of Panama upcalls) should not prevent the JVM to shut down in an orderly fashion (in cases where the native threads might outlive the JVM).

Hi Jorn, Maurizio,

Sorry I didn't see these updates due to the skara email outage.

It is hard to try and go through everything point by point! :) And I don't have definitive answers for everything, I'm just raising awareness of potential problems.

First, note that `thread_local` is not allowed as a C++ language feature in hotspot per the style guide on allowed C++ features, so that is the first battle that needs to be fought (email to hotspot-dev). It may be an easy battle, it may not; that depends on how much detail we have about how a language-based thread-local-storage (TLS) mechanism interacts with non-C++ threads and with the compiler and library TLS mechanisms. I can easily imagine that the language, compiler, and library versions of TLS all hook into the same "termination hook" in the platform thread management code, but it remains unknown in what order the different categories of "destructor" will run. The fact that gcc __thread doesn't allow non-trivial object initialization and destruction itself suggests there is some significant difference between __thread and C++ thread_local in that environment.
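
To make the `thread_local` question concrete, here is a minimal C++ sketch of the lazy-detach idea from the patch description. It is not the actual patch: `AttachRecord`, `simulated_upcall`, `run_native_thread` and the two counters are hypothetical names, and the counters merely stand in for real attach and detach operations. The sketch depends on a `thread_local` object with a non-trivial destructor, which is the feature gcc's `__thread` does not permit.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical counters standing in for real attach/detach work.
static std::atomic<int> attach_count{0};
static std::atomic<int> detach_count{0};

// Hypothetical per-thread record; this is not HotSpot code.
struct AttachRecord {
  bool attached = false;
  // Non-trivial destructor: runs when the owning thread exits.
  ~AttachRecord() {
    if (attached) {
      detach_count++;  // stands in for a real detach
    }
  }
};

// C++11 thread_local allows an object with a non-trivial destructor.
static thread_local AttachRecord record;

// Stands in for an upcall entry point: attach on first use only.
static void simulated_upcall() {
  if (!record.attached) {
    attach_count++;  // stands in for a real attach
    record.attached = true;
  }
}

// Runs `upcalls` simulated upcalls on a fresh thread and waits for it to exit.
static void run_native_thread(int upcalls) {
  std::thread t([upcalls] {
    for (int i = 0; i < upcalls; i++) {
      simulated_upcall();
    }
  });
  t.join();
}
```

Calling `run_native_thread(5)` should leave both counters at 1, since the attach happens on the first upcall and the detach only when the thread exits. What the sketch cannot show is the ordering concern raised above: how this destructor is sequenced relative to compiler and library TLS cleanup on a given platform.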

TLS is a very complex area and it is very easy for different uses of thread-locals to interact in bad ways, given that the VM itself uses thread-locals for dealing with Thread::current() (and normally uses two mechanisms, compiler TLS and library TLS, because only the library TLS is considered signal-safe, but compiler TLS is (usually) much faster).

> Have we perhaps overlooked another mechanism for automatically detaching a thread on termination that already exists in HotSpot?

No, there is no such mechanism. It is in fact quite problematic. We had to introduce a hack in the VM code (threadLocalStorage_posix.cpp) to allow a pthread TLS key destructor to detach a terminating thread - with the side effect that if a thread fails to detach then it may hang on termination. :(
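
For readers unfamiliar with pthread TLS key destructors, the following is a minimal sketch of the general mechanism only. It is not the code in threadLocalStorage_posix.cpp; `detach_key`, `on_thread_exit`, `thread_body` and `run_thread` are hypothetical names, and the destructor just bumps a counter where a real one would attempt the detach.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>

static pthread_key_t detach_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;
static std::atomic<int> destructor_runs{0};

// Called by the pthread library at thread exit, but only if the exiting
// thread's value for the key is non-null.
static void on_thread_exit(void* value) {
  (void)value;
  destructor_runs++;  // a real destructor would attempt the detach here
}

// Creates the key once per process and registers the destructor.
static void make_key() {
  int rc = pthread_key_create(&detach_key, on_thread_exit);
  assert(rc == 0);
  (void)rc;
}

// Thread body: a non-null argument marks the thread as "attached".
static void* thread_body(void* arg) {
  pthread_once(&key_once, make_key);
  if (arg != nullptr) {
    pthread_setspecific(detach_key, arg);
  }
  return nullptr;
}

// Runs one thread to completion and returns how many destructor calls it caused.
static int run_thread(bool mark_attached) {
  static int marker = 0;  // any non-null address will do
  int before = destructor_runs.load();
  pthread_t t;
  int rc = pthread_create(&t, nullptr, thread_body,
                          mark_attached ? &marker : nullptr);
  assert(rc == 0);
  (void)rc;
  pthread_join(t, nullptr);
  return destructor_runs.load() - before;
}
```

With this sketch, `run_thread(true)` should report one destructor call and `run_thread(false)` none. On some toolchains it needs to be built with `-pthread`.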

If you look at what detaching a thread does, it is non-trivial, with further calls to Java being made; and detaching depends on both the state of the thread (any Java frames still on the stack?) and the state of the VM (see next point).

The lifecycle management aspect of this is complex. The VM is load-once and never truly unloads within a process, but is not reusable once it has been "terminated". Also note that if the VM has terminated before a daemon thread tries to detach, then the terminating thread will just hang in VMExit::block_if_vm_exited() - which is unlikely to be what the application wants. You can't just be a casual user of the JVM this way, I'm afraid. The application really needs to understand how it uses the JVM and how/when it will "terminate" it. Daemon threads have a false allure to them - in practice, because they can stop at any point in their execution when the VM terminates, they really have to be doing simple non-essential tasks.

> Looking at DetachCurrentThread, this seems problematic? Won't there be a memory leak ...

As long as there is one successful detach there is no leak, but otherwise yes, it will leak the JavaThread and associated resources. But things could get worse, because if a thread terminates without detaching and is visible to some of the other APIs (like monitoring and management, or JFR), then we may try to interact with that terminated thread.

I don't recall what the panama programming model is in regard to application use of native threads and the use of upcalls, but perhaps for the case at hand when threads are making many upcalls and the attach/detach overhead is significant, then the application should be responsible for attaching and later detaching these threads, with the upcall mechanism checking for an already attached thread?
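
A rough model of that alternative, as a sketch only: the application attaches its thread once, and the upcall path attaches and detaches only when it finds the thread unattached. All names here (`attach_current_thread`, `detach_current_thread`, `upcall_with_check`, `run_app_thread`) are hypothetical stand-ins and no JVM is involved; in real JNI code the "already attached?" check would presumably be done with `GetEnv`, which reports `JNI_EDETACHED` for an unattached thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

static std::atomic<int> attach_count{0};
static std::atomic<int> detach_count{0};

// Hypothetical stand-in for "is this native thread attached to the VM?".
static thread_local bool attached = false;

static void attach_current_thread() { attached = true; attach_count++; }
static void detach_current_thread() { attached = false; detach_count++; }

// Upcall entry: reuse an existing attachment, otherwise attach for this call only.
static void upcall_with_check() {
  const bool attached_here = !attached;
  if (attached_here) attach_current_thread();
  // ... the Java target would be invoked here ...
  if (attached_here) detach_current_thread();
}

// Runs `upcalls` upcalls on a new thread and returns the number of attach
// operations that thread performed.
static int run_app_thread(int upcalls, bool app_manages_attach) {
  int before = attach_count.load();
  std::thread t([=] {
    if (app_manages_attach) attach_current_thread();
    for (int i = 0; i < upcalls; i++) {
      upcall_with_check();
    }
    if (app_manages_attach) detach_current_thread();
  });
  t.join();
  return attach_count.load() - before;
}
```

In this model `run_app_thread(100, true)` performs a single attach, while `run_app_thread(100, false)` performs one per upcall. Every attach is paired with a detach on the same thread, so nothing is left to a thread-exit hook.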

Anyway ... if you still want to pursue this approach then as I said thread_local needs to be approved for use in hotspot (probably fine to use it in panama repo in the meantime).

Cheers,
David

-------------

PR: https://git.openjdk.java.net/panama-foreign/pull/570