RFR: 8238761: Asynchronous handshakes [v5]

David Holmes dholmes at openjdk.java.net
Wed Sep 23 04:13:04 UTC 2020


On Tue, 22 Sep 2020 07:54:57 GMT, Robbin Ehn <rehn at openjdk.org> wrote:

>> This patch implements asynchronous handshake, which changes how handshakes works by default. Asynchronous handshakes
>> are target only executed, which they may never be executed. (target may block on socket for the rest of VM lifetime)
>> Since we have several use-cases for them we can have many handshake pending. (should be very rare) To be able handle an
>> arbitrary amount of handshakes this patch adds a per JavaThread queue and heap allocated HandshakeOperations. It's a
>> singly linked list where you push/insert to the end and pop/get from the front. Inserts are done via CAS on first
>> pointer, no lock needed. Pops are done while holding the per handshake state lock, and when working on the first
>> pointer also CAS.  The thread grabbing the handshake state lock for a JavaThread will pop and execute all handshake
>> operations matching the filter. The JavaThread itself uses no filter and any other thread uses the filter of everything
>> except asynchronous handshakes. In this initial change-set there is no need to do any other filtering. If needed
>> filtering can easily be exposed as a virtual method on the HandshakeClosure, but note that filtering causes handshake
>> operation to be done out-order. Since the filter determins who execute the operation and not the invoked method, there
>> is now only one method to call when handshaking one thread.  Some comments about the changes:
>> - HandshakeClosure uses ThreadClosure, since it neat to use the same closure for both alla JavThreads do and Handshake
>>   all threads. With heap allocating it cannot extends StackObj. I tested several ways to fix this, but those very much
>>   worse then this.
>> 
>> - I added a is_handshake_safe_for for checking if it's current thread is operating on itself or the handshaker of that
>>   thread.
>> 
>> - Simplified JVM TI with a JvmtiHandshakeClosure and also made them not needing a JavaThread when executing as a
>>   handshaker on a JavaThread, e.g. VM Thread can execute the handshake operation.
>> 
>> - Added WB testing method.
>> 
>> - Removed VM_HandshakeOneThread, the VM thread uses the same call path as direct handshakes did.
>> 
>> - Changed the handshake semaphores to mutex to be able to handle deadlocks with lock ranking.
>> 
>> - VM_HandshakeAllThreadsis still a VM operation, since we do support half of the threads being handshaked before a
>>   safepoint and half of them after, in many handshake all operations.
>> 
>> - ThreadInVMForHandshake do not need to do a fenced transistion since this is always a transistion from unsafe to unsafe.
>> 
>> - Added NoSafepointVerifyer, we are thinking about supporting safepoints inside handshake, but it's not needed at the
>>   moment. To make sure that gets well tested if added the NoSafepointVerifyer will raise eyebrows.
>> 
>> - Added ttyLocker::break_tty_lock_for_safepoint(os::current_thread_id()); due to lock rank.
>> 
>> - Added filtered queue and gtest for it.
>> 
>> Passes multiple t1-8 runs.
>> Been through some pre-reviwing.
>
> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update after Coleen

No significant comments. All my concerns relate to naming and terminology, where I think there is scope for quite a bit
of tidying up. Thanks.

src/hotspot/share/runtime/handshake.hpp line 45:

> 43: // a single target/direct handshake or not, by the JavaThread that requested the
> 44: // handshake or the VMThread respectively.
> 45: class HandshakeClosure : public ThreadClosure, public CHeapObj<mtThread> {

Is use of multiple inheritance allowed within hotspot code?

src/hotspot/share/runtime/handshake.hpp line 51:

> 49:   virtual ~HandshakeClosure() {}
> 50:   const char* name() const    { return _name; }
> 51:   virtual bool is_asynch()    { return false; };

I thought "asynch" has already been renamed to drop the 'h' everywhere?

src/hotspot/share/runtime/handshake.hpp line 78:

> 76:   FilterQueue<HandshakeOperation*> _queue;
> 77:   Mutex   _lock;
> 78:   Thread* _active_handshaker;

To be clear, the _handshakee is the always the target JavaThread, while the _active_handshaker, is the thread that is
actually executing the handshake operation (ie do_thread). If so can you add comments on these declarations to clarify
that. Thanks.

src/hotspot/share/runtime/handshake.hpp line 90:

> 88:   void add_operation(HandshakeOperation* op);
> 89:   HandshakeOperation* pop_for_self();
> 90:   HandshakeOperation* pop_for_processor();

What is "processor" in this context - the active handshaker? Can we not introduce yet another piece of terminology
here. We should have consistency of naming when it comes to "self" and others. ie. we have

pop_for_self()

but

has_operation()

rather than

has_operation_for_self()

If we made the "self" case explicit then we could leave the not-self case implicit e.g.

pop_for_self();  // Called by handshakee only
pop();  // Called by handshaker or VMThread
has_operation_for_self(); // Is there an operation that can be executed by the handshakee itself
has_operation();           // Is there an operation that can be executed by the handshaker or VMThread
We can then stop using "processor" in other places as well.

src/hotspot/share/runtime/handshake.hpp line 96:

> 94:     return !_queue.is_empty();
> 95:   }
> 96:   bool block_for_operation() {

should_block_for_operation() ? Though looking at the loop that uses this the name doesn't seem right as we are not
blocking but processing the operation. ??

src/hotspot/share/runtime/handshake.hpp line 97:

> 95:   }
> 96:   bool block_for_operation() {
> 97:     return !_queue.is_empty() || _lock.is_locked();

I really don't understand the is_locked() check in this condition. ??
And the check for !empty is racy, so how do we avoid missing an in-progress addition?

src/hotspot/share/runtime/handshake.cpp line 44:

> 42:  protected:
> 43:   HandshakeClosure*   _handshake_cl;
> 44:   int32_t             _pending_threads;

Not new but the meaning of _pending_threads is unclear - please add a descriptive comment.

src/hotspot/share/runtime/handshake.cpp line 63:

> 61: };
> 62:
> 63: class AsyncHandshakeOperation : public HandshakeOperation {

This doesn't quite make sense. If you have an AsyncHandshakeOperation as a distinct subclass then it should not be
possible for is_async() on a HandshakeOperation to return true - but it can because it can be passed an
AsyncHandshakeClosure when constructed. If you want async and non-async operations to be distinct types then you will
need to restrict how the base class is constructed, and provide a protected constructor that just takes an
AsyncHandShakeClosure.

src/hotspot/share/runtime/handshake.cpp line 195:

> 193: }
> 194:
> 195: static void log_handshake_info(jlong start_time_ns, const char* name, int targets, int requester_executed, const
> char* extra = NULL) {

It is not clear what "requester_executed" actually means here - why is this an int? what does it represent?
Again we have new terminology "requester" - is that the handshakee or ???

src/hotspot/share/runtime/handshake.cpp line 244:

> 242:         // A new thread on the ThreadsList will not have an operation,
> 243:         // hence it is skipped in handshake_try_process.
> 244:         HandshakeState::ProcessResult pr = thr->handshake_state()->try_process(_op);

To be clear on what can be happening here ... as the VMThread has to loop through all threads first to initiate the
handshake, by the time it starts trying to process the op, the target threads may have already done it themselves.
Additionally while looping through all threads, a thread that has not yet been handshaked could try to handshake with a
thread the VMThread has already processed, and so it could also execute the operation before the VMThread gets to it.

src/hotspot/share/runtime/handshake.cpp line 356:

> 354: }
> 355:
> 356: HandshakeState::HandshakeState(JavaThread* thread) :

s/thread/target/ for clarity

src/hotspot/share/runtime/handshake.cpp line 412:

> 410:     if (op != NULL) {
> 411:       assert(op->_target == NULL || op->_target == Thread::current(), "Wrong thread");
> 412:       assert(_handshakee == Thread::current(), "Wrong thread");

You already asserted this at line 400.

src/hotspot/share/runtime/thread.hpp line 1358:

> 1356:   HandshakeState* handshake_state() { return &_handshake; }
> 1357:
> 1358:   // A JavaThread can always safely operate on it self and other threads

s/it self/itself/

src/hotspot/share/runtime/thread.hpp line 1359:

> 1357:
> 1358:   // A JavaThread can always safely operate on it self and other threads
> 1359:   // can do it safely it if they are the active handshaker.

s/it if/if/

src/hotspot/share/utilities/filterQueue.hpp line 32:

> 30:
> 31: template <class E>
> 32: class FilterQueue {

A brief description of the class would be good. It is basically a FIFO queue but with the ability to skip nodes that
match a given "filter" criteria.

src/hotspot/share/utilities/filterQueue.hpp line 34:

> 32: class FilterQueue {
> 33:  private:
> 34:   class FilterQueueNode : public CHeapObj<mtInternal> {

The Filter in FilterQueueNode is redundant given this is a nested type. Node would suffice.

src/hotspot/share/utilities/filterQueue.hpp line 56:

> 54:   }
> 55:
> 56:   // MT-safe

Not sure where our previous discussion is on this but these posix-style MT-safe labels don't really halp for this kind
of abstract data type API. Please briefly explain the thread-safety properties of add and pop.

src/hotspot/share/utilities/filterQueue.hpp line 57:

> 55:
> 56:   // MT-safe
> 57:   void add(E data);

It would be more regular naming to use add/remove or push/pop rather than add/pop.

-------------

Changes requested by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/151


More information about the hotspot-runtime-dev mailing list