RFR: 8257831: Suspend with handshakes [v8]

Daniel D.Daugherty dcubed at openjdk.java.net
Wed Apr 14 16:36:56 UTC 2021


On Wed, 14 Apr 2021 06:48:40 GMT, Robbin Ehn <rehn at openjdk.org> wrote:

>> A suspend request is done by handshaking thread target thread(s). When executing the handshake operation we know the target mutator thread is in a dormant state (as in safepoint safe state). We have a guarantee that it will check it's poll before leaving the dormant state. To stop the thread from leaving the the dormant state we install a second asynchronous handshake to be executed by the targeted thread. The asynchronous handshake will wait on a monitor while the thread is suspended. The target thread cannot not leave the dormant state without a resume request.
>> 
>> Per thread suspend requests are naturally serialized by the per thread HandshakeState lock (we can only execute one handshake at a time per thread).
>> Instead of having a separate lock we use this to our advantage and use HandshakeState lock for serializing access to the suspend flag and for wait/notify. 
>> 
>> Suspend:
>> Requesting thread -> synchronous handshake -> target thread
>> Inside synchronus handshake (HandshakeState lock is locked while
>> executing any handshake):
>> 	- Set suspended flag
>> 	- Install asynchronous handshake
>> 
>> Target thread -> tries to leave dormant state -> Executes handshakes
>> Target only executes asynchronous handshake:
>> 	- While suspended
>> 	- Go to blocked
>> 	- Wait on HandshakeState lock
>> 
>> Resume:
>> Resuming thread:
>> 	- Lock HandshakeState lock
>> 	- Clear suspended flag
>> 	- Notify HandshakeState lock
>> 	- Unlock HandshakeState lock
>> 
>> The "suspend requested" flag is an optimization, without it a dormant thread could be suspended and resumed many times and each would add a new asynchronous handshake. Suspend requested flag means there is already an asynchronous suspend handshake in queue which can be re-used, only the suspend flag needs to be set.
>> 
>> ----
>> Some code can be simplified or done in a smarter way but I refrained from doing such changes instead tried to keep existing code as is as far as possible. This concerns especially raw monitors.
>> 
>> ----
>> Regarding the changed test, the documentation says:
>> "If the calling thread is specified in the request_list array, this function will not return until some other thread resumes it."
>> 
>> But the code:
>>   LOG("suspendTestedThreads: before JVMTI SuspendThreadList");
>>   err = jvmti->SuspendThreadList(threads_count, threads, results);
>>   ...
>>   // Allow the Main thread to inspect the result of tested threads suspension	
>>   agent_unlock(jni);
>>   
>> The thread will never return from SuspendThreadList until resumed, so it cannot unlock with agent_unlock().
>> Thus main thread is stuck forever on:
>>   // Block until the suspender thread competes the tested threads suspension	
>>   agent_lock(jni);
>> 
>> And never checks and resumes the threads. So I removed that lock instead just sleep and check until all thread have the expected suspended state.
>> 
>> ----
>> 
>> This version already contains updates after pre-review comments from @dcubed-ojdk, @pchilano, @coleenp.
>> (Pre-review comments here:
>> https://github.com/openjdk/jdk/pull/2625)
>> 
>> ---- 
>> Testing t1-t8, nsk_jdi/nsk_jvmti/jdk_jdi/tck, KS24, RunThese and
>> combinations like running with -XX:ZCollectionInterval=0.01 -
>> XX:ZFragmentationLimit=0.
>> Running above some of above concurrently (load ~240), slow debug,
>> etc...
>
> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fixed flag undef dep + spelling error

I dug out the comments that I made in "resolved" sections where
the original comment wasn't actually resolved. I posted new comments
so let's see if those show up in the emails...

src/hotspot/share/runtime/handshake.cpp line 415:

> 413:   // Adds are done lock free and so is arming.
> 414:   // Calling this method with lock held is considered an error.
> 415:   assert(!_lock.owned_by_self(), "Lock should not be held");

I originally added this comment inside the "resolved" conversation and that kept
this comment from showing up. Let's try it as a new comment.

Doesn't that mean the comment on L415 needs updating?
Should it be something like:

// Synchronous adds are done lock free and so is arming, but some
// asynchronous adds are done when we already hold the lock.

I'm not sure about the "some asynchronous adds" part...
@dcubed-ojdk

src/hotspot/share/runtime/objectMonitor.cpp line 972:

> 970: 
> 971:       current->frame_anchor()->make_walkable(current);
> 972:       OrderAccess::storestore();

Repeating this comment since my original is marked resolved, but I'm not seeing
a new comment here. I originally pointed this out in the resolved conversation,
but that doesn't make the comment show up in the review emails.

Needs a comment explaining what the memory sync is for.

src/hotspot/share/runtime/objectMonitor.cpp line 1559:

> 1557:         if (_succ == current) {
> 1558:             _succ = NULL;
> 1559:             OrderAccess::fence();

My original comment is marked "fixed", but I don't see the new comment:

Please add a comment to this line:
OrderAccess::fence(); // always do a full fence when successor is cleared

-------------

Marked as reviewed by dcubed (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3191


More information about the serviceability-dev mailing list