RFR: 8257831: Suspend with handshakes [v4]

Robbin Ehn rehn at openjdk.java.net
Mon Apr 12 06:50:21 UTC 2021


On Sat, 10 Apr 2021 13:56:41 GMT, Daniel D. Daugherty <dcubed at openjdk.org> wrote:

>> Hi Dan,
>> 
>> thanks for picking up my question!
>> 
>>> `HandshakeState::suspend()` is a synchronous handshake and adding the
>>> handshake to the queue is lock free, but the execution of the synchronous
>>> handshake itself requires a `HandshakeState::claim_handshake()` call which
>>> does acquire the lock in question.
>> 
>> My point would be that the attempt to execute the synchronous handshake for
>> suspending a thread that is just about to call HandshakeState::thread_exit()
>> cannot make progress (blocks) while the target thread is not safe
>> (_thread_in_vm).
>> 
>> A synchronous handshake requires the target thread to be in a safe state for the
>> requester to execute the handshake operation.  When executing
>> HandshakeState::thread_exit() the suspendee is _thread_in_vm. And the requester
>> will find it to be `_not_safe` when calling `possibly_can_process_handshake()`
>> before calling `HandshakeState::claim_handshake()` or when calling
>> `can_process_handshake()` afterwards. In both cases try_process() returns with
>> failure _not_safe and the lock is not held.
>> 
>> ++
>>  546	  if (!possibly_can_process_handshake()) {
>>  547	    // JT is observed in an unsafe state, it must notice the handshake itself
>>  548	    return HandshakeState::_not_safe;
>>  549	  }
>>  550	
>>  551	  // Claim the mutex if there still an operation to be executed.
>>  552	  if (!claim_handshake()) {
>>  553	    return HandshakeState::_claim_failed;
>>  554	  }
>>  555	
>>  556	  // If we own the mutex at this point and while owning the mutex we
>>  557	  // can observe a safe state the thread cannot possibly continue without
>>  558	  // getting caught by the mutex.
>>  559	  if (!can_process_handshake()) {
>>  560	    _lock.unlock();
>>  561	    return HandshakeState::_not_safe;
>>  562	  }
>> 
>> So isn't being unsafe sufficient to sync with suspend requests?
>
> Interesting point that I didn't pick up from your previous comment.
> Thanks for making it more clear for me. I need to mull on it for a bit.

Technically you are correct.
I have tested to remove this is previously version and all tests passes fine.
The reason I kept it is because the suspender have identified a thread for suspension and deemed it suspendable, so we play nice.
To know why suspend failed the suspender must check if thread is exiting after a failed suspend. (originally I had a bug here which caused me to wrongfully introduce this from the beginning)
I'll remove it, since it simplifies the code and David's comments about this code is now out-of-line can be fixed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3191


More information about the serviceability-dev mailing list