RFR: 8254263: Remove special_runtime_exit_condition() check from ~ThreadInVMForHandshake()

Tue Oct 13 13:58:24 UTC 2020

Hi David,

On 10/12/20 11:35 PM, David Holmes wrote:
> Hi Patricio,
>
> On 12/10/2020 2:59 pm, Patricio Chilano Mateo wrote:
>> On Sun, 11 Oct 2020 22:09:30 GMT, David Holmes <dholmes at openjdk.org> 
>> wrote:
>>
>>>> Hi all,
>>>>
>>>> Please review the following patch that removes the call to 
>>>> handle_special_runtime_exit_condition() in
>>>> ~ThreadInVMForHandshake(). This check was added by 8223572 to 
>>>> prevent a JavaThread that was suspended while trying to
>>>> process a handshake to transition back to java without noticing it 
>>>> was suspended. Since 8238761, a JavaThread executing
>>>> HandshakeState::process_by_self() will never become safe. It comes 
>>>> from an unsafe state and remains unsafe, so it
>>>> cannot be suspended during that time. Removing this check also 
>>>> removes one of the reasons
>>>> SafepointMechanism::process_if_requested() is recursive (the other 
>>>> one remains SafepointSynchronize::block()). Tested
>>>> in mach5, tiers 1-7.  Thanks, Patricio
>>>
>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp line 146:
>>>
>>>> 144:
>>>> 145:   ~ThreadInVMForHandshake() {
>>>> 146:     assert(_thread->thread_state() == _thread_in_vm, "should 
>>>> only call when leaving VM after handshake");
>>>
>>> Can we also assert `!_thread->has_special_runtime_exit_condition()`? 
>>> Or at least that the value of this has not changed
>>> between the TIVFH constructor and destructor.
>>
>> has_special_runtime_exit_condition() checks for async exceptions, 
>> external_suspend and JFR suspend. The last two can
>> actually be set while we are in the handshake. It doesn't mean that 
>> the caller will consider the target suspended, but
>> the _suspend_flags might be different at construction and destruction 
>> of TIVFH. 
>
> Ah right! Oh well :)
>
>> As for async exceptions, I realized we
>> have a similar issue as the one described in 8223572 when calling 
>> handle_polling_page_exception() on a return poll:
>>
>> - T calls ThreadSafepointState::handle_polling_page_exception() 
>> because another thread called T.stop() (which we
>>    implement with a handshake)
>> - T calls SafepointMechanism::process_if_requested(thread())
>> - T calls HandshakeState::process_self_inner() and processes the 
>> handshake which calls set_pending_async_exception()
>> - T returns from SafepointMechanism::process_if_requested(thread()) 
>> and goes back to java without calling
>>    check_and_handle_async_exceptions() to throw the ThreadDeath() 
>> exception
>>
>> I'm not sure if delaying the throw of ThreadDeath is allowed, so if 
>> we need to keep the current behaviour I can add a
>> call to check_and_handle_async_exceptions():
>
> To some extent, given it is a race, a delay is permissible, but I find 
> it highly undesirable. Our biggest problems with async exceptions 
> lately is the unpredictability of their actually getting propagated. 
> So we should fix this.
>
>> @@ -987,6 +987,7 @@ void 
>> ThreadSafepointState::handle_polling_page_exception() {
>>       // Process pending operation
>>       SafepointMechanism::process_if_requested(thread());
>> +   thread()->check_and_handle_async_exceptions();
>
> Why directly in handle_polling_page_exception() rather than restoring 
> it to the ~ThreadInVMForHandshake() ?
Having the check in ~TIVFH saves us from missing the async exception for 
this particular handle_polling_page_exception() on poll return case 
because if there is and InstallAsyncExceptionClosure handshake then the 
target will execute it and will notice the async exception on the 
~TIVFH. But otherwise it's not useful. If that handshake happened while 
the target was safe (let's say _thread_blocked somewhere) we will never 
hit that check since the handshake would be executed by the handshaker. 
So, having the check directly in handle_polling_page_exception() makes 
the code in line with other places where we check for async exceptions 
right before transitioning back to java. In fact we should also check 
for external_suspend and not rely on the same 
special_runtime_exit_condition() check at the end of SS:block() as I 
discussed with Richard, but I can leave that for a different change.

Thanks,
Patricio
> Thanks,
> David
> -----
>
>>       // restore oop result, if any
>>
>> -------------
>>
>> PR: https://git.openjdk.java.net/jdk/pull/577
>>