~ThreadInVMForHandshake() should call handle_special_runtime_exit_condition()

Fri May 10 01:25:55 UTC 2019

Actually I'm not convinced this is the right solution. 
handle_special_runtime_exit_condition as its name suggests is for 
exiting the runtime (VM) and implicitly returning to (or going to) Java. 
It is only checked by the ThreadInVMFromJava destructor, and the 
JavaCallWrapper. Separately we check for thread suspension when a thread 
enters the VM from native.

ThreadInVMForHandshake seems to shortcut the thread-state transition 
process. It doesn't seem to care what state it is coming from it just 
ducks into the VM for the handkshake, then restores the original state 
and continues. That doesn't seem right as per the current bug report.

However AFAICS the correct fix would be to:

a) check for suspension in ThreadInVMForHandshake when coming from 
_thread_in_native; and
b) check handle_special_runtime_exit_condition in 
~ThreadInVMForHandshake when the original state is _thread_in_Java

Now it may be that for the first case, as we're only processing a 
handshake, that it doesn't matter if we do the suspend check before or 
after the handshake. Hence that would mean we do (b) when the original 
state is thread_in_Java or _thread_in_native - which is what Richard 
originally proposed.

So I don't understand how the original:

+    if ((_original_state == _thread_in_Java || _original_state == 
_thread_in_native) &&
+        _thread->has_special_runtime_exit_condition()) {

became

+     if (_original_state != _thread_blocked_trans &&  _original_state 
!= _thread_in_vm_trans &&
+        _thread->has_special_runtime_exit_condition()) {

? Perhaps they are equivalent (I don't know all the possible initial 
states especially given the way this can be entered recursively!), but 
the former is much clearer than the latter.

Cheers,
David
-----

On 10/05/2019 10:05 am, David Holmes wrote:
> Hi Dan,
> 
> Many thanks for doing that detailed walkthrough! I also tried to verify 
> the conditions of the _thread_state as they are not what was originally 
> proposed. But once I saw the recursion I called it a night! The 
> recursion here makes me very nervous (it's preexisting of course) - this 
> stuff is hard enough to reason about with only one level of nesting. :(
> 
> There's no regression test as this was not seen in the wild only 
> hypothesised:
> 
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-May/034144.html 
> 
> 
> Thanks,
> David
> 
> On 10/05/2019 9:11 am, Daniel D. Daugherty wrote:
>> Richard,
>>
>> Very nice catch!
>>
>>
>> On 5/9/19 6:05 AM, Reingruber, Richard wrote:
>>> Hi,
>>>
>>> could I please get reviews and sponsoring for
>>>
>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/2019/8223572/webrev.1/
>>
>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>      Verified that your new conditional block for calling
>>      handle_special_runtime_exit_condition() and the
>>      conditions for the parameter match the code at the
>>      end of SafepointSynchronize::block(JavaThread *thread).
>>
>> I had to walk through the target thread call sequence:
>>
>> SafepointSynchronize::handle_polling_page_exception(JavaThread *thread)
>>    thread->safepoint_state()->handle_polling_page_exception
>>      SafepointMechanism::block_if_requested(thread())
>>        SafepointMechanism::block_if_requested_slow(thread)
>>          SafepointMechanism::block_or_handshake()
>>            // not a global poll
>>            thread->handshake_process_by_self()
>>              _handshake.process_by_self(this)
>>                process_self_inner(thread)
>>                  ThreadInVMForHandshake tivm(thread)
>>                  :  make_walkable()
>>                  :  set _thread_in_vm
>>                  _semaphore.wait_with_safepoint_check(thread)
>>                  : ThreadBlockInVM tbivm
>>                  :   make_walkable()
>>                  :   trans(_thread_in_vm, _thread_blocked)
>>                  : _impl.wait()
>>                  : <VM_thread signals>
>>                  : <VM_ThreadSuspend on target thread>
>>                  : ~ThreadBlockInVM
>>                  :   trans(_thread_blocked, _thread_in_vm)
>>                  :   : transition(_thread, from, to)
>>                  :   :   set _thread_blocked+1
>>                  :   : SafepointMechanism::block_if_requested(thread)
>>                  :   :   : 
>> SafepointMechanism::block_if_requested_slow(thread)
>>                  :   :   : SafepointMechanism::block_or_handshake()
>>                  :   :   :     if (global_poll()) {
>>                  :   :   : SafepointSynchronize::block(thread)
>>                  :   :   :         make_walkable()
>>                  :   :   :         set _thread_blocked
>>                  :   :   :         _wait_barrier->wait()
>>                  :   :   :         // wait is notified when the 
>> safepoint is over
>>                  :   :   :         restore _thread_blocked+1  // 
>> _thread_blocked_trans
>>                  :   :   :         skip the call to 
>> handle_special_runtime_exit_condition()
>>                  :   :   :           because state == 
>> _thread_blocked_trans
>>                  :   :   // return all the way back to transition()
>>                  :   :   set _thread_in_vm
>>                  ~ThreadInVMForHandshake
>>                    transition_back()
>>                      set _thread_in_vm_trans
>>                      SafepointMechanism::block_if_requested()
>>                      set _original_state
>> // Now we return all the way back to handle_polling_page_exception().
>> // If we came from a _thread_in_Java state, then we just returned to
>> // that state without self suspending.
>> // The fix adds this:
>>                      if (_original_state != _thread_blocked_trans &&
>>                          _original_state != _thread_in_vm_trans &&
>> _thread->has_special_runtime_exit_condition()) {
>> _thread->handle_special_runtime_exit_condition(
>>                            !_thread->is_at_poll_safepoint() &&
>>                            (_original_state != _thread_in_native_trans));
>>
>> And handle_special_runtime_exit_condition() does this:
>>
>>    bool do_self_suspend = is_external_suspend_with_lock();
>>    if (do_self_suspend && (!AllowJNIEnvProxy || this == 
>> JavaThread::current())) {
>>      frame_anchor()->make_walkable(this);
>>      java_suspend_self_with_safepoint_check();
>>    }
>>
>> Okay. Now I grok where the bug is and how your fix handles it.
>> Thumbs up on the fix itself!
>>
>> You don't mention how you tested this fix. You also don't say
>> whether you have a test that reproduces this issue.
>>
>> Obviously, we also need to hear from Robbin on this thread.
>>
>> Dan
>>
>>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8223572
>>>
>>> A java thread T can continue executing bytecodes after it was 
>>> suspended using the vm operation
>>> VM_ThreadSuspend. This can happen in 
>>> HandshakeState::process_self_inner() when T becomes safepoint
>>> safe calling _semaphore.wait_with_safepoint_check(thread).
>>>
>>> Fix: poll has_special_runtime_exit_condition() and conditionally call
>>> handle_special_runtime_exit_condition() as it is done in 
>>> SafepointSynchronize::block(), too.
>>>
>>> Thanks, Richard.
>>>
>>> PS: see also mail thread 
>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-May/034144.html 
>>>
>>