RFR: 8244340: Handshake processing thread lacks yielding
David Holmes
david.holmes at oracle.com
Thu May 7 10:12:28 UTC 2020
On 7/05/2020 7:10 pm, Robbin Ehn wrote:
> Hi David,
>
> On 2020-05-07 09:58, David Holmes wrote:
>> Hi Robbin,
>>
>> Thanks for the additional explanations in other email. I still think
>> we have some scope for ensuring the different "spin" approaches are
>> consistent.
>
> We can see what we can do.
>
>>
>> On 7/05/2020 5:03 pm, Robbin Ehn wrote:
>>> Hi Patricio,
>>>
>>> On 2020-05-06 18:24, Patricio Chilano wrote:
>>>>
>>>> On 5/6/20 5:53 AM, Robbin Ehn wrote:
>>>>> Hi Patricio,
>>>>>
>>>>> Here is v2 inc:
>>>>> http://cr.openjdk.java.net/~rehn/8244340/v2/inc/
>>>>> Full:
>>>>> http://cr.openjdk.java.net/~rehn/8244340/v2/full/
>>>> Thanks Robbin, looks good!
>>>
>>> Thanks!
>>>
>>> But I notice that when we do direct handshakes with a JavaThread we
>>> should yield in blocked state. This only effects the corner case when
>>> JavaThread A handshakes B and C handshakes A, C can complete the
>>> handshake faster if A would yield in block state.
>>
>> I'm a bit surprised you chose to do this - the thread-state transition
>> logic would likely exceed the duration of these microsleeps and end up
>> executing even more code in the thread we want to go off-proc. That's
>> why these sleep routines don't themselves change the thread state etc.
>
> The block transition takes ~500 cycles to go into blocked and ~300 to
> leave block. The fastest time for os::naked_short_nanosleep I can see is
> around 70000 cycles, 99% takes 160000 cycles. So over-head of calling
> os::naked_short_nanosleep in block state is low.
Okay - I can't argue with cycles.
>>
>> I also can't quite get my head around what kind of nested processing
>> this can now lead to (with regard to safepoints and other handshakes
>> on the thread doing this handshake).
>
> We already have that here:
> if (SafepointMechanism::should_block(self)) {
> ThreadBlockInVM tbivm(self);
> }
> cbo.process();
Okay so aren't we now doubling up. If process() doesn't show success
then we will eventually hit the wait_blocked. So no need for the
explicit check? I guess it doesn't make too much difference but the
placement of these checks seems somewhat arbitrary.
David
-----
> And we actually must do that since otherwise:
> A handshakes B
> B handshakes A
> If A and B never becomes safe, we have a deadlock situation.
>
> Thanks, Robbin
>
>>
>> Thanks,
>> David
>> -----
>>
>>> v3 inc:
>>> http://cr.openjdk.java.net/~rehn/8244340/v3/inc/
>>>
>>> v3 full:
>>> http://cr.openjdk.java.net/~rehn/8244340/v3/full/
>>>
>>> Thanks, Robbin
>>>
>>>>
>>>>
>>>> Patricio
>>>>> On 2020-05-05 21:44, Patricio Chilano wrote:
>>>>>> Hi Robbin,
>>>>>>
>>>>>> Thanks for fixing this!
>>>>>
>>>>> Thanks!
>>>>>
>>>>>>
>>>>>> I tested the patch in a Windows Server 2012 virtual machine (not
>>>>>> VirtualBox), with two virtual cpus assigned. Without the patch,
>>>>>> running the simple HelloWorld program takes about 2.5seconds most
>>>>>> of the times when setting “/affinity 1” and around 300ms when
>>>>>> setting “affinity /FF”. With the patch I get the same timing for
>>>>>> both cases, about 300ms. It would be great if Miklos can also
>>>>>> check it against his environment to see if his issue got fixed.
>>>>>
>>>>> Great. They don't seem to be building the JDK them self.
>>>>>
>>>>>>
>>>>>> Only small comments in handshake.cpp:
>>>>>> - In VM_HandshakeOneThread, last log_handshake_info() should be
>>>>>> (pr == HandshakeState::_success) instead of by_vm_thread. Maybe
>>>>>> just define "HandshakeState::ProcessResult pr" instead of
>>>>>> by_vm_thread. Same with Handshake::execute_direct() to get rid of
>>>>>> by_handshaker.
>>>>>
>>>>> Fixed.
>>>>>
>>>>>> - In Handshake::execute_direct() inside the "if" statement we
>>>>>> could just break after verifying _success.
>>>>>
>>>>> The if statement is removed.
>>>>>
>>>>>> - I would remove _spin_time_ns and just keep _max_spin_time_ns,
>>>>>> since both are only set at construction only.
>>>>>
>>>>> Fixed.
>>>>>
>>>>>> - I think a better name for _last_wait_ns should be
>>>>>> last_spin_start_ns because it is updated also when we see the
>>>>>> state changed.
>>>>>
>>>>> Fixed.
>>>>>
>>>>> Thanks, Robbin
>>>>>
>>>>>>
>>>>>> Thanks Robbin!
>>>>>>
>>>>>> Patricio
>>>>>> On 5/5/20 10:08 AM, Robbin Ehn wrote:
>>>>>>> Hi all, please review.
>>>>>>>
>>>>>>> As the issue describes there are problems with this infinitely
>>>>>>> spinning.
>>>>>>> To avoid any performance regression we do not yield while we do
>>>>>>> or see
>>>>>>> progress. But when we enter a period of no progression we
>>>>>>> 'quickly' give
>>>>>>> up.
>>>>>>>
>>>>>>> Issue:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8244340
>>>>>>>
>>>>>>> Code:
>>>>>>> http://cr.openjdk.java.net/~rehn/8244340/v1/webrev/
>>>>>>>
>>>>>>> In the pathologically case we can easily see 10x improvement
>>>>>>> starting the VM.
>>>>>>> I have found no performance regressions or performance gains in our
>>>>>>> standard benchmark setups.
>>>>>>> But I see much better average completion times for handshakes.
>>>>>>> For example specJVM2008 serial with ZGC shows 4x faster time to
>>>>>>> completion for handshakes in my setup, but little or no effect on
>>>>>>> score
>>>>>>> (ops/m).
>>>>>>>
>>>>>>> Passes t1-3.
>>>>>>>
>>>>>>> Thanks, Robbin
>>>>>>
>>>>
More information about the hotspot-runtime-dev
mailing list