RFR: 8244340: Handshake processing thread lacks yielding
Robbin Ehn
robbin.ehn at oracle.com
Mon May 11 11:07:39 UTC 2020
Hi Dan,
v4 inc:
http://cr.openjdk.java.net/~rehn/8244340/v4/inc/
v4 full:
http://cr.openjdk.java.net/~rehn/8244340/v4/full/
Truncated:
On 2020-05-08 16:14, Daniel D. Daugherty wrote:
>
> The above helps a lot. Thanks!
>
> So the loop termination condition is when all counters have been
> accumulated in _no_operation? That tells you that everyone is done,
> correct?
The handshake operation have an internal counter which is set to the
number of emitted handshakes. Every time a handshake is performed that
counter is decremented. When it reaches zero we are done.
If we would do an extra iteration over all threads then the
ProcessesResult for all threads would be _no_operation.
We could change from do-while loop to a while a, since sometimes that
counter is zero before the first iteration.
But it happens when we have few threads, and thus the iteration is
short, but a tiny speedup in that case (maybe startup).
>>> src/hotspot/share/runtime/handshake.cpp
>>> L66: class HandshakeSpinYield : public StackObj {
>>> It would be good to have a short paragraph to explain the
>>> high lights
>>> of your handshake spin yield algorithm.
>>
>> Sure!
>
> I look forward to seeing it. :-)
Is above what you had in mine?
>> That is why it's called cbo, CPUBackOFF which was the name of the
>> class during prototyping :)
>
> I somehow knew it couldn't be Congressional Budget Office... :-)
:)
Thanks, Robbin
>
>
>>
>> Changed to hsy.
>>
>>>
>>> src/hotspot/share/runtime/thread.hpp
>>> No comments.
>>>
>>>
>>> Okay, I just re-read all the other code review comments for the third
>>> time.
>>> I still don't understand your HandshakeSpinYield algorithm. Maybe it
>>> is just
>>> me and I'm not seeing something obvious because my head is still
>>> stuck in
>>> async monitor deflation land... :-)
>>
>> Let me know of above help or not?
>
> Definitely helpful.
>
> Dan
>
>
>>
>> Thanks, Robbin
>>
>>>
>>> Dan
>>>
>>>
>>>
>>> On 5/7/20 3:03 AM, Robbin Ehn wrote:
>>>> Hi Patricio,
>>>>
>>>> On 2020-05-06 18:24, Patricio Chilano wrote:
>>>>>
>>>>> On 5/6/20 5:53 AM, Robbin Ehn wrote:
>>>>>> Hi Patricio,
>>>>>>
>>>>>> Here is v2 inc:
>>>>>> http://cr.openjdk.java.net/~rehn/8244340/v2/inc/
>>>>>> Full:
>>>>>> http://cr.openjdk.java.net/~rehn/8244340/v2/full/
>>>>> Thanks Robbin, looks good!
>>>>
>>>> Thanks!
>>>>
>>>> But I notice that when we do direct handshakes with a JavaThread we
>>>> should yield in blocked state. This only effects the corner case when
>>>> JavaThread A handshakes B and C handshakes A, C can complete the
>>>> handshake faster if A would yield in block state.
>>>>
>>>> v3 inc:
>>>> http://cr.openjdk.java.net/~rehn/8244340/v3/inc/
>>>>
>>>> v3 full:
>>>> http://cr.openjdk.java.net/~rehn/8244340/v3/full/
>>>>
>>>> Thanks, Robbin
>>>>
>>>>>
>>>>>
>>>>> Patricio
>>>>>> On 2020-05-05 21:44, Patricio Chilano wrote:
>>>>>>> Hi Robbin,
>>>>>>>
>>>>>>> Thanks for fixing this!
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>>
>>>>>>> I tested the patch in a Windows Server 2012 virtual machine (not
>>>>>>> VirtualBox), with two virtual cpus assigned. Without the patch,
>>>>>>> running the simple HelloWorld program takes about 2.5seconds most
>>>>>>> of the times when setting “/affinity 1” and around 300ms when
>>>>>>> setting “affinity /FF”. With the patch I get the same timing for
>>>>>>> both cases, about 300ms. It would be great if Miklos can also
>>>>>>> check it against his environment to see if his issue got fixed.
>>>>>>
>>>>>> Great. They don't seem to be building the JDK them self.
>>>>>>
>>>>>>>
>>>>>>> Only small comments in handshake.cpp:
>>>>>>> - In VM_HandshakeOneThread, last log_handshake_info() should be
>>>>>>> (pr == HandshakeState::_success) instead of by_vm_thread. Maybe
>>>>>>> just define "HandshakeState::ProcessResult pr" instead of
>>>>>>> by_vm_thread. Same with Handshake::execute_direct() to get rid of
>>>>>>> by_handshaker.
>>>>>>
>>>>>> Fixed.
>>>>>>
>>>>>>> - In Handshake::execute_direct() inside the "if" statement we
>>>>>>> could just break after verifying _success.
>>>>>>
>>>>>> The if statement is removed.
>>>>>>
>>>>>>> - I would remove _spin_time_ns and just keep _max_spin_time_ns,
>>>>>>> since both are only set at construction only.
>>>>>>
>>>>>> Fixed.
>>>>>>
>>>>>>> - I think a better name for _last_wait_ns should be
>>>>>>> last_spin_start_ns because it is updated also when we see the
>>>>>>> state changed.
>>>>>>
>>>>>> Fixed.
>>>>>>
>>>>>> Thanks, Robbin
>>>>>>
>>>>>>>
>>>>>>> Thanks Robbin!
>>>>>>>
>>>>>>> Patricio
>>>>>>> On 5/5/20 10:08 AM, Robbin Ehn wrote:
>>>>>>>> Hi all, please review.
>>>>>>>>
>>>>>>>> As the issue describes there are problems with this infinitely
>>>>>>>> spinning.
>>>>>>>> To avoid any performance regression we do not yield while we do
>>>>>>>> or see
>>>>>>>> progress. But when we enter a period of no progression we
>>>>>>>> 'quickly' give
>>>>>>>> up.
>>>>>>>>
>>>>>>>> Issue:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8244340
>>>>>>>>
>>>>>>>> Code:
>>>>>>>> http://cr.openjdk.java.net/~rehn/8244340/v1/webrev/
>>>>>>>>
>>>>>>>> In the pathologically case we can easily see 10x improvement
>>>>>>>> starting the VM.
>>>>>>>> I have found no performance regressions or performance gains in our
>>>>>>>> standard benchmark setups.
>>>>>>>> But I see much better average completion times for handshakes.
>>>>>>>> For example specJVM2008 serial with ZGC shows 4x faster time to
>>>>>>>> completion for handshakes in my setup, but little or no effect
>>>>>>>> on score
>>>>>>>> (ops/m).
>>>>>>>>
>>>>>>>> Passes t1-3.
>>>>>>>>
>>>>>>>> Thanks, Robbin
>>>>>>>
>>>>>
>>>
>
More information about the hotspot-runtime-dev
mailing list