RFR(m): 8214271: Fast primitive to wake many threads
Robbin Ehn
robbin.ehn at oracle.com
Wed Dec 19 08:40:27 UTC 2018
Hi David,
On 2018-12-19 05:51, David Holmes wrote:
> Hi Robbin,
>
> Note this was focusing on the design of the API to ensure the semantics are
> clear and well-defined so that future users of the API know how to use it
> correctly and safely. It also then allows a review of the implementation to
> check back to the spec to see if it behaves as intended and checks all necessary
> constraints That's why I wanted it to be as complete as possible (spurious
> wakeups allowed? No - then lets document that) and now allow implementation
> specific choices unless unavoidable.
Sure!
Maybe add that to the wait() method, with something like:
+ // Guarantees not to return until disarm() is called,
+ // if called with currently armed tag (otherwise returns immediately).
+ // Implementation must guarantee no spurious wakeups.
123 // Guarantees to return if disarm() and wake() is called.
124 // Provides a trailing fence.
125 void wait(int barrier_tag) {
>
> I'll get to updated webrevs before I finish for the year on Friday.
Thanks!
/Robbin
>
> Thanks,
> David
>
> On 17/12/2018 8:39 pm, Robbin Ehn wrote:
>> Hi David,
>>
>> On 11/26/18 7:12 AM, David Holmes wrote:
>>> Hi Robbin,
>>>
>>> On 24/11/2018 2:55 am, Robbin Ehn wrote:
>>>> Forgot RFR in subject.
>>>
>>> Yep and now you have two different review threads happening in parallel
>>> unfortunately :(
>>>
>>> - src/hotspot/share/utilities/waitBarrier.hpp
>>>
>>> I'm studying just the WaitBarrierType API. Is this inherently tied to
>>> safepoint usage or intended as a general synchronization tool? As a general
>>> tool the API does not have clear semantics on how it should be used:
>>
>> There is two potential uses-cases in G1, which might require some changes.
>>
>>>
>>> - How do you communicate the current tag between the arming thread and the
>>> waiting threads? There would seem to be inherent races between arm(tag) and
>>> wait(tag) unless access to the tag itself is synchronized via another mechanism.
>>
>> As seen in gtest:
>> - arm tag
>> - publish tag
>>
>>>
>>> - What happens if wake() is called before disarm()? Should it be disallowed?
>>> (For other readers disarm() and wake() are distinct operations so that they
>>> fit better into the existing safepoint protocol which disarms the safepoint
>>> polling page at one spot and wakes blocked threads at another.)
>>
>> Yes, wake() have no effect if you have not disarmed.
>> There are asserts checking this.
>>
>>>
>>> - Should there be constraints that the same thread must arm/disarm/wake? It
>>> doesn't really make sense to allow these operations to happen in arbitrary
>>> order from multiple threads.
>>
>> Maybe, *thinking* if there is nice way to get the arming/waking thread into an
>> argument.
>>
>>>
>>> The semantics for re-arming with the same tag should be clearly set out not
>>> "implementation-defined". This should probably be a usage error IMHO - but it
>>> comes back to how the tag is expected to be used.
>>
>> If you have a way to detect that all threads have left the waitbarrier, there is
>> no problem with re-using the same tag. So yes, re-arming before that with the
>> same tag should be an error. But since the waitbarrier can't tell the difference
>> and do not know which tag it was last armed for I have no assert.
>>
>>>
>>> As Andrew mentioned there needs to be documentation regarding spurious
>>> wakeups or other "interruptions" at the API level. And I assume a blocked
>>> wait() only ever returns in response to a wake().
>>
>> Yes, if it's disarmed before. Since these are handled internally there are no
>> spurious wakeups, not sure why we want to document that, except inside the
>> implementation. E.g. semaphore.hpp does not document that it do not have it.
>>
>>>
>>> Nothwithstanding clarification of the above may I suggest the following
>>> rewrite of the API documentation for further clarity:
>>>
>>> /* Platform independent WaitBarrier API.
>>> An armed WaitBarrier prevents threads from advancing until the
>>> barrier is disarmed and the waiting threads woken. The barrier is
>>> armed by setting a non-zero value - the tag.
>>>
>>> Expected Usage:
>>> - Arming thread:
>>> tag = ...; // non-zero value
>>> barrier.arm(tag);
>>> <work>
>>> barrier.disarm();
>>> barrier.wake();
>>>
>>> - After arm(tag) returns any thread calling wait(tag) will block
>>> - After disarm() returns any subsequent calls to wait(tag) will not
>>> block
>>> - After wake() returns all blocked threads are unblocked and eligible
>>> to execute again
>>> - After calling disarm() and wake() the barrier is ready to be
>>> re-armed with a new tag
>>>
>>> - Waiting threads
>>> wait(tag); // don't execute following code unless 'safe'
>>> <work>
>>>
>>> - A call to wait(tag) will block if the barrier is armed with the
>>> value 'tag'; else it will return immediately.
>>> - A blocked thread is eligible to execute again once the barrier is
>>> disarmed and wake() has been called.
>>>
>>> A primary goal of the WaitBarrier implementation is to wake all waiting
>>> threads as fast, and as concurrently, as possible.
>>>
>>> */
>>
>> Updated, added <publish tag> after barrier.arm(tag);
>>
>>>
>>> Looking at the "implementation" in this file I'm unclear on the way the Linux
>>> specialization is being handled here. Why do we need the template? Can't this
>>> just be done the same way we do Semaphore?
>>
>> Because the gtest can test all implementations available on that platform.
>> Meaning I can test both implementation locally and compare performance etc..
>>
>> I'll post one update after I sorted out the other review parts also.
>>
>> Thanks, Robbin
>>
>>>
>>> More to follow.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> /Robbin
>>>>
>>>> On 2018-11-23 17:51, Robbin Ehn wrote:
>>>>> Hi all, please review.
>>>>>
>>>>> When a safepoint is ended we need a way to get back to 100% utilization as
>>>>> fast
>>>>> as possible. 100% utilization means no idle cpu in the system if there is a
>>>>> JavaThread that could be executed. The traditional ways to wake many, e.g.
>>>>> semaphore, pthread_cond, is not implemented with a single syscall instead they
>>>>> typical do one syscall per thread to wake.
>>>>>
>>>>> This change-set contains that primitive, the WaitBarrier, and a gtest for it.
>>>>> No actual users, which is in coming patches.
>>>>>
>>>>> The WaitBarrier solves by doing a cooperative semaphore posting, threads woken
>>>>> will also post. On Linux we can instead directly use a futex and with one
>>>>> syscall wake all. Depending on how many threads and cpus the performance vary,
>>>>> but a good utilization of the machine, just on the edge of saturated, the
>>>>> time to reach 100% utilization is around 3 times faster with the
>>>>> WaitBarrier (where futex is faster than semaphore).
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~rehn/8214271/webrev/
>>>>>
>>>>> CR:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8214271
>>>>>
>>>>> Passes 100 iterations of gtest on our platforms, both fastdebug and release.
>>>>> And have been stable when used in safepoints (t1-8) (coming patches).
>>>>>
>>>>> Thanks, Robbin
More information about the hotspot-dev
mailing list