RFR(m): 8214271: Fast primitive to wake many threads
Robbin Ehn
robbin.ehn at oracle.com
Fri Dec 21 09:45:54 UTC 2018
Hi David,
On 2018-12-21 04:17, David Holmes wrote:
>
> Got it - subtle.
>
> Further this sounds like a race that could lead to bugs if not used very
> carefully ie. you can't assume between disarm() and wake() that all threads are
> blocked.
I didn't realize how subtle this is. I think your original comment that
disarm/wake should be one operation was spot on.
Investigating... thinking... testing... yes I think this will work, fixed!
Sorry for not looking more into this before.
>
> I think perhaps this needs to be expanded to make this more obvious:
>
> 68 // - A call to wait(tag) will block if the barrier is armed with the value
> 69 // 'tag'; else it will return immediately.
> 70 // - A blocked thread is eligible to execute again once the barrier is
> 71 // disarmed and wake() has been called.
> + - A call to wait(tag) that would block if it continued, but instead
> + is descheduled, may return immediately if scheduled after a
> + call to disarm(), but before the call to wake().
>
> It also made me realize that in the general case (not when used with safepoints
> I think due to other state checks) a wake() may stall due to threads with a
> previous tag entering the wait() late.
I added a double checking in the semaphore version, this means both
implementation should have progress guarantee.
Making this v5 a bit large due to a lot of comments being changed.
Inc:
http://cr.openjdk.java.net/~rehn/8214271/5/inc/webrev/
Full:
http://cr.openjdk.java.net/~rehn/8214271/5/full/webrev/
gtest passes thousands of loops locally and hundreds in mach5.
Thanks, Robbin
>
> Thanks,
> David
>
>>>
>>> s/Implementation/Implementations/
>>
>> Fixed
>>
>>>
>>> The fourth line is no longer needed.
>>
>> Above is the reason I would like to keep the fourth line, since only if you call
>> both disarm() and wake() you have that guarantee that waiter threads will
>> return.
>>
>> Thanks, Robbin
>>
>>>
>>> Thanks,
>>> David
>>>
>>>
>>>> Inc:
>>>> http://cr.openjdk.java.net/~rehn/8214271/4/inc/webrev/
>>>>
>>>> Full:
>>>> http://cr.openjdk.java.net/~rehn/8214271/4/full/webrev/
>>>>
>>>> /Robbin
>>>>
>>>>>
>>>>> Otherwise this all looks good!
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>
>>>>>> Full:
>>>>>> http://cr.openjdk.java.net/~rehn/8214271/3/full/webrev/
>>>>>>
>>>>>> Thanks, Robbin
>>>>>>
>>>>>> On 11/23/18 5:55 PM, Robbin Ehn wrote:
>>>>>>> Forgot RFR in subject.
>>>>>>>
>>>>>>> /Robbin
>>>>>>>
>>>>>>> On 2018-11-23 17:51, Robbin Ehn wrote:
>>>>>>>> Hi all, please review.
>>>>>>>>
>>>>>>>> When a safepoint is ended we need a way to get back to 100% utilization
>>>>>>>> as fast
>>>>>>>> as possible. 100% utilization means no idle cpu in the system if there is a
>>>>>>>> JavaThread that could be executed. The traditional ways to wake many, e.g.
>>>>>>>> semaphore, pthread_cond, is not implemented with a single syscall
>>>>>>>> instead they
>>>>>>>> typical do one syscall per thread to wake.
>>>>>>>>
>>>>>>>> This change-set contains that primitive, the WaitBarrier, and a gtest
>>>>>>>> for it.
>>>>>>>> No actual users, which is in coming patches.
>>>>>>>>
>>>>>>>> The WaitBarrier solves by doing a cooperative semaphore posting, threads
>>>>>>>> woken
>>>>>>>> will also post. On Linux we can instead directly use a futex and with one
>>>>>>>> syscall wake all. Depending on how many threads and cpus the performance
>>>>>>>> vary,
>>>>>>>> but a good utilization of the machine, just on the edge of saturated,
>>>>>>>> the time to reach 100% utilization is around 3 times faster with the
>>>>>>>> WaitBarrier (where futex is faster than semaphore).
>>>>>>>>
>>>>>>>> Webrev:
>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/webrev/
>>>>>>>>
>>>>>>>> CR:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214271
>>>>>>>>
>>>>>>>> Passes 100 iterations of gtest on our platforms, both fastdebug and
>>>>>>>> release.
>>>>>>>> And have been stable when used in safepoints (t1-8) (coming patches).
>>>>>>>>
>>>>>>>> Thanks, Robbin
More information about the hotspot-dev
mailing list