RFR(m): 8214271: Fast primitive to wake many threads

Tue Jan 1 23:35:13 UTC 2019

Hi Robbin,

On 21/12/2018 7:45 pm, Robbin Ehn wrote:
> Hi David,
> 
> On 2018-12-21 04:17, David Holmes wrote:
>>
>> Got it - subtle.
>>
>> Further this sounds like a race that could lead to bugs if not used 
>> very carefully ie. you can't assume between disarm() and wake() that 
>> all threads are blocked.
> 
> I didn't realize how subtle this is. I think your original comment that
> disarm/wake should be one operation was spot on.
> Investigating... thinking... testing... yes I think this will work, fixed!
> Sorry for not looking more into this before.

I'm now curious how this will actually work in the context of the 
safepoint changes?

>>
>> I think perhaps this needs to be expanded to make this more obvious:
>>
>>    68 //    - A call to wait(tag) will block if the barrier is armed 
>> with the value
>>    69 //      'tag'; else it will return immediately.
>>    70 //    - A blocked thread is eligible to execute again once the 
>> barrier is
>>    71 //      disarmed and wake() has been called.
>> +          - A call to wait(tag) that would block if it continued, but 
>> instead
>> +            is descheduled, may return immediately if scheduled after a
>> +           call to disarm(), but before the call to wake().
>>
>> It also made me realize that in the general case (not when used with 
>> safepoints I think due to other state checks) a wake() may stall due 
>> to threads with a previous tag entering the wait() late.
> 
> I added a double checking in the semaphore version, this means both
> implementation should have progress guarantee.
> 
> Making this v5 a bit large due to a lot of comments being changed.
> 
> Inc:
> http://cr.openjdk.java.net/~rehn/8214271/5/inc/webrev/

Nit: I would have kept disarm() rather than wake() as I like the 
arm/disarm duality.

   void GenericWaitBarrier::wait(int barrier_tag) {
     assert(barrier_tag != 0, "Trying to wait on disarmed value");
+   if (barrier_tag == 0 && barrier_tag != _barrier_tag) {
+     OrderAccess::fence();
+     return;
+   }

I don't understand what the above is doing. A barrier_tag of 0 is a 
programming error caught during testing in debug builds. You don't need 
to account for it being 0 in product because this isn't something that 
can come in from an external source - we have full code control here. 
And even if you want to be this paranoid why would you need the fence?

Thanks,
David
-----

> Full:
> http://cr.openjdk.java.net/~rehn/8214271/5/full/webrev/
> 
> gtest passes thousands of loops locally and hundreds in mach5.
> 
> Thanks, Robbin
> 
>>
>> Thanks,
>> David
>>
>>>>
>>>> s/Implementation/Implementations/
>>>
>>> Fixed
>>>
>>>>
>>>> The fourth line is no longer needed.
>>>
>>> Above is the reason I would like to keep the fourth line, since only 
>>> if you call
>>> both disarm() and wake() you have that guarantee that waiter threads 
>>> will
>>> return.
>>>
>>> Thanks, Robbin
>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>
>>>>> Inc:
>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/inc/webrev/
>>>>>
>>>>> Full:
>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/full/webrev/
>>>>>
>>>>> /Robbin
>>>>>
>>>>>>
>>>>>> Otherwise this all looks good!
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>
>>>>>>> Full:
>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/3/full/webrev/
>>>>>>>
>>>>>>> Thanks, Robbin
>>>>>>>
>>>>>>> On 11/23/18 5:55 PM, Robbin Ehn wrote:
>>>>>>>> Forgot RFR in subject.
>>>>>>>>
>>>>>>>> /Robbin
>>>>>>>>
>>>>>>>> On 2018-11-23 17:51, Robbin Ehn wrote:
>>>>>>>>> Hi all, please review.
>>>>>>>>>
>>>>>>>>> When a safepoint is ended we need a way to get back to 100% 
>>>>>>>>> utilization as fast
>>>>>>>>> as possible. 100% utilization means no idle cpu in the system 
>>>>>>>>> if there is a
>>>>>>>>> JavaThread that could be executed. The traditional ways to wake 
>>>>>>>>> many, e.g.
>>>>>>>>> semaphore, pthread_cond, is not implemented with a single 
>>>>>>>>> syscall instead they
>>>>>>>>> typical do one syscall per thread to wake.
>>>>>>>>>
>>>>>>>>> This change-set contains that primitive, the WaitBarrier, and a 
>>>>>>>>> gtest for it.
>>>>>>>>> No actual users, which is in coming patches.
>>>>>>>>>
>>>>>>>>> The WaitBarrier solves by doing a cooperative semaphore 
>>>>>>>>> posting, threads woken
>>>>>>>>> will also post. On Linux we can instead directly use a futex 
>>>>>>>>> and with one
>>>>>>>>> syscall wake all. Depending on how many threads and cpus the 
>>>>>>>>> performance vary,
>>>>>>>>> but a good utilization of the machine, just on the edge of 
>>>>>>>>> saturated, the time to reach 100% utilization is around 3 times 
>>>>>>>>> faster with the WaitBarrier (where futex is faster than 
>>>>>>>>> semaphore).
>>>>>>>>>
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/webrev/
>>>>>>>>>
>>>>>>>>> CR:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214271
>>>>>>>>>
>>>>>>>>> Passes 100 iterations of gtest on our platforms, both fastdebug 
>>>>>>>>> and release.
>>>>>>>>> And have been stable when used in safepoints (t1-8) (coming 
>>>>>>>>> patches).
>>>>>>>>>
>>>>>>>>> Thanks, Robbin