RFR(m): 8214271: Fast primitive to wake many threads

Robbin Ehn robbin.ehn at oracle.com
Tue Jan 8 10:21:03 UTC 2019


Hi Dan,

On 12/21/18 11:21 PM, Daniel D. Daugherty wrote:
>> Inc:
>> http://cr.openjdk.java.net/~rehn/8214271/5/inc/webrev/
>> Full:
>> http://cr.openjdk.java.net/~rehn/8214271/5/full/webrev/
> 
> It's been a while since I've reviewed this topic so I'm going with
> the full webrev. Sorry for the long delay in getting back to this:

Great!

> src/hotspot/os/linux/waitBarrier_linux.cpp
>      L35:     check_type(cond, "%s; error='%s' (errno=%s)", msg, 
> os::strerror(err),   \
>          nit - The '%s;' should probably be a '%s:' to match other mesg styles.
> 
>      L41: static int futex(volatile int *addr, int futex_op, int op_arg)
>      L42: {
>          nit - the '{' on L42 should be at the end of L41 (with a space in front 
> of it).
> 
>      L47:   assert(_futex_barrier == 0, "Already armed");
>          Please consider:
>            assert(_futex_barrier == 0, "Should not be already armed: "
>                   "_futex_barrier=%d", _futex_barrier);
> 
>      L53:   assert(_futex_barrier != 0, "Not armed");
>          Please consider:
>            assert(_futex_barrier != 0, "Should be armed/non-zero.");

Fixed all above!

> 
>      L58:   guarantee_with_errno(s > -1, "futex FUTEX_WAKE");
>          Please consider:
>             guarantee_with_errno(s > -1, "futex FUTEX_WAKE failed: s=%d", s);
> 
>      L72:     guarantee_with_errno((s == 0) ||
>      L73:                          (s == -1 && errno == EAGAIN) ||
>      L74:                          (s == -1 && errno == EINTR),
>      L75:                          "futex FUTEX_WAIT");
>          Please consider:
>            "futex FUTEX_WAIT failed: s=%d", s);

This would require a macro change and 's' is of no interest since we know it's 
-1. So we have 3 options here:
- Skip 's' since we know it's -1, as now in v6.
- Hardcode the text to -1 since we know it's -1.
- Fix macro to take one more parameter.

> 
>      L76:     // Return value 0: woken up, but re-check in case of spurious wakeup
>      L78:     // Error EAGAIN: we are already disarmed and so will pass the check
>          Please add a period to the end of these two lines.
> 
> src/hotspot/share/utilities/waitBarrier.hpp
>      I like the new header comment.
> 
>      L102:   // Returns implementation type.
>          "type" or "description"?
> 
>      L109-111: indent needs two more spaces. >
> src/hotspot/share/utilities/waitBarrier_generic.hpp
>      L31: // Except for the barrier tag itself, it uses two counters to keep the 
> semaphore
>          Perhaps:
> 
>            // In addition to the barrier tag, it uses two counters to keep the 
> semaphore
> 
> src/hotspot/share/utilities/waitBarrier_generic.cpp
>      L61:   // Loads of _barrier_threads/_waiters must not float above disarm 
> store.
>      L62:   OrderAccess::fence();
>          Since you are using fence() here, you should also say that "and disarm 
> store
>          must not float below."
> 
>      L73:   OrderAccess::fence();
>          Missing your rationale comment for this fence().
> 
>      L79:     OrderAccess::fence();
>          I was expecting this one to be OrderAccess::loadload().
> 
>          Also, missing your rationale comment for this OrderAccess use.

Fixed all above!

> 
> test/hotspot/gtest/utilities/test_waitBarrier.cpp
>      L59:       OrderAccess::storeload(); // Loads in WB must not float up.
>          Not sure why this storeload() is here instead of being in wait().
>          Seems like wait() should do the "right thing".

I choose only to provide trailing fence for all operation for consistency.
In safepoint patch we already have correct ordering from other operations in 3 
out of 4 places, e.g. only one place needs an explicit leading fence.

> 
> Thumbs up on the code! I don't need to see a new webrev if you choose
> to fix the minor things I pointed out above.

Including changes for David comments:
Full: http://cr.openjdk.java.net/~rehn/8214271/6/full/webrev/
Inc : http://cr.openjdk.java.net/~rehn/8214271/6/inc/webrev/

Thanks, Robbin

> 
> Dan
> 
> 
>>
>> gtest passes thousands of loops locally and hundreds in mach5.
>>
>> Thanks, Robbin
>>
>>>
>>> Thanks,
>>> David
>>>
>>>>>
>>>>> s/Implementation/Implementations/
>>>>
>>>> Fixed
>>>>
>>>>>
>>>>> The fourth line is no longer needed.
>>>>
>>>> Above is the reason I would like to keep the fourth line, since only if you 
>>>> call
>>>> both disarm() and wake() you have that guarantee that waiter threads will
>>>> return.
>>>>
>>>> Thanks, Robbin
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>
>>>>>> Inc:
>>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/inc/webrev/
>>>>>>
>>>>>> Full:
>>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/full/webrev/
>>>>>>
>>>>>> /Robbin
>>>>>>
>>>>>>>
>>>>>>> Otherwise this all looks good!
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>
>>>>>>>> Full:
>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/3/full/webrev/
>>>>>>>>
>>>>>>>> Thanks, Robbin
>>>>>>>>
>>>>>>>> On 11/23/18 5:55 PM, Robbin Ehn wrote:
>>>>>>>>> Forgot RFR in subject.
>>>>>>>>>
>>>>>>>>> /Robbin
>>>>>>>>>
>>>>>>>>> On 2018-11-23 17:51, Robbin Ehn wrote:
>>>>>>>>>> Hi all, please review.
>>>>>>>>>>
>>>>>>>>>> When a safepoint is ended we need a way to get back to 100% 
>>>>>>>>>> utilization as fast
>>>>>>>>>> as possible. 100% utilization means no idle cpu in the system if there 
>>>>>>>>>> is a
>>>>>>>>>> JavaThread that could be executed. The traditional ways to wake many, 
>>>>>>>>>> e.g.
>>>>>>>>>> semaphore, pthread_cond, is not implemented with a single syscall 
>>>>>>>>>> instead they
>>>>>>>>>> typical do one syscall per thread to wake.
>>>>>>>>>>
>>>>>>>>>> This change-set contains that primitive, the WaitBarrier, and a gtest 
>>>>>>>>>> for it.
>>>>>>>>>> No actual users, which is in coming patches.
>>>>>>>>>>
>>>>>>>>>> The WaitBarrier solves by doing a cooperative semaphore posting, 
>>>>>>>>>> threads woken
>>>>>>>>>> will also post. On Linux we can instead directly use a futex and with one
>>>>>>>>>> syscall wake all. Depending on how many threads and cpus the 
>>>>>>>>>> performance vary,
>>>>>>>>>> but a good utilization of the machine, just on the edge of saturated, 
>>>>>>>>>> the time to reach 100% utilization is around 3 times faster with the 
>>>>>>>>>> WaitBarrier (where futex is faster than semaphore).
>>>>>>>>>>
>>>>>>>>>> Webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/webrev/
>>>>>>>>>>
>>>>>>>>>> CR:
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214271
>>>>>>>>>>
>>>>>>>>>> Passes 100 iterations of gtest on our platforms, both fastdebug and 
>>>>>>>>>> release.
>>>>>>>>>> And have been stable when used in safepoints (t1-8) (coming patches).
>>>>>>>>>>
>>>>>>>>>> Thanks, Robbin
> 


More information about the hotspot-dev mailing list