RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection

Zhengyu Gu zgu at redhat.com
Thu Jul 5 16:08:55 UTC 2018


Hi Kim and Thomas,

Thanks for reviewing.

On 07/05/2018 03:54 AM, Thomas Schatzl wrote:
> Hi,
> 
> On Thu, 2018-07-05 at 01:15 -0400, Kim Barrett wrote:
>>> On Jun 27, 2018, at 2:39 PM, Zhengyu Gu <zgu at redhat.com> wrote:
>>>
>>> Hi,
>>>
>>> Please review this small enhancement base on paper [1], that keeps
>>> the last successfully stolen queue as one of best-of-2 candidates
>>> for work stealing.
>>>
>>> Based on experiments done by Thomas Schatzl and myself, it shows
>>> positive impacts on task termination and average pause time.
>>>
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205921
>>> Webrev: http://cr.openjdk.java.net/~zgu/8205921/webrev.00/index.htm
>>> l
>>>
>>>
>>> Test:
>>>   hotspot_gc on Linux 64 (fastdebug and release)
>>>
>>>
>>> [1] Characterizing and Optimizing Hotspot Parallel Garbage
>>>     Collection on Multicore Systems
>>>     http://ranger.uta.edu/~jrao/papers/EuroSys18.pdf
>>>
>>> Thanks,
>>>
>>> -Zhengyu
>>
>> Once set, _last_stolen_queues entries are never invalidated.  So we
>> may as well initialize the entries to queue_num+1 mod num_queues.
>> Then get rid of the is_valid test (and the whole notion of validity)
>> and the (only used once per queue_num in the webrev change) random
>> selection of k1.
>>
>> But I think that might not be desirable.  The webrev change's
>> behavior is to always use the queue chosen for the last steal attempt
>> as one of the two, even if the last steal attempt failed.  And
>> because the choice of which of the two to try next prefers that one
>> when they are both empty, we may be reduced to searching with only
>> one random choice for a while, even though the one we keep using has
>> repeatedly failed to yield a result.
>>
>> An alternative that might be better is, whenever a pop_global fails,
>> reset the associated last_stolen id to invalid.  This will revert to
>> 2 random choices until we find (at least) one with something we can
>> steal. Actually, it seems the referenced paper does something
>> similar, and the webrev code doesn't match the referenced paper.
> 
> That may explain why my perf results are different to the paper that I
> was planning to investigate :) Nice find.

Sorry, my bad.

> 
>> Why do the last_queue array entries need to be padded? Why not just
>> add a _last_stolen_queue member to TaskQueueSuper?
> 
> The _last_stolen_queue is associated to a (stealing) thread, not the
> queue. Multiple threads might have the same queue as current steal
> target.
> 
> One other option I discussed is instead of this array of PaddedQueueId
> (which I would rename as TaskQueueThreadLocal or
> TaskQueueStealLocals/Context because I can see adding more in the
> future) would be passing this around like the seed parameter to
> steal_best_of_2 (and actually put the seed parameter in there too).
> 
> It's a bit weird to me to pass two different kinds of thread locals
> related to work stealing two different ways.

I would prefer to pass down TaskQueueStealContext, just like seed, to 
avoid this padded queue id array. However, it means that we have to 
update all call sites, which I am not comfortable to do at this time. 
Could we make this a future item?

Updated webrev:

http://cr.openjdk.java.net/~zgu/8205921/webrev.01/index.html

Thanks,

-Zhengyu


> 
> The padding is to avoid potential false sharing issues as otherwise
> last_stolen_id's of different threads end up on the same cache line.
> And the writes of different threads to disjoint locations would likely
> invalidate the cache line all the time. Just to avoid a potential
> performance issue here.
> 
>> I think it is a pre-existing bug that GenericTaskQueueSet::_n is of
>> type uint, but the associated constructor argument is of type int.  I
>> think the constructor is wrong in this regard.
> 
> 
> - please use CamelCase for the INVALID_QUEUE_ID constant.
> 
> - there are some superfluous spaces at the end-of-line, but that would
> be flushed out before pushing anyway.
> 
> Thanks,
>    Thomas
> 



More information about the hotspot-gc-dev mailing list