RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection

Kim Barrett kim.barrett at oracle.com
Thu Jul 5 05:15:44 UTC 2018


> On Jun 27, 2018, at 2:39 PM, Zhengyu Gu <zgu at redhat.com> wrote:
> 
> Hi,
> 
> Please review this small enhancement based on paper [1], that keeps the last successfully stolen queue as one of best-of-2 candidates for work stealing.
> 
> Based on experiments done by Thomas Schatzl and myself, it shows positive impacts on task termination and average pause time.
> 
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8205921
> Webrev: http://cr.openjdk.java.net/~zgu/8205921/webrev.00/index.html
> 
> 
> Test:
>  hotspot_gc on Linux 64 (fastdebug and release)
> 
> 
> [1] Characterizing and Optimizing Hotspot Parallel Garbage
>    Collection on Multicore Systems
>    http://ranger.uta.edu/~jrao/papers/EuroSys18.pdf
> 
> Thanks,
> 
> -Zhengyu

Once set, _last_stolen_queues entries are never invalidated.  So we
may as well initialize each entry to (queue_num + 1) mod num_queues.
Then get rid of the is_valid test (and the whole notion of validity)
and the random selection of k1, which in the webrev change is used
only once per queue_num.
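
As a rough illustration of that initialization, here is a minimal
sketch.  The function name initial_last_stolen and the use of
std::vector are assumptions made for the example; this is not the
webrev code.

```cpp
#include <vector>

// Hypothetical helper: give every queue a valid starting "last stolen"
// id, namely its right-hand neighbour, so no validity check is needed.
// Note that with a single queue the entry is the queue itself.
std::vector<unsigned> initial_last_stolen(unsigned num_queues) {
  std::vector<unsigned> ids(num_queues);
  for (unsigned queue_num = 0; queue_num < num_queues; ++queue_num) {
    ids[queue_num] = (queue_num + 1) % num_queues;
  }
  return ids;
}
```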

But I think that might not be desirable.  The webrev change's behavior
is to always use the queue chosen for the last steal attempt as one of
the two, even if the last steal attempt failed.  And because the
choice of which of the two to try next prefers that one when they are
both empty, we may be reduced to searching with only one random choice
for a while, even though the one we keep using has repeatedly failed
to yield a result.

An alternative that might be better is to reset the associated
last_stolen id to invalid whenever a pop_global fails.  This will
revert to 2 random choices until we find (at least) one with something
we can steal.  Actually, it seems the referenced paper does something
similar, and the webrev code doesn't match the referenced paper.
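
The invalidate-on-failure alternative could look roughly like the
sketch below.  It uses toy std::deque queues in place of the real task
queues, and ToyQueueSet, kInvalidId, random_victim and the fixed RNG
seed are all hypothetical names and choices for illustration only.
The tie-break keeps the webrev's preference for the remembered queue
when both candidates are the same size.

```cpp
#include <deque>
#include <random>
#include <vector>

constexpr unsigned kInvalidId = ~0u;  // hypothetical "no hint" marker

// Toy stand-in for a task queue set; not the HotSpot implementation.
struct ToyQueueSet {
  std::vector<std::deque<int>> queues;
  std::vector<unsigned> last_stolen;  // per-thief hint, or kInvalidId
  std::mt19937 rng{12345};

  explicit ToyQueueSet(unsigned n) : queues(n), last_stolen(n, kInvalidId) {}

  // Random queue id different from both 'self' and 'other'.
  // Requires at least one such queue to exist.
  unsigned random_victim(unsigned self, unsigned other) {
    std::uniform_int_distribution<unsigned> dist(
        0, static_cast<unsigned>(queues.size()) - 1);
    unsigned k;
    do { k = dist(rng); } while (k == self || k == other);
    return k;
  }

  bool pop_global(unsigned k, int& task) {
    if (queues[k].empty()) return false;
    task = queues[k].front();
    queues[k].pop_front();
    return true;
  }

  bool steal_best_of_2(unsigned self, int& task) {
    const unsigned n = static_cast<unsigned>(queues.size());
    if (n < 2) return false;
    unsigned k;
    if (n == 2) {
      k = 1 - self;  // only one possible victim
    } else {
      unsigned k1 = last_stolen[self];
      if (k1 == kInvalidId) k1 = random_victim(self, kInvalidId);
      unsigned k2 = random_victim(self, k1);
      // Prefer the larger queue; ties go to k1.
      k = queues[k1].size() >= queues[k2].size() ? k1 : k2;
    }
    if (pop_global(k, task)) {
      last_stolen[self] = k;       // remember the successful victim
      return true;
    }
    last_stolen[self] = kInvalidId;  // failed: back to 2 random choices
    return false;
  }
};
```

With this shape, a thief whose remembered queue runs dry loses the
hint after one failed pop_global and draws both candidates at random
on its next attempt.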

Why do the last_queue array entries need to be padded? Why not just
add a _last_stolen_queue member to TaskQueueSuper?

I think it is a pre-existing bug that GenericTaskQueueSet::_n is of
type uint, but the associated constructor argument is of type int.  I
think the constructor is wrong in this regard.
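
To show why such a mismatch matters in general, here is a toy class
(ToySet is a hypothetical name, not the HotSpot class) with the same
shape: an int constructor argument stored in an unsigned member.  A
negative argument is silently converted to a very large count.
Whether any actual caller passes a negative value is not something
this sketch claims.

```cpp
// Toy illustration only: int parameter, unsigned member.
struct ToySet {
  unsigned _n;
  explicit ToySet(int n) : _n(n) {}  // implicit int -> unsigned conversion
};

// Same class with the parameter type matching the member.
struct ToySetFixed {
  unsigned _n;
  explicit ToySetFixed(unsigned n) : _n(n) {}
};
```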



