RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection
Thomas Schatzl
thomas.schatzl at oracle.com
Thu Jul 5 19:47:49 UTC 2018
Hi,
On Thu, 2018-07-05 at 14:49 -0400, Kim Barrett wrote:
> > On Jul 5, 2018, at 12:08 PM, Zhengyu Gu <zgu at redhat.com> wrote:
> >
> > Hi Kim and Thomas,
> >
> > Thanks for reviewing.
> >
> > On 07/05/2018 03:54 AM, Thomas Schatzl wrote:
> > > Hi,
> > > On Thu, 2018-07-05 at 01:15 -0400, Kim Barrett wrote:
> > > > > On Jun 27, 2018, at 2:39 PM, Zhengyu Gu <zgu at redhat.com>
> > > > > wrote:
> > > >
> > > > […]
> > > > An alternative that might be better is, whenever a pop_global
> > > > fails, reset the associated last_stolen id to invalid. This
> > > > will revert to 2 random choices until we find (at least) one
> > > > with something we can steal. Actually, it seems the referenced
> > > > paper does something similar, and the webrev code doesn't match
> > > > the referenced paper.
> > >
> > > That may explain why my perf results are different to the paper
> > > that I was planning to investigate :) Nice
> > > find.
> >
> > Sorry, my bad.
I have been looking into this a bit and finally (with some patch from
me that fixes the changes too) and some additional probes (using the
TASKQUEUE_STATS "infrastructure") I am starting to get meaningful
results.
More about that later.
In any case the technique looks like a nice improvement at least in
steal attempts and steal/steal attempts ratio on some bigger tests, but
I need to update my code again it seems :)
I can add the changes to the TASKQUEUE_STATS logging later btw.
> >
> > […]
> > Updated webrev:
> >
> > http://cr.openjdk.java.net/~zgu/8205921/webrev.01/index.html
>
> src/hotspot/share/gc/shared/taskqueue.inline.hpp
> 255 if (sz2 > sz1) {
> 256 sel_k = k2;
> 257 suc = _queues[k2]->pop_global(t);
> 258 } else {
> 259 sel_k = k1;
> 260 suc = _queues[k1]->pop_global(t);
> 261 }
>
> The paper avoids the steal attempt when both potential victims have a
> size of zero, e.g. insert another clause:
>
> } else if (sz1 == 0) {
> sel_k = k1; // Might be needed to avoid uninitialized variable
> warnings?
> suc = false;
> } else {
> ...
>
> There is a race condition between obtaining the size and checking it
> here, but I don't think that's important. The point is to avoid an
> expensive steal attempt when it is very likely to fail.
>
There is another bug in the existing code: current Hotspot collectors
all reuse a single task queue set. So since the queue id's are only
initialized once at startup, there will be some initial use of a
suboptimal queue.
I assume Shenandoah does not need a reset because it creates new
taskqueuesets whenever it needs them (and frees them afterwards).
The current design of passing stealing-local information (the seed)
makes it clear that the owner of that variable needs to initialize it.
At this time I have no preference on Kim's suggestion to put these
variables into the queue if you asked me. I would tend to encapsulate
the mechanism as much as possible though.
I do think if Shenandoah wants to put this information into the
GCThreadLocalBlock (for what reason?) it is probably most flexible to
pass these things as kind of context to the steal_best_of_2() method. I
do not think it is desirable to have a second copy of the taskqueue*
code around; I can't see how else one implementation can use the
TaskQueueSet local queue ids and the other use the same information
from somewhere else right now.
Thanks,
Thomas
More information about the hotspot-gc-dev
mailing list