RFR (S): Single-element buffer in thread-local taskqueues
Roman Kennke
rkennke at redhat.com
Tue Nov 8 23:26:57 UTC 2016
Am Dienstag, den 08.11.2016, 23:01 +0100 schrieb Aleksey Shipilev:
> Hi,
>
> In our work-stealing strategy, we are falling victim to yet another
> non-optimality in division strategy. The best way to do parallel work
> is
> to fork out the tasks, submit everything but one task to the queue,
> recursively execute the remaining task. Pushing the last task is
> redundant, because we will pop it out on the next cycle. (Actually,
> as
> array stride example taught us, this can even set us up for thrashing
> the queue, if somebody stole the task under our feet).
>
> Now, rewriting the closure-heavy code to this pattern would be not
> easy,
> but we can solve this in TaskQueues themselves, enter here:
> http://cr.openjdk.java.net/~shade/shenandoah/taskqueue-buff/webrev.
> 01
> (includes the clean_queues fix, to be pushed separately)
>
> It does the single-entry buffer before the queue, which acts almost
> like
> the local variable for us to pull from. It does not make sense to
> buffer
> more than one task, because it may impede work stealing. Buffering a
> single task is completely safe, because we are definitely are going
> back
> for it, and nobody else should steal.
>
> Testing: hotspot_gc_shenandoah, jcstress-all (quick), microbenchmarks
Great stuff!
> Trims down the mark times for full GC (again, conc GC mark would also
> benefit from this, once we solve other bottlenecks there).
Do you happen to know what those bottlenecks are? Improving full-gc is
great, but ideally Shenandoah should not go there :-)
> Good to go?
Should we maybe put this code into shenandoahTaskQueue.hpp and friends?
Reduce shared code changes, etc ;-)
Other than that: go Alexey, go! :-)
Roman
More information about the shenandoah-dev
mailing list