RFR: Optimizing best of two work stealing queue selection

Zhengyu Gu zgu at redhat.com
Wed Jun 27 12:39:32 UTC 2018



On 06/27/2018 08:04 AM, Thomas Schatzl wrote:
> Hi,
> 
> On Mon, 2018-06-25 at 09:09 -0400, Zhengyu Gu wrote:
>> Hi,
>>
>> I did a few runs of the Compiler.sunflow benchmark over the past
>> weekend, with different task terminators and best-of-2 queue selection
>> algorithms. One factor that is not reflected in the flags is
>> ShenandoahParallelClaimableQueueSet, which I believe has an impact
>> on termination.
>>
>> It looks like the optimization always has a positive impact, and
>> the results were pretty consistent. Are the results convincing enough
>> to propose this upstream?
>>
> 
> :) To me, yes.

Cool, filed https://bugs.openjdk.java.net/browse/JDK-8205921

Thanks,

-Zhengyu

> 
> Note that this change may not only improve task termination but also
> evacuation in general as the code only goes into the termination
> protocol if it can't steal. So the numbers you present here do not
> necessarily show the whole picture.
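
To make the steal-then-terminate point concrete, here is a minimal sketch of best-of-2 victim selection, not the HotSpot code. It assumes the optimization amounts to remembering the last queue a worker stole from successfully and reusing it as one of the two candidates; that detail is not spelled out in this thread. `Worker`, `last_victim`, `pick_victim` and `steal` are hypothetical names. A `None` result from `steal` is the point where a real worker would enter the termination protocol.

```python
import random
from collections import deque


class Worker:
    """One GC worker with its own task queue (illustrative only)."""

    def __init__(self, worker_id):
        self.id = worker_id
        self.queue = deque()
        self.last_victim = None  # hypothetical: last worker we stole from


def pick_victim(worker, workers, rng, optimized=True):
    """Choose between two candidate victims, preferring the longer queue."""
    others = [w for w in workers if w is not worker]
    if not others:
        return None
    if len(others) == 1:
        return others[0]
    if optimized and worker.last_victim is not None:
        # Assumed optimization: reuse the last successful victim as one
        # candidate and draw only the second candidate at random.
        first = worker.last_victim
        second = rng.choice([w for w in others if w is not first])
    else:
        # Plain best-of-2: two distinct random candidates.
        first, second = rng.sample(others, 2)
    return first if len(first.queue) >= len(second.queue) else second


def steal(worker, workers, rng, optimized=True):
    """Try to steal one task; None means the caller would go on to termination."""
    victim = pick_victim(worker, workers, rng, optimized)
    if victim is None or not victim.queue:
        worker.last_victim = None  # forget a victim that has run dry
        return None
    worker.last_victim = victim
    return victim.queue.popleft()  # thieves take from the end the owner does not use
```

With the remembered victim, a worker that found a loaded queue keeps returning to it until it is empty, so fewer steal attempts fail and fewer workers drop into termination early.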
> 
> I managed to run a few tests on the benchmarks I typically use that
> are known to be sensitive to changes. The worst result I had was that
> pause times did not budge at all (specjbb2005), but I also got a nice,
> significant decrease in average pause time in some cases (for the [2]
> patch, e.g. on BigRamTester [5]).
> 
> I was not able to reproduce the results in the paper for either
> lusearch or sunflow, which seemed to show the best improvements there.
> Parallel seems to fare better, at least going by the numbers the change
> prints, but I have not seen any impact on pause times or throughput as
> the paper shows.
> (OTOH pause times are in the range of a few ms there anyway; mostly
> referring to figure 10a here)
> 
> My runs were with
> 
> -Xmx<heapsize> -Xms<heapsize> -XX:ParallelGCThreads=10 -XX:+<collector>
> -jar dacapo-9.12-bach.jar -n 20 <benchmark>
> 
> and threads and cpu bound to a single node on almost the same processor
> as in the paper; jdk11 tip.
> 
> Currently preparing a build for our perf regression suite, but running
> that will take some time.
> 
> I would however prefer if the change incorporated the earlier
> suggestions made in this thread before upstreaming; I would probably
> ask about them anyway. ;)
> 
>> * Shenandoah Repo with patch [1]:
>>                                             ShenandoahGC
>>                                           Avg.      Worst
>> +ShenandoahOWST/+OptimizedBestOfTwo     0.145      16
>> +ShenandoahOWST/-OptimizedBestOfTwo     1.552      269
>> -ShenandoahOWST/+OptimizedBestOfTwo     0.341      52
>> -ShenandoahOWST/-OptimizedBestOfTwo     2.327      384
>>
>>                                 G1
>>                          Avg.       Worst
>> +OptimizedBestOfTwo    23.049     335
>> -OptimizedBestOfTwo    43.632     930
>>
>>
>>
>> * JDK repo with patch [2]
>>                                 G1
>>                          Avg.       Worst
>> +OptimizedBestOfTwo    17.496     469
>> -OptimizedBestOfTwo    26.670     1062
>>
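
For a quick read of the tables above, the following sketch computes how many times larger each figure is with the optimization off than with it on. The numbers are copied from the tables; the dictionary labels and the function name `reduction_factors` are made up for illustration, and since the thread does not state units, only unitless ratios are computed.

```python
# (avg, worst) pairs copied from the tables in this thread.
# "on" is +OptimizedBestOfTwo, "off" is -OptimizedBestOfTwo.
RESULTS = {
    "Shenandoah, +ShenandoahOWST": {"on": (0.145, 16), "off": (1.552, 269)},
    "Shenandoah, -ShenandoahOWST": {"on": (0.341, 52), "off": (2.327, 384)},
    "G1, Shenandoah repo [1]": {"on": (23.049, 335), "off": (43.632, 930)},
    "G1, JDK repo [2]": {"on": (17.496, 469), "off": (26.670, 1062)},
}


def reduction_factors(results):
    """Return {config: (avg_factor, worst_factor)} with factor = off / on."""
    factors = {}
    for name, row in results.items():
        on_avg, on_worst = row["on"]
        off_avg, off_worst = row["off"]
        factors[name] = (off_avg / on_avg, off_worst / on_worst)
    return factors


if __name__ == "__main__":
    for name, (avg, worst) in reduction_factors(RESULTS).items():
        print(f"{name:30s} avg x{avg:.2f}  worst x{worst:.2f}")
```

A factor above 1 in every row matches the observation that the optimization helped in all configurations measured here.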
>>
>> [3] and [4] are full logs.
>>
>> Thanks,
>>
>> -Zhengyu
>>
>>
>> [1] http://cr.openjdk.java.net/~zgu/tq_terminator/shenandoah.patch
>> [2] http://cr.openjdk.java.net/~zgu/tq_terminator/jdk.patch
>> [3] http://cr.openjdk.java.net/~zgu/tq_terminator/shenandoah.log
>> [4] http://cr.openjdk.java.net/~zgu/tq_terminator/jdk.log
> 
> Thomas
> 
> [5] https://bugs.openjdk.java.net/browse/JDK-8152438
> 

