RFR (S): Chunked array processing should first push the continuation

Fri Oct 28 10:02:11 UTC 2016

Awesome! Please push!

BTW: Can we collect all those little benchmarks into a proper suite?

Roman

On Friday, 28.10.2016 at 11:18 +0200, Aleksey Shipilev wrote:
> Hi,
> 
> This is one of those "LOL" performance bugs. If you profile the
> ArrayFragger test [1], which eventually scans a large array, you will
> notice that the TaskQueues are the hotspots, with lots of stealing.
> The reason: in chunked processing we *first* process our own chunk,
> and only then let others know that there is more work. By that time
> we are about to take that work ourselves, pulling it out from under
> their feet.
> 
> The solution is to first fork out the continuation, and then process
> our own chunk in solitude:
> 
> http://cr.openjdk.java.net/~shade/shenandoah/concmark-cont-first/webrev.01/
> 
> This improves the stress test in question considerably (the average
> time per operation drops by about 36%):
> 
> Benchmark          (ldsMB)  (objSize)  Mode  Cnt    Score     Error  Units
>
> # Before
> ArrayFragger.test      500        100  avgt  100  903.449 ±  23.912  ns/op
>
> # After
> ArrayFragger.test      500        100  avgt  100  581.849 ±  53.288  ns/op
> 
> 
> Testing: hotspot_gc_shenandoah
> 
> Thanks,
> -Aleksey
> 
> [1] http://cr.openjdk.java.net/~shade/shenandoah/shenandoah-gc-bench/src/main/java/org/openjdk/shenandoah/fragger/ArrayFragger.java
>
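
For readers who want to see the ordering change in isolation, here is a minimal sketch in Java. It is not the actual patch (that lives in the webrev above), and every name in it (ChunkedScan, Task, CHUNK, processOne, scanAll) is hypothetical. It assumes a plain single-threaded deque in place of the real work-stealing TaskQueues, and it only shows the order of operations: publish the continuation first, then scan our own chunk.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative only: a single-threaded stand-in for chunked array processing.
final class ChunkedScan {
    // Hypothetical chunk size; real marking code would choose its own stride.
    static final int CHUNK = 4;

    // A pending slice [from, to) of the array that still needs scanning.
    static final class Task {
        final int from;
        final int to;
        Task(int from, int to) { this.from = from; this.to = to; }
    }

    // Handles one task: publishes the remainder first, then scans its own chunk.
    static long processOne(int[] array, Task task, Deque<Task> queue, List<String> log) {
        int end = Math.min(task.from + CHUNK, task.to);
        if (end < task.to) {
            // Continuation first: the rest of the range becomes visible to
            // other workers before we spend any time on our own chunk.
            queue.push(new Task(end, task.to));
            log.add("publish " + end + ".." + task.to);
        }
        long sum = 0;
        for (int i = task.from; i < end; i++) {
            sum += array[i]; // stands in for the per-element work
        }
        log.add("scan " + task.from + ".." + end);
        return sum;
    }

    // Drains the queue until the whole array has been scanned.
    static long scanAll(int[] array, List<String> log) {
        Deque<Task> queue = new ArrayDeque<>();
        queue.push(new Task(0, array.length));
        long total = 0;
        Task t;
        while ((t = queue.poll()) != null) {
            total += processOne(array, t, queue, log);
        }
        return total;
    }
}
```

With a 10-element array the log starts with "publish 4..10" followed by "scan 0..4". In a multi-threaded setting that ordering is what lets another worker pick up 4..10 while the first chunk is still being scanned, instead of finding the queue empty and trying to steal.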