RFR (S): Chunked array processing should first push the continuation
Aleksey Shipilev
shade at redhat.com
Fri Oct 28 09:18:01 UTC 2016
Hi,
This is one of those "LOL" performance bugs. If you profile the
ArrayFragger test [1], which eventually scans a large array, you will
notice that TaskQueues are the hotspots, with lots of stealing. If you
wonder why, this is why: in chunked processing we *first* process our
own chunk, and only then let others know we have more work (and of
course, the next thing you know, we are pulling that work out from
under their feet).
The solution is to first fork out the continuation, and then process our
own chunk in solitude:
http://cr.openjdk.java.net/~shade/shenandoah/concmark-cont-first/webrev.01/
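To illustrate the ordering only, here is a minimal Java sketch. It is not
the HotSpot C++ code from the webrev: the names ChunkedScan, Chunk and
CHUNK_SIZE are hypothetical, and a single-threaded ArrayDeque stands in for
the work-stealing TaskQueue. The point it shows is that the continuation is
pushed before the current chunk is scanned, so in a real multi-threaded
setup other workers could pick it up while we are busy.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ChunkedScan {
    // Hypothetical task: scan array elements in [from, to).
    static final class Chunk {
        final int from;
        final int to;
        Chunk(int from, int to) { this.from = from; this.to = to; }
    }

    // Hypothetical chunk length, chosen small for readability.
    static final int CHUNK_SIZE = 4;

    // Stand-in for a work-stealing task queue (assumption: the real
    // queue is concurrent and can be stolen from by other workers).
    static final Deque<Chunk> queue = new ArrayDeque<>();

    // Records the order of events so the ordering is observable.
    static final List<String> log = new ArrayList<>();

    static void process(Chunk task) {
        int end = Math.min(task.from + CHUNK_SIZE, task.to);
        if (end < task.to) {
            // First publish the continuation, so others can take it...
            queue.push(new Chunk(end, task.to));
            log.add("push " + end + ".." + task.to);
        }
        // ...and only then process our own chunk.
        log.add("scan " + task.from + ".." + end);
    }

    // Drains the queue for an array of the given length and returns
    // the recorded event order.
    static List<String> run(int length) {
        log.clear();
        queue.clear();
        queue.push(new Chunk(0, length));
        while (!queue.isEmpty()) {
            process(queue.pop());
        }
        return new ArrayList<>(log);
    }
}
```

With the pre-patch ordering, the "scan" event would come before the "push"
event for each chunk, which means nobody else can see the remaining work
until the current chunk is already done.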
This improves the stress test in question considerably (about 36% lower
average time per op):
Benchmark          (ldsMB)  (objSize)  Mode  Cnt    Score    Error  Units
# Before
ArrayFragger.test      500        100  avgt  100  903.449 ± 23.912  ns/op
# After
ArrayFragger.test      500        100  avgt  100  581.849 ± 53.288  ns/op
Testing: hotspot_gc_shenandoah
Thanks,
-Aleksey
[1]
http://cr.openjdk.java.net/~shade/shenandoah/shenandoah-gc-bench/src/main/java/org/openjdk/shenandoah/fragger/ArrayFragger.java