Performance regression with IntStream.parallel.sum?
Paul Sandoz
paul.sandoz at oracle.com
Mon Oct 28 07:22:25 PDT 2013
Hi Sergey,
On Oct 28, 2013, at 1:46 PM, Sergey Kuksenko <sergey.kuksenko at oracle.com> wrote:
> Hi All,
> The reason for this behavior is TieredCompilation, which was turned off by
> default in b92 and turned on in b112.
> Here is some data, under jmh (average time per op; usecs/op):
>
>          b92_NonTiered  b92_Tiered  b112_NonTiered  b112_Tiered
> 1M_seq            1568        5294            1561        13347
> 1M_par             869       12770             802         7044
> 5M_seq            7673        7634            7630         7570
> 5M_par            4042       19789            3670        33147
>
> Moreover, TieredCompilation causes a huge run-to-run variance here,
> especially for parallel cases.
> You may find my sources and jar files here:
> http://cr.openjdk.java.net/~skuksenko/intstream/tiered/
>
Thanks. The src directory is empty.
Here are the results for my test, adjusted based on your off-list advice, using these options:
Options opts = new OptionsBuilder()
    .include(".*StreamSumTest.*")
    .jvmArgs("-Dbenchmark.n=" + n)
    .mode(Mode.AverageTime)
    .timeUnit(TimeUnit.NANOSECONDS)
    .warmupIterations(10)
    .warmupTime(TimeValue.milliseconds(1000))
    .measurementIterations(10)
    .measurementTime(TimeValue.milliseconds(1000))
    .forks(4)
    .build();
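
For readers without the benchmark source, here is a minimal sketch of what the three measured methods might look like. It is an assumption, not the actual StreamSumTest: the class name StreamSumSketch, the method names, and the summed range 0..n-1 are all hypothetical, and the JMH annotations are left out so that it runs on its own. Only the benchmark.n system property is taken from the options above.

```java
import java.util.stream.IntStream;

// Hypothetical stand-in for the benchmark class; the real StreamSumTest
// source is not part of this message.
class StreamSumSketch {

    // Plain loop baseline (assumed shape of testSeq).
    static int sumLoop(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i; // int arithmetic wraps on overflow
        }
        return sum;
    }

    // Sequential stream (assumed shape of testStreamSeq).
    static int sumStreamSeq(int n) {
        return IntStream.range(0, n).sum();
    }

    // Parallel stream (assumed shape of testStreamPar).
    static int sumStreamPar(int n) {
        return IntStream.range(0, n).parallel().sum();
    }

    public static void main(String[] args) {
        // Reads the same property that the jvmArgs above set; defaults to 10^5.
        int n = Integer.getInteger("benchmark.n", 100_000);
        System.out.println("loop       = " + sumLoop(n));
        System.out.println("stream seq = " + sumStreamSeq(n));
        System.out.println("stream par = " + sumStreamPar(n));
    }
}
```

All three variants return the same int, since wrapping addition is associative, so the sketch is only useful as a sanity check of the workload shape and not as a timing harness.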

-XX:-TieredCompilation, N = 100_000
Benchmark                      Mode  Thr  Cnt  Sec         Mean  Mean error  Units
l.StreamSumTest.testSeq        avgt    1   40    1    43509.309     114.487  nsec/op
l.StreamSumTest.testStreamPar  avgt    1   40    1   108882.854   29918.402  nsec/op
l.StreamSumTest.testStreamSeq  avgt    1   40    1   499711.104    1382.872  nsec/op

-XX:-TieredCompilation, N = 1_000_000
Benchmark                      Mode  Thr  Cnt  Sec         Mean  Mean error  Units
l.StreamSumTest.testSeq        avgt    1   40    1   443011.329     902.065  nsec/op
l.StreamSumTest.testStreamPar  avgt    1   40    1  1565053.123   17028.800  nsec/op
l.StreamSumTest.testStreamSeq  avgt    1   40    1   467933.044    1074.794  nsec/op

-XX:+TieredCompilation, N = 100_000
Benchmark                      Mode  Thr  Cnt  Sec         Mean  Mean error  Units
l.StreamSumTest.testSeq        avgt    1   40    1    43533.608      82.619  nsec/op
l.StreamSumTest.testStreamPar  avgt    1   40    1   165477.693    7557.480  nsec/op
l.StreamSumTest.testStreamSeq  avgt    1   40    1   498996.086    1013.645  nsec/op

-XX:+TieredCompilation, N = 1_000_000
Benchmark                      Mode  Thr  Cnt  Sec         Mean  Mean error  Units
l.StreamSumTest.testSeq        avgt    1   40    1   443919.982     969.729  nsec/op
l.StreamSumTest.testStreamPar  avgt    1   40    1  1567403.736   18983.551  nsec/op
l.StreamSumTest.testStreamSeq  avgt    1   40    1  1595995.987  847523.680  nsec/op

So I am still observing a drop in parallel performance going from N=10^5 to N=10^6.
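
To put a number on that drop, the testStreamPar means can be divided by N to get a cost per element: with -XX:-TieredCompilation this is roughly 1.09 ns per element at N=10^5 and 1.57 ns at N=10^6. The sketch below just does that arithmetic on the reported means; the class and method names are hypothetical, and it says nothing about the cause.

```java
// Hypothetical helper: converts a reported mean (nsec/op) into nsec per element.
class PerElementCost {

    static double nsPerElement(double meanNsPerOp, int n) {
        return meanNsPerOp / n;
    }

    public static void main(String[] args) {
        // testStreamPar means with -XX:-TieredCompilation, copied from the results above.
        double small = nsPerElement(108882.854, 100_000);
        double large = nsPerElement(1565053.123, 1_000_000);

        System.out.printf("N=10^5: %.3f ns/element%n", small);
        System.out.printf("N=10^6: %.3f ns/element%n", large);
        // A ratio above 1 means the larger input costs more per element.
        System.out.printf("ratio:  %.2f%n", large / small);
    }
}
```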
Paul.