Blackholes: improve symmetry for different data types and values
Aleksey Shipilev
aleksey.shipilev at oracle.com
Wed Jul 17 04:44:29 PDT 2013
Hi,
I have an external bug report that our Blackhole exhibits asymmetric
performance when fed different values. The disassembly shows
weird things which seem hard to avoid with our current scheme for
blackholing primitives.
With a headline like that, I pushed:
http://hg.openjdk.java.net/code-tools/jmh/rev/8046f2c4fce7
...which now provides symmetrical performance across all the data
types and all the values, at the expense of a few additional cycles per
consume(). Note that we already have lots of overheads on that scale
(the most prominent one being the long "operations" counter incremented on
every iteration).
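For readers unfamiliar with the idea, here is a minimal sketch of how a
blackhole can consume a primitive so that the JIT cannot drop the
computation that produced it. This is a toy, not the JMH code: the names
ToyBlackhole and runImplicit and the volatile-compare trick are assumptions
chosen for illustration. The harness loop mimics the per-iteration
"operations" counter mentioned above.

```java
import java.util.function.IntSupplier;

// Toy illustration only; not the JMH Blackhole implementation.
public class ToyBlackhole {
    // Two volatile fields holding different values. Every read must hit
    // memory, so the JIT cannot constant-fold the comparison below.
    private volatile int i1 = 1;
    private volatile int i2 = 2;
    private volatile int sink;

    // The branch is never taken (v cannot equal both 1 and 2), but the
    // compiler cannot prove that, so it has to keep v alive.
    public void consume(int v) {
        if (v == i1 & v == i2) {
            sink = v;
        }
    }

    public int sink() {
        return sink;
    }

    // "Implicit" style: the harness consumes the benchmark's return value
    // and bumps a long counter on every iteration.
    public static long runImplicit(IntSupplier benchmark, ToyBlackhole bh, long iterations) {
        long operations = 0;
        for (long i = 0; i < iterations; i++) {
            bh.consume(benchmark.getAsInt());
            operations++;
        }
        return operations;
    }

    public static void main(String[] args) {
        ToyBlackhole bh = new ToyBlackhole();
        long ops = runImplicit(() -> 42, bh, 1_000_000L);
        System.out.println("operations = " + ops);
    }
}
```

The cost of consume() in such a scheme is two volatile reads plus a
compare, and different primitive types may compile to slightly different
code, which is one way an asymmetry like the one reported can creep in.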
This is what we had before on 2x8x2 Sandy Bridge, Solaris 10, JDK 8b84
(single-threaded runs only; it scales nearly perfectly):
> Benchmark                                    Mode  Thr  Cnt  Sec   Mean  Mean error  Units
> o.o.j.b.BlackholeBench.baseline              avgt    1    5    1  0.287       0.000  nsec/op
> o.o.j.b.BlackholeBench.explicit_testArray    avgt    1    5    1  3.050       0.001  nsec/op
> o.o.j.b.BlackholeBench.explicit_testBoolean  avgt    1    5    1  0.717       0.000  nsec/op
> o.o.j.b.BlackholeBench.explicit_testByte     avgt    1    5    1  0.716       0.000  nsec/op
> o.o.j.b.BlackholeBench.explicit_testChar     avgt    1    5    1  0.716       0.000  nsec/op
> o.o.j.b.BlackholeBench.explicit_testDouble   avgt    1    5    1  0.860       0.000  nsec/op
> o.o.j.b.BlackholeBench.explicit_testFloat    avgt    1    5    1  0.865       0.007  nsec/op
> o.o.j.b.BlackholeBench.explicit_testInt      avgt    1    5    1  0.716       0.000  nsec/op
> o.o.j.b.BlackholeBench.explicit_testLong     avgt    1    5    1  0.717       0.002  nsec/op
> o.o.j.b.BlackholeBench.explicit_testObject   avgt    1    5    1  3.050       0.001  nsec/op
> o.o.j.b.BlackholeBench.explicit_testShort    avgt    1    5    1  0.716       0.000  nsec/op
> o.o.j.b.BlackholeBench.implicit_testArray    avgt    1    5    1  3.053       0.001  nsec/op
> o.o.j.b.BlackholeBench.implicit_testBoolean  avgt    1    5    1  1.434       0.001  nsec/op
> o.o.j.b.BlackholeBench.implicit_testByte     avgt    1    5    1  0.716       0.000  nsec/op
> o.o.j.b.BlackholeBench.implicit_testChar     avgt    1    5    1  0.716       0.000  nsec/op
> o.o.j.b.BlackholeBench.implicit_testDouble   avgt    1    5    1  0.885       0.032  nsec/op
> o.o.j.b.BlackholeBench.implicit_testFloat    avgt    1    5    1  0.860       0.000  nsec/op
> o.o.j.b.BlackholeBench.implicit_testInt      avgt    1    5    1  0.717       0.001  nsec/op
> o.o.j.b.BlackholeBench.implicit_testLong     avgt    1    5    1  0.717       0.001  nsec/op
> o.o.j.b.BlackholeBench.implicit_testObject   avgt    1    5    1  3.052       0.001  nsec/op
> o.o.j.b.BlackholeBench.implicit_testShort    avgt    1    5    1  0.716       0.000  nsec/op
And this is what we get now:
> Benchmark                                           Mode  Thr  Cnt  Sec   Mean  Mean error  Units
> o.o.j.b.BlackholeBench.baseline                     avgt    1    5    1  0.287       0.000  nsec/op
> o.o.j.b.BlackholeBench.explicit_testArray           avgt    1    5    1  2.883       0.001  nsec/op
> o.o.j.b.BlackholeBench.explicit_testBoolean         avgt    1    5    1  2.882       0.011  nsec/op
> o.o.j.b.BlackholeBench.explicit_testByte            avgt    1    5    1  2.881       0.010  nsec/op
> o.o.j.b.BlackholeBench.explicit_testChar            avgt    1    5    1  2.882       0.010  nsec/op
> o.o.j.b.BlackholeBench.explicit_testDouble          avgt    1    5    1  2.881       0.009  nsec/op
> o.o.j.b.BlackholeBench.explicit_testFloat           avgt    1    5    1  2.883       0.002  nsec/op
> o.o.j.b.BlackholeBench.explicit_testInt             avgt    1    5    1  2.879       0.001  nsec/op
> o.o.j.b.BlackholeBench.explicit_testLong            avgt    1    5    1  2.880       0.001  nsec/op
> o.o.j.b.BlackholeBench.explicit_testObject          avgt    1    5    1  2.879       0.001  nsec/op
> o.o.j.b.BlackholeBench.explicit_testShort           avgt    1    5    1  2.877       0.009  nsec/op
> o.o.j.b.BlackholeBench.implicit_testArray           avgt    1    5    1  2.877       0.009  nsec/op
> o.o.j.b.BlackholeBench.implicit_testBoolean         avgt    1    5    1  2.877       0.010  nsec/op
> o.o.j.b.BlackholeBench.implicit_testBoolean_false   avgt    1    5    1  2.877       0.009  nsec/op
> o.o.j.b.BlackholeBench.implicit_testBoolean_falseF  avgt    1    5    1  2.877       0.009  nsec/op
> o.o.j.b.BlackholeBench.implicit_testBoolean_true    avgt    1    5    1  2.880       0.001  nsec/op
> o.o.j.b.BlackholeBench.implicit_testBoolean_trueF   avgt    1    5    1  2.879       0.001  nsec/op
> o.o.j.b.BlackholeBench.implicit_testByte            avgt    1    5    1  2.877       0.009  nsec/op
> o.o.j.b.BlackholeBench.implicit_testChar            avgt    1    5    1  2.879       0.001  nsec/op
> o.o.j.b.BlackholeBench.implicit_testDouble          avgt    1    5    1  2.879       0.001  nsec/op
> o.o.j.b.BlackholeBench.implicit_testFloat           avgt    1    5    1  2.877       0.009  nsec/op
> o.o.j.b.BlackholeBench.implicit_testInt             avgt    1    5    1  2.879       0.001  nsec/op
> o.o.j.b.BlackholeBench.implicit_testLong            avgt    1    5    1  2.879       0.001  nsec/op
> o.o.j.b.BlackholeBench.implicit_testObject          avgt    1    5    1  2.879       0.000  nsec/op
> o.o.j.b.BlackholeBench.implicit_testShort           avgt    1    5    1  2.878       0.001  nsec/op
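One way to put a number on "symmetry" is the ratio of the slowest to the
fastest consume() mean in each table. The sketch below does that
arithmetic on the means copied from the two tables (baseline excluded);
the class name BlackholeSpread and the max/min metric are my own choice
for illustration, not something JMH reports.

```java
import java.util.Arrays;

public class BlackholeSpread {
    // Mean nsec/op per benchmark, copied from the "before" table (baseline excluded).
    public static final double[] BEFORE = {
        3.050, 0.717, 0.716, 0.716, 0.860, 0.865, 0.716, 0.717, 3.050, 0.716,
        3.053, 1.434, 0.716, 0.716, 0.885, 0.860, 0.717, 0.717, 3.052, 0.716
    };

    // Mean nsec/op per benchmark, copied from the "now" table (baseline excluded).
    public static final double[] AFTER = {
        2.883, 2.882, 2.881, 2.882, 2.881, 2.883, 2.879, 2.880, 2.879, 2.877,
        2.877, 2.877, 2.877, 2.877, 2.880, 2.879, 2.877, 2.879, 2.879, 2.877,
        2.879, 2.879, 2.879, 2.878
    };

    // Ratio of the slowest to the fastest mean; 1.0 means perfectly symmetric.
    public static double spreadRatio(double[] means) {
        double min = Arrays.stream(means).min().getAsDouble();
        double max = Arrays.stream(means).max().getAsDouble();
        return max / min;
    }

    public static void main(String[] args) {
        System.out.printf("before: max/min = %.3f%n", spreadRatio(BEFORE));
        System.out.printf("after:  max/min = %.3f%n", spreadRatio(AFTER));
    }
}
```

By this measure the spread drops from roughly 4.3x (3.053 vs 0.716) to
about 1.002x (2.883 vs 2.877), while the cheapest consume() goes from
0.716 to 2.877 nsec/op.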
We can still get it down to 1.6 ns per call, but this will only work for
implicit blackholes, and I'm still deciding whether I want to break the
symmetry in this way.
-Aleksey.