[intrinsics] performance improvements for the intrinsified version of Objects::hash
Vicente Romero
vicente.romero at oracle.com
Thu Mar 7 21:20:03 UTC 2019
Hi Aleksey,
Thanks for taking a look at this.
On 3/7/19 4:10 PM, Aleksey Shipilev wrote:
> On 3/6/19 6:18 PM, Vicente Romero wrote:
>> I have produced a new iteration of performance results with a reorganization of the data proposed by
>> Alex, see [1], I added a link to the benchmark source
>>
>> [1]
>> http://cr.openjdk.java.net/~vromero/intrinsics_benchmark_results/v5/benchmarkResults_intrinsics_all_data_v5.html
> Haven't read the entire thread, Alex asked me for comments about the benchmark itself:
>
> *) These two guys are unused?
>
> static final Integer i23 = 23;
> static final Object[] args = { "Bob", i23 };
Right, they are not used. It is dead code, I will remove it.
>
> *) This repetitive block can be put on the IntrinsicsBenchmarks class:
>
> @BenchmarkMode(Mode.Throughput)
> @Fork(value = 1)
> @Warmup(iterations=3)
> @Measurement(iterations=5)
> @OutputTimeUnit(TimeUnit.MILLISECONDS)
OK, that makes it less verbose.
>
> ...leaving only:
>
> @Benchmark
> public void testHash0001Int(Blackhole b) {
> ...
> }
>
> *) Is there a reason to do multiple ops per @Benchmark? Why can't it be just:
>
> @Benchmark
> public int testHash0001Int() {
>     int i = 1;
>     return Objects.hash(i);
> }
Not really, I will simplify it.
>
> *) Come to think about it, are those benchmarks trying to specifically measure the ability to fold
> local vars? If so, it could be just inlined?
>
> @Benchmark
> public int testHash0001Int() {
>     return Objects.hash(1);
> }
Not specifically local vars. This project is able to produce an ldc
when the arguments are constants, so we want to measure the case where
the arguments are not constants. The constant case is very fast; we
don't have performance issues there.
>
> *) If local var is sensible, I guess there is a difference between local "final int i = 1;" and "int
> i = 1;" from javac perspective?
yes: `final int i = 1;` is the same as a constant literal
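
To illustrate the difference, here is a minimal sketch. It is not part of the benchmark, and the class and method names are made up for the example. It uses a String instead of an int only because the folding is then observable at run time: a constant expression of type String is interned, so it is the same object as the equal literal.

```java
public class ConstantVariableDemo {
    // "part" is final and initialized with a constant expression, so it is a
    // constant variable; javac folds part + "23" into the literal "Bob23".
    static boolean finalLocalIsFolded() {
        final String part = "Bob";
        String joined = part + "23";
        return joined == "Bob23"; // identity comparison on purpose
    }

    // Without "final" the local is not a constant variable, so the
    // concatenation happens at run time and yields a fresh String.
    static boolean plainLocalIsFolded() {
        String part = "Bob";
        String joined = part + "23";
        return joined == "Bob23"; // identity comparison on purpose
    }

    public static void main(String[] args) {
        System.out.println("final local folded: " + finalLocalIsFolded());
        System.out.println("plain local folded: " + plainLocalIsFolded());
    }
}
```

The same rule is why the benchmarks use non-final locals: with `final int i = 1;` javac sees a constant, which is the case we are not trying to measure.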
>
> *) I guess what you want is to check performance when arguments are read from instance fields --
> that would be the target test case that simulates Object.hashCode(), no?
reading instance fields or local variables shouldn't make a big difference
> JMH would break load commoning
> across @Benchmark calls, which makes two families of tests sensible:
>
> @BenchmarkMode(Mode.Throughput)
> @Fork(value = 1)
> @Warmup(iterations=3)
> @Measurement(iterations=5)
> @OutputTimeUnit(TimeUnit.MILLISECONDS)
> @State(Scope.Benchmark)
> public class IntrinsicsBenchmark {
>     int i1 = 1;
>
>     @Benchmark
>     public int testHash0001Int_Standalone() {
>         // Single read of field with unknown value
>         return Objects.hash(i1);
>     }
>
>     @Benchmark
>     public int testHash0001Int_Merge() {
>         // Second read should be able to reuse/fold the value first read got
>         return Objects.hash(i1) + Objects.hash(i1);
>     }
> }
>
> *) Probably needs the test that accepts already boxed varargs via Object[]?
Yes, we have separate tests for that.
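
For reference, a minimal sketch of the two call shapes. This is not the benchmark code, and the class and method names are hypothetical. In the first method the compiler has to box the int and build the varargs array; in the second the caller passes an Object[] that is already boxed.

```java
import java.util.Arrays;
import java.util.Objects;

public class VarargsHashDemo {
    // The compiler boxes "age" and wraps both arguments in a new Object[].
    static int hashWithBoxing(String name, int age) {
        return Objects.hash(name, age);
    }

    // The array is passed through as the varargs array itself.
    static int hashPreBoxed(Object[] values) {
        return Objects.hash(values);
    }

    public static void main(String[] args) {
        Object[] boxed = { "Bob", 23 };
        System.out.println(hashWithBoxing("Bob", 23) == hashPreBoxed(boxed));
        System.out.println(hashPreBoxed(boxed) == Arrays.hashCode(boxed));
    }
}
```

Both shapes give the same value, since Objects.hash is specified in terms of Arrays.hashCode(Object[]); only the work done at the call site differs.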
>
> *) Is there -prof perfasm output for any of those benchmarks, to see how generated code is actually
> looking? It might look especially funny for cases that fall off the cliff, like testHash0080String.
Yep, it is a big indy invocation.
>
> -Aleksey
>
Thanks,
Vicente
More information about the amber-dev mailing list