[intrinsics] performance improvements for the intrinsified version of Objects::hash
Aleksey Shipilev
shade at redhat.com
Thu Mar 7 21:10:24 UTC 2019
On 3/6/19 6:18 PM, Vicente Romero wrote:
> I have produced a new iteration of performance results with a reorganization of the data proposed by
> Alex, see [1], I added a link to the benchmark source
>
> [1]
> http://cr.openjdk.java.net/~vromero/intrinsics_benchmark_results/v5/benchmarkResults_intrinsics_all_data_v5.html
I haven't read the entire thread, but Alex asked me for comments about the benchmark itself:
*) These two guys are unused?
static final Integer i23 = 23;
static final Object[] args = { "Bob", i23 };
*) This repetitive block can be put on the IntrinsicsBenchmarks class:
@BenchmarkMode(Mode.Throughput)
@Fork(value = 1)
@Warmup(iterations=3)
@Measurement(iterations=5)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
...leaving only:
@Benchmark
public void testHash0001Int(Blackhole b) {
    ...
}
*) Is there a reason to do multiple ops per @Benchmark? Why can't it be just:
@Benchmark
public int testHash0001Int() {
    int i = 1;
    return Objects.hash(i);
}
*) Come to think of it, are these benchmarks specifically trying to measure the ability to fold local
vars? If not, the constant could just be inlined:
@Benchmark
public int testHash0001Int() {
    return Objects.hash(1);
}
*) If a local var is sensible, I guess there is a difference between local "final int i = 1;" and "int
i = 1;" from javac's perspective?
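For reference, a minimal sketch of that distinction in plain Java rather than JMH (the class and method names here are hypothetical). A final local initialized with a constant expression is a constant variable in the language's terms, so javac can fold expressions over it at compile time, while a non-final local is an ordinary variable. String concatenation is used only because the folding is observable there through interning; whether the same distinction affects the intrinsified Objects.hash for the int case is an assumption that the benchmark would have to confirm.

```java
public class FinalLocalFolding {
    // 'final' local with a constant initializer: a constant variable,
    // so (s + "b") is a compile-time constant expression and is interned.
    static boolean foldedWithFinal() {
        final String s = "a";
        String joined = s + "b";
        return joined == "ab"; // identity comparison is intentional here
    }

    // Non-final local: javac does not treat it as a constant,
    // so the concatenation is performed at run time and yields a new String.
    static boolean foldedWithoutFinal() {
        String s = "a";
        String joined = s + "b";
        return joined == "ab"; // identity comparison is intentional here
    }

    public static void main(String[] args) {
        System.out.println("final local folded:     " + foldedWithFinal());
        System.out.println("non-final local folded: " + foldedWithoutFinal());
    }
}
```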
*) I guess what you want to check is performance when the arguments are read from instance fields --
that would be the target test case that simulates Object.hashCode(), no? JMH would break load
commoning across @Benchmark calls, which makes two families of tests sensible:
@BenchmarkMode(Mode.Throughput)
@Fork(value = 1)
@Warmup(iterations=3)
@Measurement(iterations=5)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class IntrinsicsBenchmark {
    int i1 = 1;

    @Benchmark
    public int testHash0001Int_Standalone() {
        // Single read of field with unknown value
        return Objects.hash(i1);
    }

    @Benchmark
    public int testHash0001Int_Merge() {
        // Second read should be able to reuse/fold the value first read got
        return Objects.hash(i1) + Objects.hash(i1);
    }
}
*) Probably needs the test that accepts already boxed varargs via Object[]?
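A sketch of the call shape such a test would cover, in plain Java rather than JMH (the class and method names are hypothetical; the values mirror the unused fields noted at the top). When the caller already holds an Object[], javac passes it straight through as the varargs array, so the call site does no boxing and allocates no new array. In the stock JDK both shapes yield the same hash.

```java
import java.util.Arrays;
import java.util.Objects;

public class PreBoxedVarargs {
    static final Integer I23 = 23;
    static final Object[] ARGS = { "Bob", I23 };

    // Varargs call: javac allocates a fresh Object[] at the call site.
    static int hashSpelledOut() {
        return Objects.hash("Bob", I23);
    }

    // Already-boxed call: the existing array is passed as the varargs array.
    static int hashPreBoxed() {
        return Objects.hash(ARGS);
    }

    public static void main(String[] args) {
        // Both shapes should agree with each other and with Arrays.hashCode.
        System.out.println(hashSpelledOut() == hashPreBoxed());
        System.out.println(hashPreBoxed() == Arrays.hashCode(ARGS));
    }
}
```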
*) Is there -prof perfasm output for any of those benchmarks, to see what the generated code actually
looks like? It might look especially funny for cases that fall off the cliff, like testHash0080String.
-Aleksey