Allocation of array copy can be eliminated in particular cases
Сергей Цыпанов
sergei.tsypanov at yandex.ru
Wed Nov 20 21:30:48 UTC 2019
Hello Vladimir,
thank you for your response!
> Moreover, the transformation is already there:
>
> http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/share/opto/memnode.cpp#l2388
The comment on line 2389 seems confusing to me:
// This works even if the length is not constant (clone or newArray).
When we clone an array, isn't the length fixed and equal to the length of the original array? I don't see how it could be different.
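To make the question concrete, here is a minimal plain-Java sketch (the class and method names are hypothetical, not taken from the benchmark). It only shows the language-level fact that a clone always has the length of its source, even when that length is known only at run time. It does not settle what the HotSpot comment means by "not constant"; one possible reading, which is an assumption on my side, is that it refers to a length that is not a compile-time constant.

```java
public class CloneLengthSketch {

    // Hypothetical helper: n is only known at run time, so the length of
    // the source array is not a compile-time constant, yet the clone's
    // length always equals it.
    static int clonedLength(int n) {
        int[] original = new int[n];
        int[] copy = original.clone();
        return copy.length;
    }

    public static void main(String[] args) {
        // Take the length from the command line when given, so that it is
        // clearly a run-time value; fall back to 10 as in the benchmark.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 10;
        System.out.println(clonedLength(n) == n);
    }
}
```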
> I haven't looked into the benchmarks you mentioned, but it looks like
> cloned_array.length access is not the reason why cloned array is still
> there.
At first I thought that the cloned array is retained at run time because it is returned from a method in the original benchmark:
@Benchmark
public int getParameterTypes() { return method.getParameterTypes().length; }
To check whether this guess is correct, I changed my benchmark to strip all additional logic from it [1]:
@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class ArrayAllocationEliminationBenchmark {
    private int length = 10;

    //...

    @Benchmark
    public int baseline() {
        return new int[length].length;
    }

    @Benchmark
    public int baselineClone() {
        return new int[length].clone().length;
    }

    //...
}
Here I don't see any reason for the runtime to keep the cloned array:
1) an int is returned from the method
2) the cloned array doesn't escape the place where it's created
So the allocation of the cloned array should be eliminated, but according to the benchmark results it is not:
JDK 11

Benchmark                                                      Mode  Cnt     Score      Error  Units
baseline                                                       avgt   25    10,860  ±   0,604  ns/op
baseline:·gc.alloc.rate                                        avgt   25  4703,477  ± 215,986  MB/sec
baseline:·gc.alloc.rate.norm                                   avgt   25    56,000  ±   0,001  B/op
baseline:·gc.churn.CodeHeap_'non-profiled_nmethods'            avgt   25     0,002  ±   0,001  MB/sec
baseline:·gc.churn.CodeHeap_'non-profiled_nmethods'.norm       avgt   25    ≈ 10⁻⁴             B/op
baseline:·gc.churn.G1_Old_Gen                                  avgt   25  4711,586  ± 218,439  MB/sec
baseline:·gc.churn.G1_Old_Gen.norm                             avgt   25    56,094  ±   0,084  B/op
baseline:·gc.count                                             avgt   25  5400,000             counts
baseline:·gc.time                                              avgt   25  3926,000             ms
baselineClone                                                  avgt   25    21,906  ±   1,234  ns/op
baselineClone:·gc.alloc.rate                                   avgt   25  4667,440  ± 248,731  MB/sec
baselineClone:·gc.alloc.rate.norm                              avgt   25   112,000  ±   0,001  B/op
baselineClone:·gc.churn.CodeHeap_'non-profiled_nmethods'       avgt   25     0,008  ±   0,002  MB/sec
baselineClone:·gc.churn.CodeHeap_'non-profiled_nmethods'.norm  avgt   25    ≈ 10⁻⁴             B/op
baselineClone:·gc.churn.G1_Old_Gen                             avgt   25  4675,250  ± 247,341  MB/sec
baselineClone:·gc.churn.G1_Old_Gen.norm                        avgt   25   112,192  ±   0,162  B/op
baselineClone:·gc.count                                        avgt   25  5489,000             counts
baselineClone:·gc.time                                         avgt   25  4042,000             ms

JDK 13

Benchmark                                                      Mode  Cnt     Score      Error  Units
baseline                                                       avgt   25    10,014  ±   0,227  ns/op
baseline:·gc.alloc.rate                                        avgt   25  5082,913  ± 110,593  MB/sec
baseline:·gc.alloc.rate.norm                                   avgt   25    56,000  ±   0,001  B/op
baseline:·gc.churn.G1_Eden_Space                               avgt   25  5092,013  ± 110,500  MB/sec
baseline:·gc.churn.G1_Eden_Space.norm                          avgt   25    56,100  ±   0,076  B/op
baseline:·gc.churn.G1_Survivor_Space                           avgt   25     0,005  ±   0,001  MB/sec
baseline:·gc.churn.G1_Survivor_Space.norm                      avgt   25    ≈ 10⁻⁴             B/op
baseline:·gc.count                                             avgt   25  5753,000             counts
baseline:·gc.time                                              avgt   25  3733,000             ms
baselineClone                                                  avgt   25    26,619  ±   1,405  ns/op
baselineClone:·gc.alloc.rate                                   avgt   25  3837,924  ± 185,292  MB/sec
baselineClone:·gc.alloc.rate.norm                              avgt   25   112,000  ±   0,001  B/op
baselineClone:·gc.churn.G1_Eden_Space                          avgt   25  3844,010  ± 185,460  MB/sec
baselineClone:·gc.churn.G1_Eden_Space.norm                     avgt   25   112,178  ±   0,168  B/op
baselineClone:·gc.churn.G1_Survivor_Space                      avgt   25     0,008  ±   0,001  MB/sec
baselineClone:·gc.churn.G1_Survivor_Space.norm                 avgt   25    ≈ 10⁻⁴             B/op
baselineClone:·gc.count                                        avgt   25  4668,000             counts
baselineClone:·gc.time                                         avgt   25  2923,000             ms
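
As a sanity check on the gc.alloc.rate.norm figures, the 56 B/op and 112 B/op values are consistent with one and two int[10] allocations respectively. The sketch below is only back-of-the-envelope arithmetic. It assumes a 64-bit HotSpot with compressed class pointers, where an int[] header takes 16 bytes and objects are aligned to 8 bytes; the class, constants and method are hypothetical and are not part of the benchmark.

```java
public class ArraySizeSketch {

    // Assumption: 8-byte mark word + 4-byte compressed class pointer
    // + 4-byte array length field.
    static final int ASSUMED_HEADER_BYTES = 16;

    // Assumption: default 8-byte object alignment.
    static final int ASSUMED_ALIGNMENT = 8;

    // Estimated heap footprint of an int[] with the given length,
    // rounded up to the assumed alignment.
    static long intArrayBytes(int length) {
        long raw = ASSUMED_HEADER_BYTES + (long) Integer.BYTES * length;
        return (raw + ASSUMED_ALIGNMENT - 1) / ASSUMED_ALIGNMENT * ASSUMED_ALIGNMENT;
    }

    public static void main(String[] args) {
        long one = intArrayBytes(10);
        System.out.println("one int[10]:  " + one + " bytes");      // compare with baseline
        System.out.println("two int[10]s: " + 2 * one + " bytes");  // compare with baselineClone
    }
}
```

Under these assumptions, baseline allocates the single source array and baselineClone allocates both the source array and its copy, i.e. neither allocation is eliminated.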

From this output I conclude that either I am missing something in my understanding of how the compiler and the runtime work, or this is a bug.
I would be happy to learn which of the two is the case :)
There is also good news, though: the latest Graal can eliminate the allocation for the baseline method [2].
With best regards,
Sergey Tsypanov
[1] https://github.com/stsypanov/logeek-night-benchmark/blob/master/benchmark-runners/src/main/java/com/luxoft/logeek/benchmark/array/ArrayAllocationEliminationBenchmark.java
[2] https://github.com/oracle/graal/issues/1847