Foreign memory access hot loop benchmark
Antoine Chambille
ach at activeviam.com
Tue Sep 22 09:30:13 UTC 2020
Hi guys, I'm following the progress of panama projects with eager interest,
from the point of view of an in-memory database developer.
I wrote 'AddBenchmark', which adds two arrays of numbers element by
element, and 'SumBenchmark', which sums the numbers in an array.
https://github.com/chamb/panama-benchmarks/blob/master/memory/src/main/java/com/activeviam/test/AddBenchmark.java
https://github.com/chamb/panama-benchmarks/blob/master/memory/src/main/java/com/activeviam/test/SumBenchmark.java
The benchmarks test various memory access techniques (Java arrays, Unsafe,
memory handles), each with and without manual loop unrolling.
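For readers who have not opened the linked sources, the difference between the scalar and manually unrolled variants can be sketched on plain Java arrays as below. This is only an illustration: the class name, method names, the double element type and the unroll factor of 4 are assumptions, not taken from AddBenchmark.java.

```java
// Illustrative sketch only; not the code from the linked AddBenchmark.
// Element type (double) and unroll factor (4) are assumptions.
final class AddLoops {

    // Scalar form: one element per loop iteration.
    static void scalarAdd(double[] a, double[] b, double[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    // Manually unrolled form: four elements per iteration,
    // followed by a tail loop for the remaining 0 to 3 elements.
    static void unrolledAdd(double[] a, double[] b, double[] out) {
        int n = out.length;
        int i = 0;
        for (; i <= n - 4; i += 4) {
            out[i]     = a[i]     + b[i];
            out[i + 1] = a[i + 1] + b[i + 1];
            out[i + 2] = a[i + 2] + b[i + 2];
            out[i + 3] = a[i + 3] + b[i + 3];
        }
        for (; i < n; i++) {
            out[i] = a[i] + b[i];
        }
    }
}
```

Both methods compute the same result; only the loop shape presented to the JIT differs.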
The SUM benchmark looks good: performance with memory handles is equivalent
to Java arrays and Unsafe, and loop unrolling gives a roughly 4x speedup
that is largely preserved with memory handles (about 3.5x).
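One plausible way to unroll a sum, and a possible reason unrolling helps so much here, is to keep several independent accumulators so that consecutive additions no longer depend on each other. The sketch below assumes double elements and four accumulators; it is not taken from SumBenchmark.java, and all names are hypothetical.

```java
// Illustrative sketch only; not the code from the linked SumBenchmark.
// Element type (double) and the four accumulators are assumptions.
final class SumLoops {

    // Scalar form: a single accumulator, so each addition waits on the previous one.
    static double scalarSum(double[] a) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }

    // Unrolled form: four independent accumulators combined at the end.
    // Note that this changes the order of floating-point additions,
    // so results can differ from scalarSum in the last bits for general inputs.
    static double unrolledSum(double[] a) {
        int n = a.length;
        double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        int i = 0;
        for (; i <= n - 4; i += 4) {
            s0 += a[i];
            s1 += a[i + 1];
            s2 += a[i + 2];
            s3 += a[i + 3];
        }
        double sum = (s0 + s1) + (s2 + s3);
        for (; i < n; i++) {
            sum += a[i];
        }
        return sum;
    }
}
```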
In the ADD benchmark the results are more diverse: memory handles are about
15% slower than Unsafe (1.70M vs 2.00M ops/s) and, unlike arrays, don't seem
to enable automatic vectorization. With manual loop unrolling it gets worse:
the unrolled memory handle variants are roughly 8 to 15 times slower than
their scalar versions, so it looks like they don't get optimized at all.
Maybe a bug.
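As a point of comparison for handle-based access, here is a minimal sketch of the same add loop going through a standard array VarHandle from java.lang.invoke. It is a stand-in only: it does not use the incubating Panama memory handle API, it is not the code behind the ArrayHandle or MHI variants measured here, and the class and method names are hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Illustrative sketch only; uses the standard array VarHandle (Java 9+),
// not the Panama memory handles measured in this post.
final class HandleAdd {

    // Keeping the handle in a static final field lets the JIT treat it as a constant.
    private static final VarHandle DOUBLES =
            MethodHandles.arrayElementVarHandle(double[].class);

    // Same element-by-element add, with every read and write going through the handle.
    static void scalarAdd(double[] a, double[] b, double[] out) {
        for (int i = 0; i < out.length; i++) {
            double x = (double) DOUBLES.get(a, i);
            double y = (double) DOUBLES.get(b, i);
            DOUBLES.set(out, i, x + y);
        }
    }
}
```

Whether a loop of this shape gets vectorized depends on how well the JIT can see through the handle, which is the kind of difference the ADD numbers suggest.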
Benchmark                            Mode  Cnt        Score        Error  Units
AddBenchmark.scalarArray            thrpt    5  5353483.430 ±  38313.582  ops/s
AddBenchmark.scalarArrayHandle      thrpt    5  5291533.568 ±  31917.280  ops/s
AddBenchmark.scalarMHI              thrpt    5  1699106.867 ±   8131.672  ops/s
AddBenchmark.scalarMHI_v2           thrpt    5  1695513.219 ±  23860.597  ops/s
AddBenchmark.scalarUnsafe           thrpt    5  1995097.798 ±  24783.804  ops/s
AddBenchmark.unrolledArray          thrpt    5  6445338.050 ±  56050.147  ops/s
AddBenchmark.unrolledArrayHandle    thrpt    5  2006794.934 ±  49052.503  ops/s
AddBenchmark.unrolledUnsafe         thrpt    5  2208072.293 ±  24952.234  ops/s
AddBenchmark.unrolledMHI            thrpt    5   222453.602 ±   3451.839  ops/s
AddBenchmark.unrolledMHI_v2         thrpt    5   114637.718 ±   1812.049  ops/s
SumBenchmark.scalarArray            thrpt    5  1099167.889 ±   6392.060  ops/s
SumBenchmark.scalarArrayHandle      thrpt    5  1061798.178 ± 186062.917  ops/s
SumBenchmark.scalarArrayLongStride  thrpt    5  1030295.241 ±  71319.976  ops/s
SumBenchmark.scalarUnsafe           thrpt    5  1067789.139 ±   4455.897  ops/s
SumBenchmark.scalarMHI              thrpt    5  1034607.008 ±  30830.150  ops/s
SumBenchmark.unrolledArray          thrpt    5  4263489.912 ±  35092.986  ops/s
SumBenchmark.unrolledArrayHandle    thrpt    5  4228415.985 ±  44609.791  ops/s
SumBenchmark.unrolledUnsafe         thrpt    5  4228496.447 ±  22006.197  ops/s
SumBenchmark.unrolledMHI            thrpt    5  3665896.721 ±  35988.799  ops/s
Thanks for reading, looking forward to your feedback and possible
improvements!
-Antoine