Foreign memory access hot loop benchmark

Antoine Chambille ach at activeviam.com
Tue Sep 22 09:30:13 UTC 2020


Hi guys, I'm following the progress of the Panama projects with eager
interest, from the point of view of an in-memory database developer.

I wrote two benchmarks: 'AddBenchmark', which adds two arrays of numbers
element by element, and 'SumBenchmark', which sums the numbers in an array.
https://github.com/chamb/panama-benchmarks/blob/master/memory/src/main/java/com/activeviam/test/AddBenchmark.java
https://github.com/chamb/panama-benchmarks/blob/master/memory/src/main/java/com/activeviam/test/SumBenchmark.java

The benchmarks test various memory access techniques (Java arrays, Unsafe,
memory handles), each with and without manual loop unrolling.
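
To make the variants concrete, here is a minimal sketch of the add kernel in
three of these styles: a plain scalar loop, a manually unrolled loop, and
access through an array-element VarHandle. It assumes double elements and an
unroll factor of 4. The class and method names are made up for illustration
and are not copied from the linked benchmark sources. The memory-handle
variants are left out on purpose, since the incubating foreign-memory API is
still changing.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical names; illustrative only, not the code from the linked benchmarks.
final class AddSketch {

    // VarHandle giving indexed access to the elements of a double[].
    static final VarHandle DOUBLES =
            MethodHandles.arrayElementVarHandle(double[].class);

    // Plain scalar loop: one element per iteration.
    static void scalarArray(double[] a, double[] b, double[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    // Manually unrolled by 4 (assumed factor), followed by a scalar tail loop
    // for lengths that are not a multiple of 4.
    static void unrolledArray(double[] a, double[] b, double[] out) {
        int limit = out.length - (out.length % 4);
        int i = 0;
        for (; i < limit; i += 4) {
            out[i]     = a[i]     + b[i];
            out[i + 1] = a[i + 1] + b[i + 1];
            out[i + 2] = a[i + 2] + b[i + 2];
            out[i + 3] = a[i + 3] + b[i + 3];
        }
        for (; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    // Same scalar loop, but every read and write goes through the VarHandle.
    static void scalarArrayHandle(double[] a, double[] b, double[] out) {
        for (int i = 0; i < out.length; i++) {
            double sum = (double) DOUBLES.get(a, i) + (double) DOUBLES.get(b, i);
            DOUBLES.set(out, i, sum);
        }
    }
}
```

All three methods compute the same result; only the access path and the loop
shape differ, which is what the benchmark names in the results refer to.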


The SUM benchmark looks good: performance with memory handles is equivalent
to Java arrays and Unsafe, and manual loop unrolling gives roughly a 4x
speedup that is largely preserved with memory handles.
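
One plausible reason manual unrolling helps the sum so much: a plain scalar
sum carries a dependency from each addition to the next, while an unrolled
loop with several independent accumulators lets the CPU overlap the
additions. The sketch below assumes double elements and four accumulators.
The names are hypothetical, and the linked SumBenchmark may be structured
differently.

```java
// Hypothetical names; a sketch of the idea, not the linked benchmark code.
final class SumSketch {

    // Scalar sum: every addition depends on the previous value of 'sum'.
    static double scalarSum(double[] data) {
        double sum = 0.0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Unrolled by 4 with four independent accumulators (assumed factor),
    // followed by a scalar tail loop for the remaining elements.
    static double unrolledSum(double[] data) {
        double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
        int limit = data.length - (data.length % 4);
        int i = 0;
        for (; i < limit; i += 4) {
            s0 += data[i];
            s1 += data[i + 1];
            s2 += data[i + 2];
            s3 += data[i + 3];
        }
        double sum = (s0 + s1) + (s2 + s3);
        for (; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }
}
```

The two methods add the elements in a different order, so with arbitrary
floating-point inputs their results can differ in the last bits.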

In the ADD benchmark the results are more diverse. Memory handles are about
15% slower than Unsafe, and they don't seem to enable automatic vectorization
the way arrays do. With manual loop unrolling it's worse: memory handles
don't seem to get optimized at all, which looks like it may be a bug.
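
The relative figures can be checked against the scores in the results table
below. This small sketch divides one throughput by another; the scores are
copied from the table, while the class and method names are made up for
illustration.

```java
// Hypothetical helper; the scores are copied from the results table.
final class ThroughputRatios {

    // Throughput of a variant relative to a baseline (1.0 means equal speed).
    static double ratio(double score, double baseline) {
        return score / baseline;
    }

    public static void main(String[] args) {
        double scalarMHI = 1699106.867;
        double scalarUnsafe = 1995097.798;
        double unrolledMHI = 222453.602;
        double unrolledUnsafe = 2208072.293;

        System.out.printf("scalarMHI / scalarUnsafe     = %.3f%n",
                ratio(scalarMHI, scalarUnsafe));
        System.out.printf("unrolledMHI / unrolledUnsafe = %.3f%n",
                ratio(unrolledMHI, unrolledUnsafe));
    }
}
```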




Benchmark                            Mode  Cnt        Score        Error  Units
AddBenchmark.scalarArray            thrpt    5  5353483.430 ±  38313.582  ops/s
AddBenchmark.scalarArrayHandle      thrpt    5  5291533.568 ±  31917.280  ops/s
AddBenchmark.scalarMHI              thrpt    5  1699106.867 ±   8131.672  ops/s
AddBenchmark.scalarMHI_v2           thrpt    5  1695513.219 ±  23860.597  ops/s
AddBenchmark.scalarUnsafe           thrpt    5  1995097.798 ±  24783.804  ops/s
AddBenchmark.unrolledArray          thrpt    5  6445338.050 ±  56050.147  ops/s
AddBenchmark.unrolledArrayHandle    thrpt    5  2006794.934 ±  49052.503  ops/s
AddBenchmark.unrolledUnsafe         thrpt    5  2208072.293 ±  24952.234  ops/s
AddBenchmark.unrolledMHI            thrpt    5   222453.602 ±   3451.839  ops/s
AddBenchmark.unrolledMHI_v2         thrpt    5   114637.718 ±   1812.049  ops/s

SumBenchmark.scalarArray            thrpt    5  1099167.889 ±   6392.060  ops/s
SumBenchmark.scalarArrayHandle      thrpt    5  1061798.178 ± 186062.917  ops/s
SumBenchmark.scalarArrayLongStride  thrpt    5  1030295.241 ±  71319.976  ops/s
SumBenchmark.scalarUnsafe           thrpt    5  1067789.139 ±   4455.897  ops/s
SumBenchmark.scalarMHI              thrpt    5  1034607.008 ±  30830.150  ops/s
SumBenchmark.unrolledArray          thrpt    5  4263489.912 ±  35092.986  ops/s
SumBenchmark.unrolledArrayHandle    thrpt    5  4228415.985 ±  44609.791  ops/s
SumBenchmark.unrolledUnsafe         thrpt    5  4228496.447 ±  22006.197  ops/s
SumBenchmark.unrolledMHI            thrpt    5  3665896.721 ±  35988.799  ops/s


Thanks for reading, looking forward to your feedback and possible
improvements!

-Antoine


More information about the panama-dev mailing list