Array addition and array sum Panama benchmarks

Wed Mar 20 18:23:52 UTC 2024

On 20/03/2024 17:18, Maurizio Cimadamore wrote:

> FFM has an advantage here compared to ByteBuffer (and even unsafe) in 
> the sense that we know statically if a var handle is going to perform 
> aligned access or not. So it could be possible _in principle_ to use 
> Unsafe::getDouble or Unsafe::getLong + Double.longBitsToDouble() 
> depending on the var handle characteristics.

Pulling more on this string, I’ve updated my branch:

https://github.com/openjdk/jdk/compare/master...mcimadamore:jdk:AddBenchmark?expand=1

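This adds a new tweak: if the memory access var handle is fully aligned 
(meaning it supports atomic access), /and/ no byte swap is needed, then 
we can use Unsafe::getFloat/putFloat and Unsafe::getDouble/putDouble 
directly, instead of going through the int/long accessors and 
converting the bits.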
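
To make the dispatch concrete, here is a minimal sketch of the idea. It is 
not the code in the branch: it uses MethodHandles.byteArrayViewVarHandle 
over a byte[] as a stand-in for Unsafe, and the names DoubleAccessSketch, 
canUseDirectAccess and readDouble are hypothetical, picked only for 
illustration.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; the real change lives in the JDK var handle implementation.
final class DoubleAccessSketch {
    // Stand-ins for Unsafe::getDouble and Unsafe::getLong (assumption: byte[] views instead of raw memory).
    private static final VarHandle NATIVE_DOUBLE =
            MethodHandles.byteArrayViewVarHandle(double[].class, ByteOrder.nativeOrder());
    private static final VarHandle NATIVE_LONG =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.nativeOrder());

    // The direct path applies only when access is aligned AND no byte swap is required.
    static boolean canUseDirectAccess(boolean aligned, ByteOrder order) {
        return aligned && order == ByteOrder.nativeOrder();
    }

    // 'aligned' and 'order' model characteristics that are known statically for a given var handle.
    static double readDouble(byte[] bytes, int offset, boolean aligned, ByteOrder order) {
        if (canUseDirectAccess(aligned, order)) {
            // Fast path: read the double directly.
            return (double) NATIVE_DOUBLE.get(bytes, offset);
        }
        // Fallback: read the raw long, swap bytes if needed, then reinterpret the bits.
        long bits = (long) NATIVE_LONG.get(bytes, offset);
        if (order != ByteOrder.nativeOrder()) {
            bits = Long.reverseBytes(bits);
        }
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        ByteOrder nativeOrder = ByteOrder.nativeOrder();
        ByteOrder swapped = nativeOrder == ByteOrder.LITTLE_ENDIAN
                ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
        byte[] buf = new byte[16];
        ByteBuffer.wrap(buf).order(nativeOrder).putDouble(0, 3.5);
        ByteBuffer.wrap(buf).order(swapped).putDouble(8, 3.5);
        System.out.println(readDouble(buf, 0, true, nativeOrder)); // direct path
        System.out.println(readDouble(buf, 8, true, swapped));     // long + swap + bit conversion
    }
}
```

Both paths return the same value; the point of the tweak is only that the 
direct path can be picked up front, since alignment and byte order are 
fixed for a given var handle.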

Nice bump in the benchmark:

Benchmark                                    Mode  Cnt    Score   Error  Units
AddBenchmark.scalarArrayArray                avgt   30   93.752 ± 1.277  ns/op
AddBenchmark.scalarArrayArrayLongStride      avgt   30  490.534 ± 6.185  ns/op
AddBenchmark.scalarBufferArray               avgt   30  346.950 ± 1.382  ns/op
AddBenchmark.scalarBufferBuffer              avgt   30  339.950 ± 1.595  ns/op
AddBenchmark.scalarSegmentArray              avgt   30  101.052 ± 0.527  ns/op
AddBenchmark.scalarSegmentSegment            avgt   30  310.086 ± 4.169  ns/op
AddBenchmark.scalarSegmentSegmentLongStride  avgt   30  305.144 ± 3.329  ns/op
AddBenchmark.scalarUnsafeArray               avgt   30   96.492 ± 1.391  ns/op
AddBenchmark.scalarUnsafeUnsafe              avgt   30  363.458 ± 3.796  ns/op

Note how scalarSegmentArray is now nearly as fast as scalarArrayArray 
(101.052 vs 93.752 ns/op) (!!)

Cheers
Maurizio
