Array addition and array sum Panama benchmarks
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Wed Mar 20 10:26:19 UTC 2024
Hi Antoine,
thanks for the benchmark. From the numbers you are getting in the
AddBenchmark, my gut feeling is that, for memory segments, bounds checks
are not being hoisted outside the loop. That would cause the kind of
degradation you are seeing here. I'm also surprised to see, for that
benchmark, that Unsafe is more than 2x faster than plain arrays; after all,
the size of the array is a loop invariant, so no bounds check should remain
inside the loop. Off the top of my head, I recall a similar issue with a benchmark in
our repository [1] (you will probably recognize the shape there, as it's
very similar to yours). In that case, to get to optimal performance,
some extra casts to `long` needed to be added as C2 cannot yet optimize
loops with that particular shape. Note that all the bounds check analysis
on memory segments is built on longs (unlike arrays and byte buffers)
and we rely on C2 to optimize common cases where the accessed offset is
clearly a "small long". In some cases this optimization doesn't kick in (yet),
and some "manual help" is needed. From my notes on a conversation with
Roland (who did most of the optimization work here):
> The expectation is that the loop variable and the exit test operate on a single type
At the time, we had bigger fish to fry, but if this turns out to be
the reason behind the numbers you are seeing, then it might be time to
look again and try to fix this.
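
A minimal sketch of what "a single type" could look like for an element-wise
add over memory segments, assuming the final java.lang.foreign API (JDK 22 or
later). The class and method names are hypothetical and are not taken from the
benchmark in [1] or from Antoine's repository. Both loops keep the induction
variable and the exit bound in the same type; whether C2 actually hoists the
bounds checks for either shape would have to be confirmed by measuring.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical illustration only; not the code from the benchmarks discussed above.
final class SegmentAddSketch {

    // int loop variable and int exit bound; the index is widened to long
    // explicitly at the point of access ("extra cast to long").
    static void addIntLoop(MemorySegment a, MemorySegment b, MemorySegment out, int count) {
        for (int i = 0; i < count; i++) {
            long idx = (long) i;
            int sum = a.getAtIndex(ValueLayout.JAVA_INT, idx)
                    + b.getAtIndex(ValueLayout.JAVA_INT, idx);
            out.setAtIndex(ValueLayout.JAVA_INT, idx, sum);
        }
    }

    // long loop variable and long exit bound; no int/long mixing in the exit test.
    static void addLongLoop(MemorySegment a, MemorySegment b, MemorySegment out, long count) {
        for (long i = 0; i < count; i++) {
            int sum = a.getAtIndex(ValueLayout.JAVA_INT, i)
                    + b.getAtIndex(ValueLayout.JAVA_INT, i);
            out.setAtIndex(ValueLayout.JAVA_INT, i, sum);
        }
    }

    public static void main(String[] args) {
        int count = 1024;
        long bytes = count * ValueLayout.JAVA_INT.byteSize();
        long align = ValueLayout.JAVA_INT.byteAlignment();
        try (Arena arena = Arena.ofConfined()) {
            // Off-heap segments, released when the arena is closed.
            MemorySegment a = arena.allocate(bytes, align);
            MemorySegment b = arena.allocate(bytes, align);
            MemorySegment out = arena.allocate(bytes, align);
            for (int i = 0; i < count; i++) {
                a.setAtIndex(ValueLayout.JAVA_INT, (long) i, i);
                b.setAtIndex(ValueLayout.JAVA_INT, (long) i, 10 * i);
            }
            addIntLoop(a, b, out, count);
            addLongLoop(a, b, out, count);
            System.out.println("out[5] = " + out.getAtIndex(ValueLayout.JAVA_INT, 5L));
        }
    }
}
```

The shape to avoid, per the note above, would be an int loop variable compared
against a long bound (for example one derived from `byteSize()`), since the exit
test then mixes two types.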
Cheers
Maurizio
[1] -
https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/lang/foreign/UnrolledAccess.java
On 20/03/2024 09:00, Antoine Chambille wrote:
> Hi everyone,
>
> I'm looking at two array arithmetic benchmarks with Panama.
> https://github.com/chamb/panama-benchmarks
>
> AddBenchmark: benchmark the element-wise addition of two arrays of
> numbers. We test over standard Java arrays and (off-heap) native
> memory, via array access, Unsafe and MemorySegment. Using and not
> using the vector API.
>
> SumBenchmark: Sum all the elements in an array of numbers. We
> benchmark over standard Java arrays and (off-heap) native memory, via
> array access, Unsafe and MemorySegment. Using and not using the vector
> API.
>
> I'm building OpenJDK from master at
> https://github.com/openjdk/panama-vector
> and running on a Windows laptop with an Intel Core i9-11950H.
>
> Impressive to perform SIMD on native memory in pure Java! And I hope
> it's possible to optimize it further.
>
> AddBenchmark
> .scalarArrayArray             4741341.171 ops/s
> .scalarArrayArrayLongStride    973926.689 ops/s
> .scalarSegmentArray           1809480.000 ops/s
> .scalarSegmentSegment         1231606.029 ops/s
> .scalarUnsafeArray           10972240.434 ops/s
> .scalarUnsafeUnsafe           1246565.503 ops/s
> .unrolledArrayArray           1236491.068 ops/s
> .unrolledSegmentArray         1787171.351 ops/s
> .unrolledUnsafeArray          5700087.751 ops/s
> .unrolledUnsafeUnsafe         1236456.434 ops/s
> .vectorArrayArray             7252565.080 ops/s
> .vectorArraySegment           6938948.826 ops/s
> .vectorSegmentArray           4953042.042 ops/s
> .vectorSegmentSegment         4606278.152 ops/s
>
> Loops over arrays seem automatically optimized, but not when the loop
> has a 'long' stride.
> Reading from Segment seems to defeat loop optimisations and/or add
> overhead. It gets worse when writing to Segment.
> Manual unrolling makes things worse in all cases.
> 'scalarUnsafeArray' (read with Unsafe, write with array) is twice as
> fast as almost anything else.
> The vector API is fast and consistent, but maybe not at its full
> potential, and the use of Segment degrades performance.
>
>
> SumBenchmark
> .scalarArray       671030.727 ops/s
> .scalarUnsafe      669296.228 ops/s
> .unrolledArray    2600591.019 ops/s
> .unrolledUnsafe   2448826.428 ops/s
> .vectorArrayV1    7313657.874 ops/s
> .vectorArrayV2    2239302.424 ops/s
> .vectorSegmentV1  7470192.252 ops/s
> .vectorSegmentV2  2183291.818 ops/s
>
> This is more in line. Manual unrolling seems to enable some level of
> optimization, and then the vector API gives the best performance.
>
>
> Best,
> -Antoine