Array addition and array sum Panama benchmarks

Antoine Chambille ach at activeviam.com
Mon Sep 30 12:04:24 UTC 2024


Hello everyone,

I've rebuilt the latest OpenJDK (24) from
https://github.com/openjdk/panama-vector and rerun the array addition
benchmark:

AddBenchmark
 .scalarArrayArray            thrpt    5   6487636 ops/s
 .scalarArrayArrayLongStride  thrpt    5   1001515 ops/s
 .scalarSegmentArray          thrpt    5   1747531 ops/s
 .scalarSegmentSegment        thrpt    5   1154193 ops/s
 .scalarUnsafeArray           thrpt    5   6970073 ops/s
 .scalarUnsafeUnsafe          thrpt    5   1246625 ops/s
 .unrolledArrayArray          thrpt    5   1251824 ops/s
 .unrolledSegmentArray        thrpt    5   1694164 ops/s
 .unrolledUnsafeArray         thrpt    5   5043685 ops/s
 .unrolledUnsafeUnsafe        thrpt    5   1197024 ops/s
 .vectorArrayArray            thrpt    5   7200224 ops/s
 .vectorArraySegment          thrpt    5   7377553 ops/s
 .vectorSegmentArray          thrpt    5   7263505 ops/s
 .vectorSegmentSegment        thrpt    5   7143647 ops/s
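
For readers unfamiliar with the variant names, here is a minimal sketch of
what the "scalar" and "unrolled" array loops presumably look like. The class
and method names are hypothetical, and the double element type and 4-way
unroll factor are assumptions; this is not the benchmark's actual source.

```java
import java.util.Arrays;

// Hypothetical sketch, not the benchmark's actual code.
final class ArrayAddSketch {

    // "scalar" variant: a simple counted loop, one element per iteration.
    static void scalarAdd(double[] a, double[] b, double[] result) {
        for (int i = 0; i < result.length; i++) {
            result[i] = a[i] + b[i];
        }
    }

    // "unrolled" variant: four elements per iteration (assumed factor),
    // followed by a tail loop for the remaining 0 to 3 elements.
    static void unrolledAdd(double[] a, double[] b, double[] result) {
        int i = 0;
        int bound = result.length - 3;
        for (; i < bound; i += 4) {
            result[i]     = a[i]     + b[i];
            result[i + 1] = a[i + 1] + b[i + 1];
            result[i + 2] = a[i + 2] + b[i + 2];
            result[i + 3] = a[i + 3] + b[i + 3];
        }
        for (; i < result.length; i++) {
            result[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4, 5, 6};
        double[] b = {10, 20, 30, 40, 50, 60};
        double[] r1 = new double[a.length];
        double[] r2 = new double[a.length];
        scalarAdd(a, b, r1);
        unrolledAdd(a, b, r2);
        // Both variants must compute the same sums.
        System.out.println(Arrays.equals(r1, r2));
    }
}
```

Both methods compute the same result; they differ only in the loop shape
handed to the JIT compiler.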


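   - Performance with the Vector API is now consistent and good across
   arrays and segments (7.1M to 7.4M ops/s in all four variants).
   - Reading from and writing to segments in scalar loops still seems to
   disrupt auto-vectorization (1.2M to 1.7M ops/s, versus 6.5M for plain
   arrays). Reading with Unsafe works well, but it's marked for removal.
   - Less importantly, manual unrolling also seems to disrupt
   auto-vectorization (unrolledArrayArray reaches 1.25M ops/s, versus
   6.49M for scalarArrayArray).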
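
To make the segment case concrete, here is a minimal sketch of a scalar loop
over memory segments, the access pattern that appears not to be
auto-vectorized. The class and method names are hypothetical and the double
element type is an assumption. For brevity the segments wrap heap arrays via
MemorySegment.ofArray; the actual benchmark may well use off-heap segments.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.util.Arrays;

// Hypothetical sketch, not the benchmark's actual code.
final class SegmentAddSketch {

    // Scalar element-wise addition where inputs and output are all segments.
    static void scalarAdd(MemorySegment a, MemorySegment b, MemorySegment result) {
        long count = result.byteSize() / Double.BYTES;
        for (long i = 0; i < count; i++) {
            double sum = a.getAtIndex(ValueLayout.JAVA_DOUBLE, i)
                       + b.getAtIndex(ValueLayout.JAVA_DOUBLE, i);
            result.setAtIndex(ValueLayout.JAVA_DOUBLE, i, sum);
        }
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4};
        double[] b = {10, 20, 30, 40};
        double[] out = new double[a.length];
        // Heap segments backed by the arrays; writes land in 'out'.
        scalarAdd(MemorySegment.ofArray(a),
                  MemorySegment.ofArray(b),
                  MemorySegment.ofArray(out));
        System.out.println(Arrays.toString(out));
    }
}
```

This requires a JDK where java.lang.foreign is final (22 or later).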



Best,
-Antoine

On Tue, Mar 26, 2024 at 5:40 PM Vladimir Ivanov <
vladimir.x.ivanov at oracle.com> wrote:

>
> >> Personally, I prefer to see vectorizer handling "MoveX2Y (LoadX mem)"
> >> => "VectorReinterpret (LoadVector mem)" well and then introduce rules to
> >> strength-reduce it to mismatched access.
> >
> > Do I understand you right that you're saying the vector node for MoveL2D
> > (for instance) is VectorReinterpret so we could vectorize the code.
> >
> > Are you then suggesting that we can transform:
> >
> > (VectorReinterpret (LoadVector mem))
> >
> > into:
> >
> > (LoadVector mem)
> >
> > with that LoadVector a mismatched access?
>
> Yes, but thinking more about it, the latter step may be optional. For
> example, VectorReinterpret implementation on x86 is a no-op, so not much
> gained from folding VectorReinterpret+LoadVector into a mismatched
> LoadVector.
>
> Best regards,
> Vladimir Ivanov
>