Array addition and array sum Panama benchmarks

Roland Westrelin rwestrel at redhat.com
Thu Mar 21 14:17:57 UTC 2024


> 1. disjointness analysis doesn't work for all heap, which is a known
> issue

Emanuel, who's been reworking autovectorization support, mentioned it's
covered by:

https://bugs.openjdk.org/browse/JDK-8324751

This one:
https://bugs.openjdk.org/browse/JDK-8317424

lists all the improvements to autovectorization being considered.

> The fix I came up with yesterday seems a reasonable stop-gap solution 
> for (2): if the memory var handle is fully aligned, and its endianness 
> is == platform endianness, then don't bother with the long -> double 
> trip and just use Unsafe::getDouble. That said, this fix will only work 
> under these conditions (aligned _plain_ access with right endianness). 
> Anything else will fall back to the old pattern. This tweak shouldn't 
> cost anything, as these conditions are invariants for a given var handle 
> instance (whose final fields are trusted, as defined in 
> "java.lang.invoke"), which is typically held in a static final field, so 
> everything should be known to the JIT. If we want to address that at the 
> vectorizer level, it will probably require deeper changes which treat 
> the Unsafe.getLong + Long.longBitsToDouble as a single operation.

One solution would be for C2 to transform the long memory load + long
to double move into a double memory load (an Ideal transformation). The
code would then vectorize with no change required to the vectorizer.
That seems fairly straightforward as a change.
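
To illustrate at the source level what the two shapes look like, here is
a minimal sketch. It is not C2 code and not the benchmark from this
thread: the class name LoadEquivalence, its methods, and the use of a
heap ByteBuffer in native byte order are assumptions made for the
example. sumViaLong mirrors the long load + Long.longBitsToDouble
pattern, and sumDirect is the plain double load that the transformation
would effectively produce.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; the names are not from the JDK or the thread.
final class LoadEquivalence {

    // Shape discussed in the thread: load 8 bytes as a long,
    // then reinterpret those bits as a double.
    static double sumViaLong(ByteBuffer buf, int count) {
        double sum = 0;
        for (int i = 0; i < count; i++) {
            long bits = buf.getLong(i * Double.BYTES);
            sum += Double.longBitsToDouble(bits);
        }
        return sum;
    }

    // Shape after the proposed rewrite: a single double load per element.
    static double sumDirect(ByteBuffer buf, int count) {
        double sum = 0;
        for (int i = 0; i < count; i++) {
            sum += buf.getDouble(i * Double.BYTES);
        }
        return sum;
    }

    // Builds a buffer in platform byte order holding 0.0, 0.5, 1.0, ...
    static ByteBuffer sample(int count) {
        ByteBuffer buf = ByteBuffer.allocate(count * Double.BYTES)
                                   .order(ByteOrder.nativeOrder());
        for (int i = 0; i < count; i++) {
            buf.putDouble(i * Double.BYTES, i * 0.5);
        }
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = sample(1024);
        // Both loops read the same values in the same order.
        System.out.println(sumViaLong(buf, 1024) == sumDirect(buf, 1024));
    }
}
```

For this sample data both methods return the same sum. The point of the
proposal is that C2 would perform the equivalent rewrite on its own IR,
so user code keeps the first shape.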

Roland.


