[aarch64-port-dev] RFR(S): 8239549: AArch64: Backend support for MulAddVS2VI node
Andrew Haley
aph at redhat.com
Mon Feb 24 13:57:21 UTC 2020
Hi,
On 2/24/20 9:43 AM, Pengfei Li wrote:
> The value of the VM flag "AlignVector" is changed in this patch for the
> following reason. As "vector memory operations could be misaligned when
> accesses to arrays of different types are vectorized in one loop" [3],
> the current C2 superword.cpp doesn't vectorize this kind of loop unless
> it's guaranteed that the unaligned loads/stores won't incur performance
> penalties. Hence, the x86 backend sets "AlignVector" to the opposite of
> another x86 flag, "UseUnalignedLoadStores", which indicates whether the
> x86 instruction MOVDQU can be used to load/store unaligned memory. On
> AArch64, we have a flag "AvoidUnalignedAccesses" indicating whether we
> need to avoid unaligned loads/stores on the current AArch64
> micro-architecture. So we assign "AlignVector" from this.
>
> [Tests]
> Jtreg: hotspot::hotspot_all_no_apps, jdk::jdk_core and langtools::tier1.
> No new failures found.
>
> JMH: Derived a JMH case [4] from the jtreg test in the x86 patch [1].
> Before:
> Benchmark                        Mode  Cnt    Score    Error  Units
> TestSIMDMulAddS2I.testMulAddS2I  avgt   15  260.827 ± 13.864  us/op
> After:
> Benchmark                        Mode  Cnt    Score    Error  Units
> TestSIMDMulAddS2I.testMulAddS2I  avgt   15   48.297 ±  0.149  us/op
>
> [1] http://hg.openjdk.java.net/jdk/jdk/rev/4bb6e0871bf7
> [2] https://en.wikichip.org/wiki/x86/avx512vnni
> [3] https://bugs.openjdk.java.net/browse/JDK-7199010
> [4] http://cr.openjdk.java.net/~pli/rfr/8239549/TestSIMDMulAddS2I.java
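
For readers without the linked test to hand, the loop shape that a
MulAddVS2VI node is meant to cover can be sketched roughly as below. This
is an assumption based on the node and benchmark names, not a copy of the
test in [4]; the class name MulAddS2ISketch and the method mulAdd are made
up for illustration. Note that the loop reads short[] and writes int[],
which is the "arrays of different types in one loop" case described above.

```java
public class MulAddS2ISketch {
    // Scalar form of the multiply-add pattern: two adjacent short products
    // are summed into one int. Assumes b is at least as long as a; a
    // trailing odd element of a is ignored.
    static int[] mulAdd(short[] a, short[] b) {
        int[] out = new int[a.length / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = a[2 * i] * b[2 * i] + a[2 * i + 1] * b[2 * i + 1];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] r = mulAdd(new short[] {1, 2, 3, 4}, new short[] {5, 6, 7, 8});
        // 1*5 + 2*6 = 17, 3*7 + 4*8 = 53
        System.out.println(r[0] + " " + r[1]);
    }
}
```

Whether C2 actually vectorizes such a loop depends on the platform and on
flags such as AlignVector, so this shows only the source-level pattern.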
Seems to work, although I'm not seeing the dramatic speedup that you are.
Before:
Benchmark                        Mode  Cnt    Score   Error  Units
TestSIMDMulAddS2I.testMulAddS2I  avgt    8  164.148 ± 0.490  us/op
After:
Benchmark                        Mode  Cnt   Score   Error  Units
TestSIMDMulAddS2I.testMulAddS2I  avgt    8  71.981 ± 0.280  us/op
I guess my processor is both slower and faster than yours. :-)
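
For reference, the speedups implied by the two sets of figures can be
worked out with a small sketch like the one below. The class and method
names are illustrative only; the inputs are the before/after scores quoted
in this thread.

```java
import java.util.Locale;

public class SpeedupSketch {
    // Ratio of average time per operation before and after the patch.
    static double speedup(double beforeUs, double afterUs) {
        return beforeUs / afterUs;
    }

    public static void main(String[] args) {
        // Scores from the quoted message, then from the reply.
        System.out.printf(Locale.ROOT, "quoted run: %.2fx%n", speedup(260.827, 48.297));
        System.out.printf(Locale.ROOT, "reply run:  %.2fx%n", speedup(164.148, 71.981));
    }
}
```

That comes to roughly 5.4x on the quoted machine against roughly 2.3x
here. The reply machine is faster before the patch and slower after it.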
Approved.
--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671