RFR: 8258932: AArch64: Enhance floating-point Min/MaxReductionV with fminp/fmaxp [v2]
Andrew Haley
aph at openjdk.java.net
Tue Jan 5 11:21:57 UTC 2021
On Tue, 5 Jan 2021 03:43:15 GMT, Dong Bo <dongbo at openjdk.org> wrote:
>> This patch optimizes vectorial Min/Max reduction of two floating-point numbers on aarch64 with NEON instructions `fmaxp` and `fminp`.
>>
>> Passed jtreg tier1-3 tests with `linux-aarch64-server-fastdebug` build.
>> Tests under `test/jdk/jdk/incubator/vector/` were also run specifically to check correctness, and passed.
>>
>> Introduced a new JMH micro-benchmark `test/micro/org/openjdk/bench/vm/compiler/VectorReductionFloatingMinMax.java` for performance testing.
>> Observed about `37%` performance improvement for the double-precision reductions on Kunpeng 916. The JMH results:
>> Benchmark                (COUNT)  (seed)  Mode  Cnt    Score    Error  Units
>> # Kunpeng 916, default
>> VectorReduction.maxRedD      512       0  avgt   10  678.126 ±  0.815  ns/op
>> VectorReduction.maxRedF      512       0  avgt   10  242.958 ±  0.212  ns/op
>> VectorReduction.minRedD      512       0  avgt   10  678.554 ±  0.824  ns/op
>> VectorReduction.minRedF      512       0  avgt   10  243.368 ±  0.205  ns/op
>>
>> # Kunpeng 916, with fmaxp/fminp
>> VectorReduction.maxRedD      512       0  avgt   10  430.201 ±  0.353  ns/op  => 36.56%
>> VectorReduction.maxRedF      512       0  avgt   10  243.404 ±  0.297  ns/op
>> VectorReduction.minRedD      512       0  avgt   10  427.805 ±  0.528  ns/op  => 36.89%
>> VectorReduction.minRedF      512       0  avgt   10  242.963 ±  0.210  ns/op
>
> Dong Bo has updated the pull request incrementally with one additional commit since the last revision:
>
> rebase the .m4 entry
This all looks reasonable. With the assembler change I think we should be good to go.
src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 2451:
> 2449: #undef INSN
> 2450:
> 2451: #define INSN(NAME, opc) \
This should be in an "AdvSIMD scalar pairwise" instruction group with faddp.
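
For readers following the thread, the operation being optimized can be modeled in scalar Java. This is only a sketch of what a Min/Max reduction over a two-lane vector has to compute. The class and method names (`MinMaxPairSketch`, `maxPair`, `minPair`, `maxReduce`) are hypothetical and are not taken from the patch or from the JMH benchmark. It assumes only that the reduction must preserve the `Math.max`/`Math.min` rules for NaN and signed zero.

```java
// Hypothetical illustration; not code from the patch or its benchmark.
final class MinMaxPairSketch {

    // Scalar model of a max reduction over a two-lane vector of doubles.
    static double maxPair(double lane0, double lane1) {
        return Math.max(lane0, lane1);
    }

    // Scalar model of a min reduction over a two-lane vector of doubles.
    static double minPair(double lane0, double lane1) {
        return Math.min(lane0, lane1);
    }

    // Loop-shaped max reduction over an array, starting from negative infinity.
    static double maxReduce(double[] values) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            max = Math.max(max, v);
        }
        return max;
    }

    public static void main(String[] args) {
        // Ordinary case: the larger lane wins.
        System.out.println(maxPair(1.5, -2.0));
        // NaN in either lane makes the result NaN.
        System.out.println(Double.isNaN(maxPair(Double.NaN, 1.0)));
        // -0.0 is treated as smaller than 0.0; dividing exposes the sign.
        System.out.println(1.0 / minPair(0.0, -0.0));
    }
}
```

Any instruction sequence selected for the reduction has to give the same answers as these methods in the NaN and signed-zero cases, which is why the correctness tests mentioned above matter alongside the benchmark numbers.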
-------------
PR: https://git.openjdk.java.net/jdk/pull/1925