Integrated: 8282926: AArch64: Optimize out WHILELO with PTRUE
Eric Liu
eliu at openjdk.java.net
Wed Mar 30 15:02:38 UTC 2022
On Fri, 11 Mar 2022 10:40:00 GMT, Eric Liu <eliu at openjdk.org> wrote:
> This patch uses the PTRUE instruction instead of WHILELO to create
> vector masks for certain lengths. PTRUE is more efficient than WHILELO
> according to the software optimization guides for Neoverse N2[1],
> Neoverse V1[2], and A64FX[3].
>
> The generated code changes as shown below:
>
> Before:
>
> 0x0000ffff6d4747b4: orr x8, xzr, #0x10
> 0x0000ffff6d4747b8: whilelo p0.b, xzr, x8
>
> After:
>
> 0x0000ffff89476aec: ptrue p0.b, vl16
>
> The microbenchmark improves by 15% to 20% on my SVE test system.
>
> [TEST]
> jdk/incubator/vector and hotspot/compiler/vectorapi passed on my SVE
> test machine.
>
> [1] https://developer.arm.com/documentation/PJDOC-466751330-18256/0001
> [2] https://developer.arm.com/documentation/pjdoc466751330-9685/latest/
> [3] https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.6.pdf
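
As an illustration of the "certain length" condition in the description above, here is a minimal sketch and not the code from the patch. It assumes the standard SVE predicate-constraint patterns with a fixed element count (VL1 to VL8, VL16, VL32, VL64, VL128, VL256) and assumes the hardware vector holds at least that many elements. The class and method names (PtruePattern, fixedPattern, maskSetup) are hypothetical and do not appear in HotSpot.

```java
import java.util.Optional;

// Hypothetical helper, for illustration only; not part of the HotSpot change.
class PtruePattern {
    // Returns the fixed-count PTRUE pattern that activates exactly laneCount
    // leading lanes, or empty when SVE has no such pattern.
    // Assumption: the hardware vector has at least laneCount elements;
    // otherwise a VLn pattern activates no lanes at all.
    static Optional<String> fixedPattern(int laneCount) {
        if (laneCount >= 1 && laneCount <= 8) {
            return Optional.of("vl" + laneCount);
        }
        switch (laneCount) {
            case 16: case 32: case 64: case 128: case 256:
                return Optional.of("vl" + laneCount);
            default:
                return Optional.empty();
        }
    }

    // Picks the single-instruction PTRUE form when a pattern exists and falls
    // back to the two-instruction WHILELO sequence otherwise.
    static String maskSetup(int laneCount) {
        return fixedPattern(laneCount)
                .map(p -> "ptrue p0.b, " + p)
                .orElse("mov x8, #" + laneCount + "; whilelo p0.b, xzr, x8");
    }

    public static void main(String[] args) {
        for (int n : new int[] {8, 16, 24, 32}) {
            System.out.println(n + " lanes: " + maskSetup(n));
        }
    }
}
```

For 16 byte lanes the sketch selects "ptrue p0.b, vl16", which matches the "After" listing. A count such as 24 has no fixed pattern, so the sketch keeps the WHILELO sequence.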
This pull request has now been integrated.
Changeset: e8e9b8dc
Author: Eric Liu <eliu at openjdk.org>
Committer: Nick Gasson <ngasson at openjdk.org>
URL: https://git.openjdk.java.net/jdk/commit/e8e9b8dc89059b606d16ba950f0d7e57185151e7
Stats: 174 lines in 3 files changed: 0 ins; 71 del; 103 mod
8282926: AArch64: Optimize out WHILELO with PTRUE
Reviewed-by: njian, ngasson
-------------
PR: https://git.openjdk.java.net/jdk/pull/7786