Integrated: 8292675: Add identity transformation for removing redundant AndV/OrV nodes
Bhavana Kilambi
bkilambi at openjdk.org
Fri Sep 9 14:28:49 UTC 2022
On Mon, 5 Sep 2022 10:21:11 GMT, Bhavana Kilambi <bkilambi at openjdk.org> wrote:
> Recently we found that the rotate left/right benchmarks with vectorapi
> emit a redundant "and" instruction on both aarch64 and x86_64 machines
> which can be done away with. For example - and(and(a, b), b) generates
> two "and" instructions which can be reduced to a single "and" operation-
> and(a, b) since "and" (and "or") operations are commutative and
> idempotent in nature. This can help improve performance for all those
> workloads which have multiple "and"/"or" operations with the same value
> by reducing them to fewer "and"/"or" operations accordingly.
>
> This patch adds the following transformations for vector logical
> operations - AndV and OrV :
>
>
> (OpV (OpV a b) b) => (OpV a b)
> (OpV (OpV a b) a) => (OpV a b)
> (OpV (OpV a b m1) b m1) => (OpV a b m1)
> (OpV (OpV a b m1) a m1) => (OpV a b m1)
> (OpV a (OpV a b)) => (OpV a b)
> (OpV b (OpV a b)) => (OpV a b)
> (OpV a (OpV a b m) m) => (OpV a b m)
>
> where Op = "And", "Or"
>
> Links for benchmarks tested are given below :-
> https://github.com/openjdk/panama-vector/blob/2aade73adeabdf6a924136b17fd96ccc95c1d160/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/IntMaxVector.java#L728
> https://github.com/openjdk/panama-vector/blob/2aade73adeabdf6a924136b17fd96ccc95c1d160/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/IntMaxVector.java#L764
> https://github.com/openjdk/panama-vector/blob/2aade73adeabdf6a924136b17fd96ccc95c1d160/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/LongMaxVector.java#L728
> https://github.com/openjdk/panama-vector/blob/2aade73adeabdf6a924136b17fd96ccc95c1d160/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/LongMaxVector.java#L764
>
> Before this patch, the disassembly for one these testcases
> (IntMaxVector.ROR) for Neon is shown below :
> ```
> ldr q16, [x12, #16]
> and v16.16b, v16.16b, v20.16b
> and v16.16b, v16.16b, v20.16b
> add x12, x16, x11
> sub v17.4s, v21.4s, v16.4s
> ...
> ...
>
>
> After this patch, the disassembly for the same testcase above is shown
> below :
>
> ldr q16, [x12, #16]
> and v16.16b, v16.16b, v20.16b
> add x12, x16, x11
> sub v17.4s, v21.4s, v16.4s
> ...
> ...
>
>
> The other tests also emit an extra "and" instruction as shown above for
> the vector ROR/ROL operations.
>
> Below are the performance results for the vectorapi rotate tests (tests
> given in the links above) with this patch on aarch64 and x86_64 machines
> (for int and long types) -
>
>
> Benchmark aarch64 x86_64
> IntMaxVector.ROL 25.57% 26.09%
> IntMaxVector.ROR 23.75% 24.15%
> LongMaxVector.ROL 28.91% 28.51%
> LongMaxVector.ROR 16.51% 29.11%
>
>
>
> The percentage indicates the percent gain/improvement in performance
> (ops/ms) with this patch over the master build without this patch. The
> machine descriptions are given below -
> aarch64 - 128-bit aarch64 machine
> x86_64 - 256-bit x86 machine
This pull request has now been integrated.
Changeset: 00befddd
Author: Bhavana Kilambi <bkilambi at openjdk.org>
Committer: Nick Gasson <ngasson at openjdk.org>
URL: https://git.openjdk.org/jdk/commit/00befddd7ce97d324250807824469daaa9434eef
Stats: 288 lines in 2 files changed: 286 ins; 0 del; 2 mod
8292675: Add identity transformation for removing redundant AndV/OrV nodes
Reviewed-by: thartmann, ngasson
-------------
PR: https://git.openjdk.org/jdk/pull/10163
More information about the hotspot-compiler-dev
mailing list