RFR: 8308340: C2: Idealize Fma nodes [v3]
Fei Gao
fgao at openjdk.org
Wed Jun 28 08:17:37 UTC 2023
> Some platforms, like aarch64, ppc, and riscv, support fusing `Math.fma(-a, b, c)` or `Math.fma(a, -b, c)` by generating partially symmetric match rules like:
>
>
> match(Set dst (FmaF src3 (Binary (NegF src1) src2)));
> match(Set dst (FmaF src3 (Binary src1 (NegF src2))));
>
>
> Since `Fma` is partially commutative (its two multiplicands can be swapped), the patch converts `Math.fma(-a, b, c)` to `Math.fma(b, -a, c)` during the GVN phase, making the node patterns canonical. The redundant rules can then be removed.
>
> In addition, C2 should generate `Fma` nodes only on platforms that support `Fma` instructions, and this should be guaranteed before the matcher runs, so that `predicate(UseFMA)` can be removed from all `Fma` rules.
>
> After the patch, the code size of libjvm.so on the aarch64 platform decreased by 63.4k.
>
> The patch passed all tier 1-3 tests on aarch64 and x86 platforms.
Fei Gao has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
- Merge branch 'master' into fg8308340
- Move check for UseFMA from c2compiler.cpp to Matcher::match_rule_supported in .ad files
- Merge branch 'master' into fg8308340
- 8308340: C2: Idealize Fma nodes
Some platforms, like aarch64, ppc, and riscv, support fusing
`Math.fma(-a, b, c)` or `Math.fma(a, -b, c)` by generating
partially symmetric match rules like:
```
match(Set dst (FmaF src3 (Binary (NegF src1) src2)));
match(Set dst (FmaF src3 (Binary src1 (NegF src2))));
```
Since `Fma` is partially commutative (its two multiplicands can be
swapped), the patch converts `Math.fma(-a, b, c)` to
`Math.fma(b, -a, c)` during the GVN phase, making the node patterns
canonical. The redundant rules can then be removed.
In addition, C2 should generate `Fma` nodes only on platforms that
support `Fma` instructions, and this should be guaranteed before the
matcher runs, so that `predicate(UseFMA)` can be removed from all
`Fma` rules.
After the patch, the code size of libjvm.so on the aarch64 platform
decreased by 63.4k.
The patch passed all tier 1-3 tests on aarch64 and x86 platforms.
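The canonicalization described above relies on the three spellings of a
negated multiplicand being numerically identical at the Java level. The
following is a minimal sketch of that property, not code from the patch;
the class name `FmaCanonicalForm`, the helper `sameBits`, and the sample
values are hypothetical and chosen only for illustration:
```java
public class FmaCanonicalForm {
    // Returns true when the three spellings give bit-identical float results.
    // Float.floatToIntBits collapses all NaN values to one canonical pattern,
    // so NaN results compare equal here as well.
    static boolean sameBits(float a, float b, float c) {
        int negFirst  = Float.floatToIntBits(Math.fma(-a, b, c)); // original form
        int swapped   = Float.floatToIntBits(Math.fma(b, -a, c)); // canonical form
        int negSecond = Float.floatToIntBits(Math.fma(a, -b, c)); // other matched form
        return negFirst == swapped && swapped == negSecond;
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs: ordinary values, signed zeros,
        // an overflowing product, and a NaN operand.
        float[][] samples = {
            { 1.5f, 2.25f, -0.125f },
            { 0.0f, -3.0f, 0.0f },
            { 1e20f, 1e20f, Float.NEGATIVE_INFINITY },
            { Float.NaN, 1.0f, 2.0f },
        };
        for (float[] s : samples) {
            System.out.println(sameBits(s[0], s[1], s[2]));
        }
    }
}
```
Because negating either multiplicand, or swapping the two, leaves the
exact product unchanged, a single match rule for the canonical shape
should cover all of these source forms.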
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/14576/files
- new: https://git.openjdk.org/jdk/pull/14576/files/a22814d8..06162d88
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=14576&range=02
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=14576&range=01-02
Stats: 7098 lines in 250 files changed: 2631 ins; 2299 del; 2168 mod
Patch: https://git.openjdk.org/jdk/pull/14576.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/14576/head:pull/14576
PR: https://git.openjdk.org/jdk/pull/14576