RFR: 8308340: C2: Idealize Fma nodes [v5]

Emanuel Peter epeter at openjdk.org
Thu Aug 10 13:39:28 UTC 2023


On Mon, 24 Jul 2023 04:12:02 GMT, Fei Gao <fgao at openjdk.org> wrote:

>> Some platforms, like aarch64, ppc, and riscv, support fusing `Math.fma(-a, b, c)` or `Math.fma(a, -b, c)` by generating partially symmetric match rules like:
>> 
>> 
>>   match(Set dst (FmaF src3 (Binary (NegF src1) src2)));
>>   match(Set dst (FmaF src3 (Binary src1 (NegF src2))));
>> 
>> 
>> Since `Fma` is partially commutative, the patch is to convert `Math.fma(-a, b, c)` to `Math.fma(b, -a, c)` in gvn phase, making node patterns canonical. Then we can remove redundant rules.
>> 
>> Also, we should guarantee that C2 generates `Fma` nodes only on platforms supporting `Fma` instructions before matcher, so we can remove all `predicate(UseFMA)` for all `Fma` rules.
>> 
>> After the patch, the code size of libjvm.so on aarch64 platform decreased by 63.4k.
>> 
>> The patch passed all tier 1 - 3 on aarch64 and x86 platforms.
>
> Fei Gao has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits:
> 
>  - Merge branch 'master' into fg8308340
>  - Merge branch 'master' into fg8308340
>  - Merge branch 'master' into fg8308340
>  - Move check for UseFMA from c2compiler.cpp to Matcher::match_rule_supported in .ad files
>  - Merge branch 'master' into fg8308340
>  - 8308340: C2: Idealize Fma nodes
>    
>    Some platforms, like aarch64, ppc, and riscv, support fusing
>    `Math.fma(-a, b, c)` or `Math.fma(a, -b, c)` by generating
>    partially symmetric match rules like:
>    
>    ```
>      match(Set dst (FmaF src3 (Binary (NegF src1) src2)));
>      match(Set dst (FmaF src3 (Binary src1 (NegF src2))));
>    ```
>    
>    Since `Fma` is partially communitive, the patch is to convert
>    `Math.fma(-a, b, c)` to `Math.fma(b, -a, c)` in gvn phase,
>    making node patterns canonical. Then we can remove redundant
>    rules.
>    
>    Also, we should guarantee that C2 generates `Fma` nodes only on
>    platforms supporting `Fma` instructions before matcher, so we
>    can remove all `predicate(UseFMA)` for all `Fma` rules.
>    
>    After the patch, the code size of libjvm.so on aarch64 platform
>    decreased by 63.4k.
>    
>    The patch passed all tier 1 - 3 on aarch64 and x86 platforms.

@fg1417 This looks like a reasonable refactoring.

We should probably do the verification that the canonicalization happened, if the normal `fma` matcher rule is chosen. We should add asserts that the first argument is not a negation (you could check the second argument also, just in case). What do you think?

src/hotspot/cpu/x86/x86.ad line 3975:

> 3973: // a * b + c
> 3974: instruct fmaF_reg(regF a, regF b, regF c) %{
> 3975:   predicate(UseFMA);

You could add an assert to the encoding code. Just to ensure that we do not generate bad code, even if it is never executed during testing.

src/hotspot/share/opto/mulnode.cpp line 1717:

> 1715: //------------------------------Ideal------------------------------------------
> 1716: Node* FmaNode::Ideal(PhaseGVN* phase, bool can_reshape) {
> 1717:   // We canonicalize the node by converting "(-a)*b+c" into "b*(-a)+c"

Add motivation to comment


// This reduces the number of rules in the matcher, as we only need to check
// for negations on the second argument, and not the symmetric case where
// the first argument is negated.

test/hotspot/jtreg/compiler/vectorapi/VectorFusedMultiplyAddSubTest.java line 63:

> 61:     private static final VectorSpecies<Short> S_SPECIES = ShortVector.SPECIES_MAX;
> 62: 
> 63:     private static int LENGTH = 128;

What is the reason for the reduction? Speed?

-------------

PR Review: https://git.openjdk.org/jdk/pull/14576#pullrequestreview-1571816582
PR Review Comment: https://git.openjdk.org/jdk/pull/14576#discussion_r1290116448
PR Review Comment: https://git.openjdk.org/jdk/pull/14576#discussion_r1290112263
PR Review Comment: https://git.openjdk.org/jdk/pull/14576#discussion_r1290076923


More information about the hotspot-compiler-dev mailing list