RFR: 8308340: C2: Idealize Fma nodes [v5]
Fei Gao
fgao at openjdk.org
Mon Aug 14 07:42:33 UTC 2023
On Thu, 10 Aug 2023 13:16:04 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Fei Gao has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits:
>>
>> - Merge branch 'master' into fg8308340
>> - Merge branch 'master' into fg8308340
>> - Merge branch 'master' into fg8308340
>> - Move check for UseFMA from c2compiler.cpp to Matcher::match_rule_supported in .ad files
>> - Merge branch 'master' into fg8308340
>> - 8308340: C2: Idealize Fma nodes
>>
>> Some platforms, like aarch64, ppc, and riscv, support fusing
>> `Math.fma(-a, b, c)` or `Math.fma(a, -b, c)` by generating
>> partially symmetric match rules like:
>>
>> ```
>> match(Set dst (FmaF src3 (Binary (NegF src1) src2)));
>> match(Set dst (FmaF src3 (Binary src1 (NegF src2))));
>> ```
>>
>> Since `Fma` is partially commutative (its two multiplicands can
>> be swapped), the patch converts `Math.fma(-a, b, c)` to
>> `Math.fma(b, -a, c)` in the GVN phase, making node patterns
>> canonical. Then we can remove the redundant rules.
>>
>> Also, we should guarantee that C2 generates `Fma` nodes only on
>> platforms supporting `Fma` instructions before the matcher runs,
>> so we can remove `predicate(UseFMA)` from all `Fma` rules.
>>
>> After the patch, the code size of libjvm.so on the aarch64
>> platform decreased by 63.4k.
>>
>> The patch passed all tier 1-3 tests on aarch64 and x86 platforms.
>
> src/hotspot/share/opto/mulnode.cpp line 1717:
>
>> 1715: //------------------------------Ideal------------------------------------------
>> 1716: Node* FmaNode::Ideal(PhaseGVN* phase, bool can_reshape) {
>> 1717: // We canonicalize the node by converting "(-a)*b+c" into "b*(-a)+c"
>
> Add motivation to comment
>
>
> // This reduces the number of rules in the matcher, as we only need to check
> // for negations on the second argument, and not the symmetric case where
> // the first argument is negated.
Thanks! Done.
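
For readers following the thread: the canonicalization is safe because the product inside a fused multiply-add does not depend on operand order or on which multiplicand carries the negation. The snippet below is only an illustrative sketch, not code from the patch; the class and method names (`FmaCanonicalDemo`, `sameBits`, `agree`) are hypothetical. It compares the three spellings of `-(a*b) + c` bit for bit using the standard `Math.fma(float, float, float)`.

```java
class FmaCanonicalDemo {
    // Hypothetical helper: true when both floats have the same bit pattern.
    // Float.floatToIntBits collapses all NaNs to one canonical value.
    static boolean sameBits(float x, float y) {
        return Float.floatToIntBits(x) == Float.floatToIntBits(y);
    }

    // Hypothetical helper: checks that negating the first multiplicand,
    // swapping the multiplicands, and negating the second multiplicand
    // all give the same result for one input triple.
    static boolean agree(float a, float b, float c) {
        float negFirst = Math.fma(-a, b, c);   // pattern before canonicalization
        float swapped = Math.fma(b, -a, c);    // canonical form from the patch description
        float negSecond = Math.fma(a, -b, c);  // the other symmetric match rule
        return sameBits(negFirst, swapped) && sameBits(negFirst, negSecond);
    }

    public static void main(String[] args) {
        // A few ordinary and special values, chosen arbitrarily for illustration.
        float[] samples = {
            0.0f, -0.0f, 1.5f, -3.25f, 1.0e-30f, 3.0e38f,
            Float.POSITIVE_INFINITY, Float.NaN
        };
        boolean all = true;
        for (float a : samples) {
            for (float b : samples) {
                for (float c : samples) {
                    all &= agree(a, b, c);
                }
            }
        }
        System.out.println(all ? "all forms agree" : "mismatch found");
    }
}
```

This only demonstrates the arithmetic identity at the Java level; it says nothing about which match rule a given backend selects.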
> test/hotspot/jtreg/compiler/vectorapi/VectorFusedMultiplyAddSubTest.java line 63:
>
>> 61: private static final VectorSpecies<Short> S_SPECIES = ShortVector.SPECIES_MAX;
>> 62:
>> 63: private static int LENGTH = 128;
>
> What is the reason for the reduction? Speed?
Yes, it's to speed up the test.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14576#discussion_r1293070063
PR Review Comment: https://git.openjdk.org/jdk/pull/14576#discussion_r1293069544
More information about the hotspot-compiler-dev mailing list