RFR: 8354242: VectorAPI: combine vector not operation with compare [v2]
Jatin Bhateja
jbhateja at openjdk.org
Fri Apr 25 09:20:05 UTC 2025
On Thu, 24 Apr 2025 09:37:07 GMT, erifan <duke at openjdk.org> wrote:
>> src/hotspot/share/opto/vectornode.cpp line 2243:
>>
>>> 2241: in1 = in1->in(1);
>>> 2242: }
>>> 2243: if (in1->Opcode() != Op_VectorMaskCmp || in1->outcnt() > 1 ||
>>
>> The outcnt checks on lines 2243 and 2238 can be removed. Idealization looks for a specific graph pattern and replaces it with a new node whose inputs are the same as the inputs of the pattern. GVN will retain any intermediate node that has users beyond the pattern being replaced.
>
> Thanks for the information. A more important reason to check outcnt here is to prevent this optimization when VectorMaskCmp has more than one use, because the optimization may not be worthwhile in that case. For example:
>
>
>     public static void testVectorMaskCmp() {
>         IntVector bv = IntVector.fromArray(I_SPECIES, ib, 0);
>         IntVector av = IntVector.fromArray(I_SPECIES, ia, 0);
>         VectorMask<Integer> m1 = av.compare(VectorOperators.NE, bv); // two uses
>         VectorMask<Integer> m2 = m1.not();
>         m1.intoArray(m, 0);
>         av.lanewise(VectorOperators.ABS, m2).intoArray(ia, 0);
>     }
>
>
> If we do not check outcnt and still perform this optimization, two VectorMaskCmp nodes will be generated, and ultimately two VectorMaskCmp instructions will be emitted. This is undesirable because VectorMaskCmp has much higher latency than the xor instruction on aarch64.
Thanks, we can add this comment to the code where we check outcnt. What if all the other users are also XorNodes?
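
To make the question concrete, here is a toy model of a relaxed check, not HotSpot code. The Node class, the "XorNot" op name, and both helper methods are hypothetical stand-ins chosen for illustration; the real transform works on C2 nodes in vectornode.cpp. The idea sketched is that folding the negation into the compare may still pay off when every user of the compare is a negation, since the original compare would then have no remaining users.

```java
import java.util.ArrayList;
import java.util.List;

public class OutcntSketch {
    // Hypothetical stand-in for an IR node; this is NOT the HotSpot Node class.
    static final class Node {
        final String op;
        final List<Node> users = new ArrayList<>();

        Node(String op, Node... inputs) {
            this.op = op;
            for (Node in : inputs) {
                in.users.add(this); // record def-use edges
            }
        }
    }

    // True when cmp has at least one user and every user is a negation
    // (modelled here as a single hypothetical "XorNot" op).
    static boolean allUsersAreNot(Node cmp) {
        if (cmp.users.isEmpty()) {
            return false;
        }
        for (Node user : cmp.users) {
            if (!user.op.equals("XorNot")) {
                return false;
            }
        }
        return true;
    }

    // Relaxed condition: fold the negation into the compare only if no
    // non-negating user would keep the original compare alive.
    static boolean shouldFold(Node cmp) {
        return cmp.op.equals("VectorMaskCmp") && allUsersAreNot(cmp);
    }

    public static void main(String[] args) {
        Node av = new Node("LoadVector");
        Node bv = new Node("LoadVector");
        // Shape from the example in this thread: the compare feeds a not and a store.
        Node cmp = new Node("VectorMaskCmp", av, bv);
        new Node("XorNot", cmp);
        new Node("StoreMask", cmp);
        System.out.println(shouldFold(cmp)); // the store keeps the compare alive
    }
}
```

In this model a compare with users {not, store} is left alone, which matches the testVectorMaskCmp case above, while a compare with users {not, not} is folded, on the assumption that both negations would be replaced by the same negated compare.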
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2059874975