RFR: 8354242: VectorAPI: combine vector not operation with compare [v2]
Jatin Bhateja
jbhateja at openjdk.org
Wed May 7 11:09:16 UTC 2025
On Fri, 25 Apr 2025 09:17:02 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> Thanks for the information. Another, more important reason to check outcnt here is to prevent this optimization when VectorMaskCmp has more than one use, because the optimization may not be worthwhile in that case. For example:
>>
>>
>> public static void testVectorMaskCmp() {
>>     IntVector bv = IntVector.fromArray(I_SPECIES, ib, 0);
>>     IntVector av = IntVector.fromArray(I_SPECIES, ia, 0);
>>     VectorMask<Integer> m1 = av.compare(VectorOperators.NE, bv); // two uses
>>     VectorMask<Integer> m2 = m1.not();
>>     m1.intoArray(m, 0);
>>     av.lanewise(VectorOperators.ABS, m2).intoArray(ia, 0);
>> }
>>
>>
>> If we skip the outcnt check and still apply this optimization, two VectorMaskCmp nodes are created, and two VectorMaskCmp instructions are eventually emitted. That is a poor trade, because VectorMaskCmp has much higher latency than the xor instruction on aarch64.
>
> Thanks, we can add this comment to the code where we are checking outcnt. What if all the other users are also XorNodes?
At present, you are checking for exactly one XOR user; shouldn't it be an all-or-one scenario, i.e. either a single XOR user or all users being XOR nodes?
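
To make the two conditions concrete, here is a rough sketch in plain Java. It is a toy model only: the `Node`, `Kind`, `singleNotUser` and `allUsersAreNot` names are hypothetical and are not the C2 classes or the code in the PR. It just contrasts "exactly one XOR user" with "every user is an XOR".

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical classes for illustration, NOT HotSpot C2 code.
public class MaskCmpUseCheck {
    enum Kind { VECTOR_MASK_CMP, XOR_ALL_ONES, OTHER }

    static final class Node {
        final Kind kind;
        final List<Node> users = new ArrayList<>();

        Node(Kind kind) { this.kind = kind; }

        // Records a user of this node and returns this node for chaining.
        Node usedBy(Node user) { users.add(user); return this; }

        // Number of users, loosely analogous to outcnt in the thread.
        int outcnt() { return users.size(); }
    }

    // "One" scenario: the compare has exactly one user and that user is a NOT.
    static boolean singleNotUser(Node cmp) {
        return cmp.kind == Kind.VECTOR_MASK_CMP
            && cmp.outcnt() == 1
            && cmp.users.get(0).kind == Kind.XOR_ALL_ONES;
    }

    // "All" scenario: the compare has at least one user and every user is a NOT.
    static boolean allUsersAreNot(Node cmp) {
        return cmp.kind == Kind.VECTOR_MASK_CMP
            && cmp.outcnt() > 0
            && cmp.users.stream().allMatch(u -> u.kind == Kind.XOR_ALL_ONES);
    }

    public static void main(String[] args) {
        // Mirrors the quoted example: m1 feeds both intoArray() and not().
        Node mixed = new Node(Kind.VECTOR_MASK_CMP)
            .usedBy(new Node(Kind.OTHER))
            .usedBy(new Node(Kind.XOR_ALL_ONES));
        System.out.println(singleNotUser(mixed));   // false
        System.out.println(allUsersAreNot(mixed));  // false

        // The case asked about: several users, all of them NOTs.
        Node allNots = new Node(Kind.VECTOR_MASK_CMP)
            .usedBy(new Node(Kind.XOR_ALL_ONES))
            .usedBy(new Node(Kind.XOR_ALL_ONES));
        System.out.println(singleNotUser(allNots));  // false
        System.out.println(allUsersAreNot(allNots)); // true
    }
}
```

In this model the mixed-use case from the quoted example is rejected by both predicates, so the extra VectorMaskCmp is never created, while the all-XOR case is accepted only by the second predicate. Whether the real transform can keep a single compare in that case is a separate question this sketch does not answer.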
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2077378879