[vectorIntrinsics] RFR: Improve mask reduction operations on AVX [v3]

Mai Đặng Quân Anh duke at openjdk.java.net
Tue Nov 9 16:33:46 UTC 2021


On Tue, 9 Nov 2021 16:14:26 GMT, Mai Đặng Quân Anh <duke at openjdk.java.net> wrote:

>> Hi,
>> This patch improves vector mask reduction operations on AVX, especially for int, float, long, and double lanes, by using the vmovmskpd and vmovmskps instructions. I also did a little refactoring to reduce duplication in toLong. For now, however, the patch disables these operations on Neon.
>> Thank you very much.
>
> Mai Đặng Quân Anh has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - support for non-bmi, some refinement
>  - restore VectorStoreMaskNode, move logic to backend

The latest change reverts the mid-end modification and lets the x86 back-end perform the elision during matching. It also makes some minor changes to C2_MacroAssembler that improve the code emitted for firstTrue operations and narrow the instructions used for index calculations from 64 bits to 32 bits.
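For readers less familiar with the idea, here is a rough scalar model in Java of how mask reductions can be computed once the mask has been packed into one bit per lane, which is what vmovmskps and vmovmskpd produce. It is only a sketch and not the C2 code in this patch: the class and method names are made up for illustration, the mask is modelled as a boolean[] with fewer than 32 lanes, and the int arithmetic is merely meant to suggest why 32-bit index calculations suffice.

```java
// Hypothetical scalar model; not the code emitted by C2_MacroAssembler.
final class MaskReductionSketch {
    // Emulates a movmsk-style instruction: one bit per lane, lane 0 in bit 0.
    // Assumes mask.length < 32 so that everything fits in an int.
    static int toBits(boolean[] mask) {
        int bits = 0;
        for (int i = 0; i < mask.length; i++) {
            if (mask[i]) {
                bits |= 1 << i;
            }
        }
        return bits;
    }

    // Number of set lanes: a population count of the packed bits.
    static int trueCount(boolean[] mask) {
        return Integer.bitCount(toBits(mask));
    }

    // Index of the first set lane, or mask.length if no lane is set.
    // The sentinel bit at position mask.length handles the all-false case.
    static int firstTrue(boolean[] mask) {
        int bits = toBits(mask) | (1 << mask.length);
        return Integer.numberOfTrailingZeros(bits);
    }

    // Index of the last set lane, or -1 if no lane is set
    // (numberOfLeadingZeros(0) is 32, so 31 - 32 gives -1).
    static int lastTrue(boolean[] mask) {
        return 31 - Integer.numberOfLeadingZeros(toBits(mask));
    }
}
```

With an 8-lane mask whose lanes 1 and 3 are set, this model gives a true count of 2, a first-true index of 1, and a last-true index of 3, so each reduction becomes a single bit-manipulation step on the packed value.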
Thank you very much.

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/158

