[vectorIntrinsics] RFR: 8283413: Add C2 mid-end and x86 back-end implementation for bit REVERSE operation [v2]
Jatin Bhateja
jbhateja at openjdk.java.net
Tue Mar 22 17:19:11 UTC 2022
On Tue, 22 Mar 2022 10:04:58 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:
> Since this is an in-lane shuffle, can we just use `vpshufb` for this? Thanks.
We would have to compose a different shuffle mask for each multibyte primitive type, since shuffles operate at the x86 lane level (128 bits); for 256-bit species we may also need to swap the upper and lower lanes. Loading a shuffle mask would also consume cycles. The current 3-instruction sequence has a 3-cycle latency.
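To illustrate the per-type mask point, here is a rough scalar sketch in Java. It assumes the step under discussion is reversing the bytes within each element of one 128-bit lane; the class and method names (`ShuffleMaskSketch`, `byteReverseMask`) are hypothetical and are not part of the PR. It only computes the index table that a byte shuffle would need, it does not emit or model the actual instruction.

```java
import java.util.Arrays;

public class ShuffleMaskSketch {
    // Hypothetical helper: builds the 16 byte indices that a byte shuffle
    // would need to reverse the bytes inside each element of one 128-bit lane.
    // elemBytes is the element size in bytes (2 = short, 4 = int, 8 = long).
    static byte[] byteReverseMask(int elemBytes) {
        byte[] mask = new byte[16];
        for (int i = 0; i < 16; i++) {
            int base = (i / elemBytes) * elemBytes; // first byte of the element holding byte i
            int offset = i % elemBytes;             // position of byte i inside that element
            mask[i] = (byte) (base + (elemBytes - 1 - offset));
        }
        return mask;
    }

    public static void main(String[] args) {
        // Each element size yields a different table, so one shared mask is not enough.
        for (int size : new int[] {2, 4, 8}) {
            System.out.println(size + "-byte elements: "
                    + Arrays.toString(byteReverseMask(size)));
        }
    }
}
```

The three tables differ, so each one would have to be kept as a separate constant and loaded before the shuffle, which is the extra cost mentioned above.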
-------------
PR: https://git.openjdk.java.net/panama-vector/pull/182
More information about the panama-dev mailing list