[vector] Implement VectorShuffle rearrange and iota for AArch64 NEON
Yang Zhang (Arm Technology China)
Yang.Zhang at arm.com
Sun Sep 29 06:29:42 UTC 2019
Hi all,
Could anyone please help review this patch?
Webrev: http://cr.openjdk.java.net/~yzhang/vectorapi/vectorapi.shuffle/webrev.00/
This patch adds support for the VectorLoadConst, VectorLoadShuffle and VectorRearrange rules.
Here is an example that rearranges a NEON vector with 4 ints:
Rearrange V1 int[a0, a1, a2, a3] to V2 int[a2, a3, a0, a1]
1. Get the indices of V1 and store them as Vi byte[0, 1, 2, 3].
2. Convert Vi byte[0, 1, 2, 3] to the indices of V2 and also store them as Vi byte[2, 3, 0, 1].
3. Zero-extend Vi (unsigned extend long) from byte[2, 3, 0, 1] to int[2, 3, 0, 1].
4. Multiply Vi int[2, 3, 0, 1] by the constant int[0x04040404, 0x04040404, 0x04040404, 0x04040404]
to get the tbl base Vm int[0x08080808, 0x0c0c0c0c, 0x00000000, 0x04040404].
5. Add the constant int[0x03020100, 0x03020100, 0x03020100, 0x03020100] to Vm
to get the tbl index Vm int[0x0b0a0908, 0x0f0e0d0c, 0x03020100, 0x07060504].
6. Use Vm as the index register and V1 as the table register,
then get V2 as the result of the NEON tbl instruction.
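To make the arithmetic in steps 3 to 6 easier to check, here is a minimal scalar sketch in Java. It is not code from the webrev: the class and method names are hypothetical, it assumes a 128-bit vector of 4 ints with little-endian lane layout, and it only models what tbl does with a single 16-byte table register.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Scalar model of steps 3-6 for a 128-bit vector of 4 ints.
// All names here are hypothetical and for illustration only.
final class TblRearrangeSketch {

    // Steps 3-5: widen each shuffle index to an int lane, then turn it
    // into four consecutive byte indices packed into that lane.
    static int[] tblIndices(byte[] shuffle) {
        int[] vm = new int[shuffle.length];
        for (int i = 0; i < shuffle.length; i++) {
            int lane = shuffle[i] & 0xff;   // step 3: unsigned extend byte -> int
            vm[i] = lane * 0x04040404       // step 4: byte offset of the lane, replicated
                    + 0x03020100;           // step 5: offsets 0..3 inside the lane
        }
        return vm;
    }

    // Step 6: byte-wise table lookup, modelling tbl with one table register.
    // Assumes v1 and shuffle both have exactly 4 elements.
    static int[] rearrange(int[] v1, byte[] shuffle) {
        ByteBuffer table = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        for (int x : v1) {
            table.putInt(x);
        }
        ByteBuffer index = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        for (int x : tblIndices(shuffle)) {
            index.putInt(x);
        }
        ByteBuffer result = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < 16; i++) {
            int k = index.get(i) & 0xff;
            // tbl writes 0 for an index outside the table.
            result.put(i, k < 16 ? table.get(k) : (byte) 0);
        }
        int[] v2 = new int[4];
        for (int i = 0; i < 4; i++) {
            v2[i] = result.getInt(4 * i);
        }
        return v2;
    }

    public static void main(String[] args) {
        byte[] shuffle = {2, 3, 0, 1};
        for (int x : tblIndices(shuffle)) {
            System.out.printf("0x%08x%n", x);
        }
        System.out.println(Arrays.toString(rearrange(new int[] {10, 11, 12, 13}, shuffle)));
    }
}
```

With the shuffle [2, 3, 0, 1] this reproduces the tbl index values listed in step 5, and the lookup moves the two upper ints in front of the two lower ones.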
Notes:
Step 1 matches VectorLoadConst.
Step 3 matches VectorLoadShuffle.
Steps 4, 5 and 6 match VectorRearrange.
For VectorRearrange on short/int, this calculation is needed because the
NEON tbl instruction supports only a byte table, so for short/int we need
to look up 2/4 bytes as a group. For VectorRearrange on long, we use bsl
to implement rearrange.
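The "2/4 bytes as a group" idea can be written down independently of the element size. The sketch below is only an illustration of that expansion, with hypothetical names: it assumes lane index s of an element that is elemBytes wide maps to the byte indices s * elemBytes up to s * elemBytes + elemBytes - 1, which matches the int example above. The short case is derived by analogy and is not taken from the patch.

```java
import java.util.Arrays;

// Expands per-lane shuffle indices into the per-byte indices a byte-table
// lookup needs. Names are hypothetical and for illustration only.
final class TblGroupIndexSketch {

    // elemBytes is 2 for short and 4 for int.
    static byte[] byteIndices(int[] shuffle, int elemBytes) {
        byte[] out = new byte[shuffle.length * elemBytes];
        for (int i = 0; i < shuffle.length; i++) {
            for (int b = 0; b < elemBytes; b++) {
                // First byte of the source lane, plus the offset inside the lane.
                out[i * elemBytes + b] = (byte) (shuffle[i] * elemBytes + b);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // int lanes: each index becomes a group of 4 byte indices.
        System.out.println(Arrays.toString(byteIndices(new int[] {2, 3, 0, 1}, 4)));
        // short lanes: each index becomes a group of 2 byte indices.
        System.out.println(Arrays.toString(byteIndices(new int[] {1, 0, 3, 2}, 2)));
    }
}
```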
Regards,
Yang