[vector] Implement VectorShuffle rearrange and iota for AArch64 NEON

Yang Zhang (Arm Technology China) Yang.Zhang at arm.com
Sun Sep 29 06:29:42 UTC 2019


Hi all

Could anyone please help review this patch?

Webrev: http://cr.openjdk.java.net/~yzhang/vectorapi/vectorapi.shuffle/webrev.00/

This patch adds support for the VectorLoadConst, VectorLoadShuffle and VectorRearrange rules.
Here is an example that rearranges a NEON vector with 4 ints:
Rearrange V1 int[a0, a1, a2, a3] to V2 int[a2, a3, a0, a1]
   1. Get the indices of V1 and store them as Vi byte[0, 1, 2, 3].
   2. Convert Vi byte[0, 1, 2, 3] to the indices of V2 and store them back in Vi as byte[2, 3, 0, 1].
   3. Zero-extend (unsigned extend long) Vi from byte[2, 3, 0, 1] to int[2, 3, 0, 1].
   4. Multiply Vi int[2, 3, 0, 1] by the constant int[0x04040404, 0x04040404, 0x04040404, 0x04040404]
      to get the tbl base Vm int[0x08080808, 0x0c0c0c0c, 0x00000000, 0x04040404].
   5. Add the constant int[0x03020100, 0x03020100, 0x03020100, 0x03020100] to Vm
      to get the tbl index Vm int[0x0b0a0908, 0x0f0e0d0c, 0x03020100, 0x07060504].
   6. Use Vm as the index register and V1 as the table register.
      The NEON tbl instruction then produces V2 as the result.
Notes:
   Step 1 matches VectorLoadConst.
   Step 3 matches VectorLoadShuffle.
   Steps 4, 5 and 6 match VectorRearrange.
   For VectorRearrange on short/int, this calculation is needed because NEON tbl
   only supports byte tables, so for short/int we need to look up 2/4 bytes as a
   group. For VectorRearrange on long, we use bsl to implement the rearrange.
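
For reviewers who want to check the arithmetic in steps 3 to 6, here is a minimal Python model of the lane calculation. It is only a sketch, not the code in the webrev: the helper names (tbl, tbl_index, rearrange) are made up for illustration, it assumes a single 128-bit table register with little-endian lane layout, and the short constants (0x0202 and 0x0100) are inferred by analogy with the int constants above rather than taken from the patch.

```python
import struct

def tbl(table: bytes, index: bytes) -> bytes:
    """Model of a one-register NEON tbl: each index byte selects one table
    byte; an out-of-range index yields 0."""
    return bytes(table[i] if i < len(table) else 0 for i in index)

def tbl_index(shuffle, elem_size):
    """Steps 4 and 5: turn lane indices into per-lane tbl byte indices."""
    # Step 4 constant: elem_size replicated in every byte (0x04040404 for int).
    mul = int.from_bytes(bytes([elem_size]) * elem_size, "little")
    # Step 5 constant: byte offsets within one lane (0x03020100 for int).
    add = int.from_bytes(bytes(range(elem_size)), "little")
    return [s * mul + add for s in shuffle]

def rearrange(lanes, shuffle, elem_size=4):
    """Step 6: use the computed index vector to look up bytes of the source."""
    # "H" = unsigned 16-bit lane (short), "I" = unsigned 32-bit lane (int).
    fmt = "<%d%s" % (len(lanes), {2: "H", 4: "I"}[elem_size])
    table = struct.pack(fmt, *lanes)                          # V1 as raw bytes
    index = struct.pack(fmt, *tbl_index(shuffle, elem_size))  # Vm as raw bytes
    return list(struct.unpack(fmt, tbl(table, index)))

if __name__ == "__main__":
    print([hex(x) for x in tbl_index([2, 3, 0, 1], 4)])
    print(rearrange([10, 11, 12, 13], [2, 3, 0, 1]))
```

With the shuffle [2, 3, 0, 1], tbl_index reproduces the Vm values from step 5, and rearrange returns the lanes in the order [a2, a3, a0, a1]. Passing elem_size=2 models the short case, where each lane is looked up as a group of 2 bytes.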

Regards,
Yang

