RFR: 8348868: AArch64: Add backend support for SelectFromTwoVector [v2]

Xiaohong Gong xgong at openjdk.org
Fri Jun 13 15:20:59 UTC 2025


On Tue, 3 Jun 2025 08:25:43 GMT, Bhavana Kilambi <bkilambi at openjdk.org> wrote:

>> Hi @Bhavana-Kilambi, I've created a new PR https://github.com/openjdk/jdk/pull/23790 to implement `VectorRearrange` for vector types with a small lane count, such as `2D`. I think the implementation is much the same as what we discussed here. Please let me know if you have any feedback. Thanks!
>
> Hi @XiaohongGong , I just got back to working on this PR again!
> I have been trying to implement this operation for doubles/longs, but its performance is only 0.8x that of the default implementation (two vector rearranges and a vector blend). The `bsl`-based implementation I used is given below:
> 
> 
>     dup(tmp1, T2D, src1, 0);
>     dup(tmp2, T2D, src1, 1);
> 
>     mov(tmp3, T2D, 0x01);
>     andr(tmp4, T16B, index, tmp3);
>     negr(tmp4, T2D, tmp4);
>     orr(tmp5, T16B, tmp4, tmp4);
> 
>     bsl(tmp4, T16B, tmp2, tmp1);
> 
>     dup(tmp1, T2D, src2, 0);
>     dup(tmp2, T2D, src2, 1);
> 
>     bsl(tmp5, T16B, tmp2, tmp1);
> 
>     sshr(dst, T2D, index, 1);
>     andr(dst, T16B, dst, tmp3);
>     negr(dst, T2D, dst);
> 
>     bsl(dst, T16B, tmp5, tmp4);
> 
> 
> 
> This is based on the fact that each lane of the index vector can only hold a value from 0 to 3. Bit 0 (the least significant bit) picks the first or second double/long within a source, and bit 1 picks the source (src1 or src2):
> index = 00 -> choose first double/long of src1
>         01 -> choose second double/long of src1
>         10 -> choose first double/long of src2
>         11 -> choose second double/long of src2
>
> I am not able to avoid duplicating the source elements. 
> Would it be ok if I do not support SelectFromTwoVector for doubles/longs or do you have any suggestion on how I can improve my implementation?
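
To make the index encoding in the quoted message concrete, here is a minimal scalar sketch in plain C++. It is only a reference model of the three bitwise selects, not HotSpot code: `Vec2`, `bitSelect` and `selectFromTwoModel` are hypothetical names introduced for illustration, and `bitSelect` is assumed to mirror the `bsl` semantics (take bits from the first operand where the mask bit is set, otherwise from the second).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar stand-in for one 128-bit register viewed as two 64-bit lanes (T2D).
// Hypothetical type, used only for this sketch.
using Vec2 = std::array<std::uint64_t, 2>;

// Bitwise select: bits of `a` where `mask` is set, bits of `b` elsewhere.
inline std::uint64_t bitSelect(std::uint64_t mask, std::uint64_t a, std::uint64_t b) {
    return (mask & a) | (~mask & b);
}

// Reference model of the two-bit index encoding:
//   bit 0 picks the lane within a source, bit 1 picks the source.
inline Vec2 selectFromTwoModel(const Vec2& index, const Vec2& src1, const Vec2& src2) {
    Vec2 dst{};
    for (std::size_t i = 0; i < dst.size(); ++i) {
        assert(index[i] <= 3);  // only indices 0..3 are meaningful for two 2-lane sources
        // Negating a 0/1 value yields an all-zeros or all-ones mask.
        const std::uint64_t laneMask = std::uint64_t{0} - (index[i] & 1);
        const std::uint64_t srcMask  = std::uint64_t{0} - ((index[i] >> 1) & 1);
        const std::uint64_t fromSrc1 = bitSelect(laneMask, src1[1], src1[0]);
        const std::uint64_t fromSrc2 = bitSelect(laneMask, src2[1], src2[0]);
        dst[i] = bitSelect(srcMask, fromSrc2, fromSrc1);
    }
    return dst;
}
```

The model needs both lanes of each source as candidates for every output lane, which matches the observation above that the source elements end up being duplicated.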

Oh, I forgot that we fall back to the `blend + rearrange` pattern when this op is not supported directly. Since `VectorRearrange` for 2D has been implemented now, did you check the final codegen of the default pattern? I think we can first revisit the codegen of the default pattern (i.e. `VectorBlend + VectorRearrange + VectorRearrange`) and see whether there is any further opportunity for improvement. If so, we can implement the `SelectFromTwoVector` op directly based on that improvement. Otherwise, keeping the default pattern is fine with me.
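
For reference, the default pattern can be modelled in scalar C++ as below. This is a rough sketch under assumptions, not the compiler's actual lowering: `Lanes2`, `Mask2`, `rearrange`, `blend` and `defaultPatternModel` are hypothetical names, and the way the index is wrapped and the blend mask is derived is my reading of the pattern for a two-lane vector.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-ins for a 2-lane vector and its mask.
using Lanes2 = std::array<std::uint64_t, 2>;
using Mask2  = std::array<bool, 2>;

// Models a rearrange: dst[i] = src[shuffle[i]].
inline Lanes2 rearrange(const Lanes2& src, const Lanes2& shuffle) {
    Lanes2 dst{};
    for (std::size_t i = 0; i < dst.size(); ++i) {
        assert(shuffle[i] < src.size());
        dst[i] = src[shuffle[i]];
    }
    return dst;
}

// Models a blend: take b[i] where the mask is set, a[i] otherwise.
inline Lanes2 blend(const Lanes2& a, const Lanes2& b, const Mask2& mask) {
    Lanes2 dst{};
    for (std::size_t i = 0; i < dst.size(); ++i) {
        dst[i] = mask[i] ? b[i] : a[i];
    }
    return dst;
}

// Default pattern: rearrange both sources with the wrapped index, then blend.
// Assumption: index values are in 0..3, so `& 1` wraps to the lane count
// and bit 1 tells which source the lane comes from.
inline Lanes2 defaultPatternModel(const Lanes2& index, const Lanes2& src1, const Lanes2& src2) {
    Lanes2 wrapped{};
    Mask2 fromSrc2{};
    for (std::size_t i = 0; i < index.size(); ++i) {
        wrapped[i]  = index[i] & 1;
        fromSrc2[i] = (index[i] & 2) != 0;
    }
    return blend(rearrange(src1, wrapped), rearrange(src2, wrapped), fromSrc2);
}
```

Written this way, the two rearranges share one wrapped index and the blend mask depends only on bit 1, which may be a useful starting point when looking at the generated code.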

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23570#discussion_r2123214296

