RFR: 8338023: Support two vector selectFrom API [v6]

Paul Sandoz psandoz at openjdk.org
Tue Aug 27 20:03:07 UTC 2024


On Tue, 27 Aug 2024 09:58:44 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Hi All,
>> 
>> As per the discussion on the panama-dev mailing list[1], this patch adds support for the following new two-vector permutation API.
>> 
>> 
>> Declaration:-
>>     Vector<E>.selectFrom(Vector<E> v1, Vector<E> v2)
>> 
>> 
>> Semantics:-
>>     Using index values stored in the lanes of "this" vector, assemble the values stored in the first (v1) and second (v2) vector arguments. Thus, the first and second vectors serve as a table whose elements are selected based on the index vector. The API is applicable to all integral and floating-point types. The result of this operation is semantically equivalent to the expression v1.rearrange(this.toShuffle(), v2). Values held in the index vector lanes must lie within the valid two-vector index range [0, 2*VLEN); otherwise an IndexOutOfBoundsException is thrown.
>> 
>> Summary of changes:
>> -  Java-side implementation of the new selectFrom API.
>> -  C2 compiler IR and inline expander changes.
>> -  In the absence of a direct two-vector permutation instruction in the target ISA, a lowering transformation dismantles the new IR into constituent IR supported by the target platform.
>> -  Optimized x86 backend implementation for AVX512 and legacy targets.
>> -  Functional tests covering the new API.
>> 
>> The JMH microbenchmark included with this patch shows around a 10-15x gain over the existing rearrange API:
>> Test System: Intel(R) Xeon(R) Platinum 8480+ [Sapphire Rapids Server]
>> 
>> 
>> Benchmark                                     (size)   Mode  Cnt      Score   Error   Units
>> SelectFromBenchmark.rearrangeFromByteVector     1024  thrpt    2   2041.762          ops/ms
>> SelectFromBenchmark.rearrangeFromByteVector     2048  thrpt    2   1028.550          ops/ms
>> SelectFromBenchmark.rearrangeFromIntVector      1024  thrpt    2    962.605          ops/ms
>> SelectFromBenchmark.rearrangeFromIntVector      2048  thrpt    2    479.004          ops/ms
>> SelectFromBenchmark.rearrangeFromLongVector     1024  thrpt    2    359.758          ops/ms
>> SelectFromBenchmark.rearrangeFromLongVector     2048  thrpt    2    178.192          ops/ms
>> SelectFromBenchmark.rearrangeFromShortVector    1024  thrpt    2   1463.459          ops/ms
>> SelectFromBenchmark.rearrangeFromShortVector    2048  thrpt    2    727.556          ops/ms
>> SelectFromBenchmark.selectFromByteVector        1024  thrpt    2  33254.830          ops/ms
>> SelectFromBenchmark.selectFromByteVector        2048  thrpt    2  17313.174          ops/ms
>> SelectFromBenchmark.selectFromIntVector         1024  thrpt    2  10756.804          ops/ms
>> S...
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Review comments resolutions.

I think we should leave the fallback expression as `vec2.rearrange(vec1.toShuffle(), vec3);`. Let's address that separately if needed. Otherwise, you have introduced an additional code path that requires more explicit testing.

My comment was about understanding what `SelectFromTwoVectorNode::Ideal` and `VectorRearrangeNode::Ideal` are doing: the former lowers, if needed, into the rearrange expression, and the latter adjusts, if needed, the index vector. A comment describing the latter transformation would be useful, like the one you have in the former method.
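
For reference, the two-vector semantics quoted in the PR description can be modelled in scalar Java. This is only a rough sketch under stated assumptions: the class and method names (`SelectFromModel`, `selectFrom`) are hypothetical, plain `int[]` arrays stand in for vectors of VLEN lanes, and it is not the JDK implementation or the C2 lowering. It simply treats v1 followed by v2 as one table of 2*VLEN entries and applies the range check from the description.

```java
import java.util.Arrays;

// Hypothetical scalar model; arrays stand in for vectors of VLEN lanes.
class SelectFromModel {
    // "index" plays the role of the "this" vector; v1 and v2 together
    // form a table of 2*VLEN entries (v1 first, then v2).
    static int[] selectFrom(int[] index, int[] v1, int[] v2) {
        int vlen = index.length;
        if (v1.length != vlen || v2.length != vlen) {
            throw new IllegalArgumentException("all inputs must have the same lane count");
        }
        int[] result = new int[vlen];
        for (int lane = 0; lane < vlen; lane++) {
            int i = index[lane];
            // Valid two-vector index range is [0, 2*VLEN).
            if (i < 0 || i >= 2 * vlen) {
                throw new IndexOutOfBoundsException(
                    "index " + i + " outside [0, " + (2 * vlen) + ")");
            }
            // Lower half of the range selects from v1, upper half from v2.
            result[lane] = (i < vlen) ? v1[i] : v2[i - vlen];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] v1 = {10, 11, 12, 13};
        int[] v2 = {20, 21, 22, 23};
        int[] index = {0, 5, 2, 7};
        // prints [10, 21, 12, 23]
        System.out.println(Arrays.toString(selectFrom(index, v1, v2)));
    }
}
```

A scalar model like this could serve as the expected-value oracle when testing whichever fallback expression is kept.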

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20508#issuecomment-2313401788

