RFR: 8338023: Support two vector selectFrom API

Jatin Bhateja jbhateja at openjdk.org
Thu Aug 8 17:02:05 UTC 2024


Hi All,

As per the discussion on the panama-dev mailing list [1], this patch adds support for the following new two-vector permutation API.


Declaration:-
    Vector<E>.selectFrom(Vector<E> v1, Vector<E> v2)


Semantics:-
    Using the index values stored in the lanes of "this" vector, assemble the values stored in the first (v1) and second (v2) vector arguments. The first and second vectors thus serve as a table whose elements are selected by the index vector. The API is applicable to all integral and floating-point types. The result of this operation is semantically equivalent to the expression v1.rearrange(this.toShuffle(), v2). Values held in the index vector lanes must lie within the valid two-vector index range [0, 2*VLEN); otherwise an IndexOutOfBoundsException is thrown.
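
The semantics above can be illustrated with a scalar model in plain Java. This is only a sketch of the described behaviour on int arrays, not the Vector API implementation; the class and method names (SelectFromModel, selectFrom) are hypothetical, and each array stands in for one vector of VLEN lanes.

```java
// Scalar model of the two-vector selectFrom semantics described above.
// Hypothetical helper for illustration; it does not use jdk.incubator.vector.
class SelectFromModel {
    // index, v1 and v2 all have the same length VLEN.
    static int[] selectFrom(int[] index, int[] v1, int[] v2) {
        int vlen = index.length;
        int[] result = new int[vlen];
        for (int i = 0; i < vlen; i++) {
            int idx = index[i];
            // Valid two-vector index range is [0, 2*VLEN).
            if (idx < 0 || idx >= 2 * vlen) {
                throw new IndexOutOfBoundsException("lane " + i + ": index " + idx);
            }
            // Indexes in [0, VLEN) select from v1, indexes in [VLEN, 2*VLEN) from v2.
            result[i] = (idx < vlen) ? v1[idx] : v2[idx - vlen];
        }
        return result;
    }
}
```

For example, with VLEN = 4, an index vector {0, 5, 2, 7} applied to v1 = {10, 11, 12, 13} and v2 = {20, 21, 22, 23} picks lanes 0 and 2 from v1 and lanes 1 and 3 from v2.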

Summary of changes:
-  Java-side implementation of the new selectFrom API.
-  C2 compiler IR and inline expander changes.
-  In the absence of a direct two-vector permutation instruction in the target ISA, a lowering transformation dismantles the new IR into constituent IR supported by the target platform.
-  Optimized x86 backend implementation for AVX512 and legacy targets.
-  Functional tests covering the new API.
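
The lowering step in the list above can be pictured with a scalar model. The decomposition shown here (two single-vector rearranges followed by a blend) is an assumption made for illustration; the IR sequence actually emitted by the patch may differ. The class and method names are hypothetical.

```java
// Scalar model of one possible lowering, NOT the C2 implementation.
// Assumptions: VLEN is a power of two and every index is already
// within the valid range [0, 2*VLEN).
class LoweredSelectFromModel {
    static int[] loweredSelectFrom(int[] index, int[] v1, int[] v2) {
        int vlen = index.length;
        int[] result = new int[vlen];
        for (int i = 0; i < vlen; i++) {
            int wrapped = index[i] & (vlen - 1);   // index reduced to a single vector
            int fromV1 = v1[wrapped];              // models a single-vector rearrange of v1
            int fromV2 = v2[wrapped];              // models a single-vector rearrange of v2
            boolean pickV2 = index[i] >= vlen;     // mask: lanes whose index points into v2
            result[i] = pickV2 ? fromV2 : fromV1;  // models a blend under that mask
        }
        return result;
    }
}
```

Under these assumptions the model produces the same lanes as the direct table lookup, which is why such a decomposition could stand in where no native two-vector permute instruction exists.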

The JMH microbenchmark included with this patch shows around a 10-15x gain over the existing rearrange API:
Test System: Intel(R) Xeon(R) Platinum 8480+ [Sapphire Rapids Server]


Benchmark                                     (size)   Mode  Cnt      Score   Error   Units
SelectFromBenchmark.rearrangeFromByteVector     1024  thrpt    2   2041.762          ops/ms
SelectFromBenchmark.rearrangeFromByteVector     2048  thrpt    2   1028.550          ops/ms
SelectFromBenchmark.rearrangeFromIntVector      1024  thrpt    2    962.605          ops/ms
SelectFromBenchmark.rearrangeFromIntVector      2048  thrpt    2    479.004          ops/ms
SelectFromBenchmark.rearrangeFromLongVector     1024  thrpt    2    359.758          ops/ms
SelectFromBenchmark.rearrangeFromLongVector     2048  thrpt    2    178.192          ops/ms
SelectFromBenchmark.rearrangeFromShortVector    1024  thrpt    2   1463.459          ops/ms
SelectFromBenchmark.rearrangeFromShortVector    2048  thrpt    2    727.556          ops/ms
SelectFromBenchmark.selectFromByteVector        1024  thrpt    2  33254.830          ops/ms
SelectFromBenchmark.selectFromByteVector        2048  thrpt    2  17313.174          ops/ms
SelectFromBenchmark.selectFromIntVector         1024  thrpt    2  10756.804          ops/ms
SelectFromBenchmark.selectFromIntVector         2048  thrpt    2   5398.244          ops/ms
SelectFromBenchmark.selectFromLongVector        1024  thrpt    2   5856.859          ops/ms
SelectFromBenchmark.selectFromLongVector        2048  thrpt    2   1513.378          ops/ms
SelectFromBenchmark.selectFromShortVector       1024  thrpt    2  17888.617          ops/ms
SelectFromBenchmark.selectFromShortVector       2048  thrpt    2   9079.565          ops/ms


Kindly review and share your feedback.

Best Regards,
Jatin

[1] https://mail.openjdk.org/pipermail/panama-dev/2024-May/020408.html

-------------

Commit messages:
 - Adding Benchmark
 - 8338023: Support two vector selectFrom API

Changes: https://git.openjdk.org/jdk/pull/20508/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20508&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8338023
  Stats: 2737 lines in 95 files changed: 2719 ins; 17 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/20508.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20508/head:pull/20508

PR: https://git.openjdk.org/jdk/pull/20508

