[vectorIntrinsics+compress] RFR: 8276083: Incremental patch to further optimize new compress/expand APIs over X86 [v2]

Jatin Bhateja jbhateja at openjdk.java.net
Thu Oct 28 19:05:53 UTC 2021


> Summary of changes:
> 1)  Added both scalar and vector variants of JMH performance tests for Vector.compress/expand and VectorMask.compress APIs.
> 
> 2) Improved performance of operations where the mask length is less than 4. Mask loading is a two-stage process: first the boolean array is loaded into a vector, and then it is transferred either to a predicate register or to a vector whose size matches that of the underlying SPECIES. A mask whose length is less than 4 results in a vector load narrower than 32 bits. Operations that depend on such small masks are now handled in the Java-side implementation of these and some other APIs. Because the condition that selects between the special handling and the fallback logic leading to the C2 intrinsic call is a constant expression, one of the control paths is optimized out. This should also prevent the performance penalty caused by failed lazy inline expansion, which most often occurs because of unsupported vector sizes. If lazy inline expansion fails, C2 emits a direct call to the callee method, and any opportunity for procedure inlining at that point is lost; a separate [issue](https://bugs.openjdk.java.net/browse/JDK-8276085) has been created to address this problem.
>  
> 3) Improved performance of VectorMask.compress on legacy non-AVX512 targets. Added the missing checks to the C2Compiler::is_intrinsic_supported routine so that procedure inlining is enabled early, during parsing, when the target does not support direct compress/expand instructions.
>     
> 4) Inline-expanded the VectorMask.intoArray operation to trigger the boxing-unboxing optimization. This significantly improved the performance of VectorMask.compress in the newly added JMH micros.
> 
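The constant-condition dispatch described in item 2 can be pictured with a minimal sketch. Everything here is hypothetical and not taken from the patch: the class name, the `LANES` constant (standing in for a species length that the compiler sees as a constant), and `compressIntrinsic`, which is only a placeholder for the intrinsic-backed path. The point is that when the guard is a constant expression, only one of the two branches needs to survive compilation.

```java
// Hypothetical illustration only; none of these names come from the patch.
final class SmallMaskDispatchSketch {
    // Stand-in for a species lane count that is a compile-time constant.
    static final int LANES = 2;

    // Java-side handling for small masks: pack the selected lanes into the
    // lowest positions and leave the remaining lanes zero.
    static int[] compressScalar(int[] lanes, boolean[] mask) {
        int[] out = new int[lanes.length];
        int j = 0;
        for (int i = 0; i < lanes.length; i++) {
            if (mask[i]) {
                out[j++] = lanes[i];
            }
        }
        return out;
    }

    // Placeholder for the path that would lead to the C2 intrinsic call.
    // It simply delegates so that the sketch stays runnable.
    static int[] compressIntrinsic(int[] lanes, boolean[] mask) {
        return compressScalar(lanes, mask);
    }

    static int[] compress(int[] lanes, boolean[] mask) {
        // The guard is a constant expression, so one branch can be folded away.
        if (LANES < 4) {
            return compressScalar(lanes, mask);
        }
        return compressIntrinsic(lanes, mask);
    }
}
```

With `LANES` fixed at 2, `SmallMaskDispatchSketch.compress(new int[]{7, 9}, new boolean[]{false, true})` takes the Java-side branch and returns `{9, 0}`.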
> The following is the performance data for the included JMH micros:
> System Configuration:  Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server 40C 2S)
> 
> A) VectorMask.compress:
> 
> ![image](https://user-images.githubusercontent.com/59989778/139234423-5c340da1-2fb1-4a21-b6d1-bb8b123b0a55.png)
> 
> B) Vector.compress:
> 
> ![image](https://user-images.githubusercontent.com/59989778/139234843-c3846b8d-dff0-4e9e-bf29-6b4dc556cd93.png)
> 
> C) Vector.expand:
> 
> ![image](https://user-images.githubusercontent.com/59989778/139234956-efad1c71-8425-4300-a8c6-c660855f5c58.png)
> 
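For readers less familiar with the three operations measured above, here is a scalar reference sketch of their lane-wise semantics on plain arrays. It is an illustration under my own naming (`CompressExpandSketch` and its methods are hypothetical), not the scalar variant used in the JMH micros, and it uses `int[]` and `boolean[]` in place of `Vector` and `VectorMask`.

```java
// Scalar reference semantics on plain arrays; names are hypothetical.
final class CompressExpandSketch {
    // Vector.compress: lanes selected by the mask move to the lowest
    // contiguous positions; the remaining lanes are zero.
    static int[] compress(int[] v, boolean[] m) {
        int[] out = new int[v.length];
        int j = 0;
        for (int i = 0; i < v.length; i++) {
            if (m[i]) {
                out[j++] = v[i];
            }
        }
        return out;
    }

    // Vector.expand: the lowest contiguous lanes of v are scattered to the
    // positions where the mask is set; unset positions are zero.
    static int[] expand(int[] v, boolean[] m) {
        int[] out = new int[v.length];
        int j = 0;
        for (int i = 0; i < v.length; i++) {
            if (m[i]) {
                out[i] = v[j++];
            }
        }
        return out;
    }

    // VectorMask.compress: a mask with as many low lanes set as the input
    // mask has set lanes in total.
    static boolean[] compressMask(boolean[] m) {
        int trueCount = 0;
        for (boolean b : m) {
            if (b) {
                trueCount++;
            }
        }
        boolean[] out = new boolean[m.length];
        for (int i = 0; i < trueCount; i++) {
            out[i] = true;
        }
        return out;
    }
}
```

For `v = {10, 20, 30, 40}` and `m = {false, true, false, true}`, `compress` gives `{20, 40, 0, 0}`, `expand` gives `{0, 10, 0, 20}`, and `compressMask` gives `{true, true, false, false}`.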
> The patch has been validated with tier3 regression tests at various AVX levels 0/1/2/3/KNL.
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin

Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:

  8276083: Review comments resolved.

-------------

Changes:
  - all: https://git.openjdk.java.net/panama-vector/pull/157/files
  - new: https://git.openjdk.java.net/panama-vector/pull/157/files/6381943e..ea6f0088

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=panama-vector&pr=157&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=panama-vector&pr=157&range=00-01

  Stats: 943 lines in 38 files changed: 0 ins; 943 del; 0 mod
  Patch: https://git.openjdk.java.net/panama-vector/pull/157.diff
  Fetch: git fetch https://git.openjdk.java.net/panama-vector pull/157/head:pull/157

PR: https://git.openjdk.java.net/panama-vector/pull/157

