RFR: 8282711: Accelerate Math.signum function for AVX and AVX512 target. [v8]

Thu Apr 14 20:45:19 UTC 2022

On Thu, 14 Apr 2022 20:34:38 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> - The patch auto-vectorizes the Math.signum operation for floating-point types.
>> - An efficient JIT sequence is generated for AVX512 and legacy x86 targets.
>> - The following is the performance data for the included JMH micro-benchmark.
>> 
>> System : Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz  (40C 2S Icelake Server) 
>> 
>> Benchmark | (SIZE) | Baseline AVX (ns/op) | Withopt AVX (ns/op) | Gain Ratio | Baseline AVX512 (ns/op) | Withopt AVX512 (ns/op) | Gain Ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> VectorSignum.doubleSignum | 256 | 177.01 | 58.457 | 3.028037703 | 175.46 | 40.996 | 4.279929749
>> VectorSignum.doubleSignum | 512 | 340.244 | 115.162 | 2.954481513 | 340.697 | 78.779 | 4.324718516
>> VectorSignum.doubleSignum | 1024 | 665.628 | 235.584 | 2.82543806 | 668.958 | 157.706 | 4.24180437
>> VectorSignum.doubleSignum | 2048 | 1312.473 | 468.997 | 2.798467794 | 1305.233 | 1295.126 | 1.007803874
>> VectorSignum.floatSignum | 256 | 175.895 | 31.968 | 5.502220971 | 177.95 | 25.438 | 6.995439893
>> VectorSignum.floatSignum | 512 | 341.472 | 59.937 | 5.697182041 | 336.86 | 42.946 | 7.843803847
>> VectorSignum.floatSignum | 1024 | 663.263 | 127.245 | 5.212487721 | 656.554 | 84.945 | 7.729165931
>> VectorSignum.floatSignum | 2048 | 1317.936 | 236.527 | 5.572031946 | 1292.6 | 160.474 | 8.054887396
>> 
>> Kindly review and share feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8282711: VPBLENDMPS has lower latency compared to VPBLENDVPS, reverting predication conditions.

Looks good to me.
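
For context, the kind of loop this change targets can be sketched as below. This is only an illustration, not the source of the JMH micro from the PR: the class name SignumSketch and both method bodies are assumptions. The point is that a simple counted loop applying Math.signum element by element is the shape the auto-vectorizer is expected to pick up.

```java
// Hypothetical sketch; names are illustrative and not taken from the PR.
class SignumSketch {

    // Applies Math.signum to each double: returns -1.0 or 1.0 for nonzero
    // finite or infinite inputs, and returns zeros and NaN unchanged.
    static void doubleSignum(double[] in, double[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Math.signum(in[i]);
        }
    }

    // Same loop shape for floats; a vector register holds twice as many
    // float lanes as double lanes, so fewer iterations are needed.
    static void floatSignum(float[] in, float[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Math.signum(in[i]);
        }
    }
}
```

Whether such a loop is actually vectorized depends on the JIT, the CPU features available, and the flags in use, so any measurements should come from the JMH micro included in the PR rather than from a sketch like this.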

-------------

Marked as reviewed by sviswanathan (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7717