RFR: 8291600: [vectorapi] vector cast op check is not always needed for vector mask cast [v2]
Xiaohong Gong
xgong at openjdk.org
Thu Aug 18 06:34:12 UTC 2022
On Thu, 18 Aug 2022 06:25:42 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:
>> Recently we found that the performance of "`FIRST_NONZERO`" for the double type is much worse than for the other types on x86 when `UseAVX=2`. The main reason is that the "`VectorCastL2X`" op is not supported by the backend when the dst element type is `T_DOUBLE`. This makes the check of the `VectorCast` op fail before intrinsifying "`VectorMask.cast()`", which is used in the
>> "`FIRST_NONZERO`" Java implementation (see [1]). However, the compiler will not generate the `VectorCast` op for `VectorMask.cast()` if:
>>
>> 1) the current platform supports the predicated feature
>> 2) the element size (in bytes) of the src and dst type is the same
>>
>> So the check of the "`VectorCast`" op is unnecessary in such cases. To fix it, this patch:
>>
>> 1) limits the specified vector cast op check to vectors
>> 2) adds the corresponding mask cast op check for `VectorMask.cast()`
>> 3) cleans up the unnecessary code
>>
>> Here is the performance of the "`FIRST_NONZERO`" benchmark [2] on an x86 machine with `UseAVX=2`:
>>
>> Benchmark                            (size)  Mode   Cnt  Before  After     Units
>> DoubleMaxVector.FIRST_NONZERO        1024    thrpt  15   49.266  2460.886  ops/ms
>> DoubleMaxVector.FIRST_NONZEROMasked  1024    thrpt  15   49.554  1892.223  ops/ms
>>
>> [1] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/DoubleVector.java#L770
>> [2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/DoubleMaxVector.java#L246
>
> Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>
> - Unify VectorMaskCast for all platforms
> - Merge branch 'master' into JDK-8291600
> - 8291600: [vectorapi] vector cast op check is not always needed for vector mask cast
Hi, the new commit refactors the vector mask cast implementation so that the `VectorMaskCast` op is always used instead of the `VectorCast` op on all platforms. The backend implementation of `VectorMaskCast` can be cheaper than that of `VectorCast` on some architectures, such as x86 AVX. Please take a look at the changes. Thanks a lot!
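
For readers skimming the thread, the node selection for `VectorMask.cast()` before and after this commit can be modelled roughly as follows. This is only an illustrative sketch in plain Java, not the HotSpot code: the class `MaskCastModel`, its enum, and its method names are hypothetical, and it assumes that either of the two conditions quoted above is sufficient on its own to avoid the `VectorCast` op.

```java
public class MaskCastModel {
    /** The two IR ops discussed in this thread (names only, no real nodes). */
    enum Node { VECTOR_MASK_CAST, VECTOR_CAST }

    /**
     * Hypothetical model of the selection before this commit:
     * VectorCast is skipped when the platform is predicated or when the
     * source and destination element sizes (in bytes) match.
     * Assumption: either condition alone is enough.
     */
    static Node nodeBefore(boolean predicated, int srcElemBytes, int dstElemBytes) {
        if (predicated || srcElemBytes == dstElemBytes) {
            return Node.VECTOR_MASK_CAST;
        }
        return Node.VECTOR_CAST;
    }

    /** Hypothetical model after this commit: always the mask cast op. */
    static Node nodeAfter(boolean predicated, int srcElemBytes, int dstElemBytes) {
        return Node.VECTOR_MASK_CAST;
    }

    public static void main(String[] args) {
        // long -> double mask cast without predication (the FIRST_NONZERO case):
        // both element types are 8 bytes wide, so no VectorCast is involved.
        System.out.println(nodeBefore(false, Long.BYTES, Double.BYTES));
        // int -> double mask cast without predication: sizes differ (4 vs 8).
        System.out.println(nodeBefore(false, Integer.BYTES, Double.BYTES));
        System.out.println(nodeAfter(false, Integer.BYTES, Double.BYTES));
    }
}
```

In this model the long-to-double case never reaches `VectorCast`, which is why checking backend support for that op is not needed there.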
-------------
PR: https://git.openjdk.org/jdk/pull/9737
More information about the hotspot-compiler-dev mailing list