RFR: 8287835: Add support for additional float/double to integral conversion for x86

Sun Jun 5 01:48:32 UTC 2022

On Sat, 4 Jun 2022 22:13:32 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

> Currently the C2 JIT only supports float -> int and double -> long conversion for x86.
> This PR adds support for the following conversions in the C2 JIT:
>   float -> long, short, byte
>   double -> int, short, byte
> 
> The performance gain is as follows.
> Before the patch:
> Benchmark                                       Mode  Cnt      Score       Error   Units
> VectorFPtoIntCastOperations.microDouble2Byte   thrpt    3  32367.971 ±  6161.118  ops/ms
> VectorFPtoIntCastOperations.microDouble2Int    thrpt    3  25825.251 ±  5417.104  ops/ms
> VectorFPtoIntCastOperations.microDouble2Long   thrpt    3  59641.958 ± 17307.177  ops/ms
> VectorFPtoIntCastOperations.microDouble2Short  thrpt    3  29641.505 ± 12023.015  ops/ms
> VectorFPtoIntCastOperations.microFloat2Byte    thrpt    3  16271.224 ±  1523.083  ops/ms
> VectorFPtoIntCastOperations.microFloat2Int     thrpt    3  59199.994 ± 14357.959  ops/ms
> VectorFPtoIntCastOperations.microFloat2Long    thrpt    3  17169.197 ±  1738.273  ops/ms
> VectorFPtoIntCastOperations.microFloat2Short   thrpt    3  14934.139 ±  2329.253  ops/ms
> 
> After the patch:
> Benchmark                                       Mode  Cnt       Score       Error   Units
> VectorFPtoIntCastOperations.microDouble2Byte   thrpt    3  115436.659 ± 21282.364  ops/ms
> VectorFPtoIntCastOperations.microDouble2Int    thrpt    3   87194.395 ±  9443.106  ops/ms
> VectorFPtoIntCastOperations.microDouble2Long   thrpt    3   59652.356 ±  7240.721  ops/ms
> VectorFPtoIntCastOperations.microDouble2Short  thrpt    3  110570.719 ± 10401.620  ops/ms
> VectorFPtoIntCastOperations.microFloat2Byte    thrpt    3  110028.539 ± 11113.137  ops/ms
> VectorFPtoIntCastOperations.microFloat2Int     thrpt    3   59469.193 ± 18272.495  ops/ms
> VectorFPtoIntCastOperations.microFloat2Long    thrpt    3   59897.101 ±  7249.268  ops/ms
> VectorFPtoIntCastOperations.microFloat2Short   thrpt    3   86167.554 ±  8253.232  ops/ms
> 
> Please review.
> 
> Best Regards,
> Sandhya

I assume this is support for vector conversions.

Please add an IR framework test.
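
For the correctness side of such a test, each vector lane would presumably have to match the corresponding scalar Java cast. Below is a minimal sketch of that reference behavior under the Java language rules for narrowing floating-point conversions (NaN becomes 0, out-of-range values saturate at int or long, and short/byte results go through int first). The class and method names are hypothetical and are not taken from the patch.

```java
public class FpToIntegralSemantics {
    // float -> long: NaN maps to 0, out-of-range values saturate.
    static long f2l(float f) { return (long) f; }

    // float -> short: converted to int first (saturating), then truncated to 16 bits.
    static short f2s(float f) { return (short) f; }

    // double -> byte: converted to int first (saturating), then truncated to 8 bits.
    static byte d2b(double d) { return (byte) d; }

    // double -> int: NaN maps to 0, out-of-range values saturate.
    static int d2i(double d) { return (int) d; }

    public static void main(String[] args) {
        System.out.println(f2l(Float.NaN));               // 0
        System.out.println(f2l(Float.POSITIVE_INFINITY)); // 9223372036854775807
        System.out.println(f2s(1e10f));                   // -1 (low 16 bits of Integer.MAX_VALUE)
        System.out.println(d2b(300.7));                   // 44 (300 truncated to 8 bits)
        System.out.println(d2i(-1e300));                  // -2147483648
    }
}
```

A vectorized loop or Vector API cast could be compared lane by lane against these scalar helpers, with NaN, the infinities, and values just outside the target range as the interesting inputs.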

src/hotspot/cpu/x86/x86.ad line 1877:

> 1875:       if (is_integral_type(bt) && !VM_Version::supports_avx512dq()) {
> 1876:         return false;
> 1877:       }

Overlapping conditions for the same types are confusing.

src/hotspot/cpu/x86/x86.ad line 1889:

> 1887:         return false;
> 1888:       }
> 1889:       if ((bt == T_LONG) && !VM_Version::supports_avx512dq()) {

Again, overlapping conditions. So T_LONG requires all three: AVX512, avx512vl, and avx512dq?

What about T_INT?

src/hotspot/cpu/x86/x86.ad line 7298:

> 7296:   predicate(((VM_Version::supports_avx512vl() ||
> 7297:               Matcher::vector_length_in_bytes(n) == 64)) &&
> 7298:              is_integral_type(Matcher::vector_element_basic_type(n)));

Do we need some of these conditions since you have them already in `match_rule_supported_vector()`?

-------------

PR: https://git.openjdk.java.net/jdk/pull/9032