RFR: 8287835: Add support for additional float/double to integral conversion for x86

Mon Jun 6 20:36:03 UTC 2022

On Sun, 5 Jun 2022 01:41:02 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> Currently the C2 JIT only supports float -> int and double -> long conversions for x86.
>> This PR adds support for the following conversions in the C2 JIT:
>>   float -> long, short, byte
>>   double -> int, short, byte
>> 
>> The performance gain is as follows.
>> Before the patch:
>> Benchmark                                       Mode  Cnt      Score       Error   Units
>> VectorFPtoIntCastOperations.microDouble2Byte   thrpt    3  32367.971 ±  6161.118  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Int    thrpt    3  25825.251 ±  5417.104  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Long   thrpt    3  59641.958 ± 17307.177  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Short  thrpt    3  29641.505 ± 12023.015  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Byte    thrpt    3  16271.224 ±  1523.083  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Int     thrpt    3  59199.994 ± 14357.959  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Long    thrpt    3  17169.197 ±  1738.273  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Short   thrpt    3  14934.139 ±  2329.253  ops/ms
>> 
>> After the patch:
>> Benchmark                                       Mode  Cnt       Score       Error   Units
>> VectorFPtoIntCastOperations.microDouble2Byte   thrpt    3  115436.659 ± 21282.364  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Int    thrpt    3   87194.395 ±  9443.106  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Long   thrpt    3   59652.356 ±  7240.721  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Short  thrpt    3  110570.719 ± 10401.620  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Byte    thrpt    3  110028.539 ± 11113.137  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Int     thrpt    3   59469.193 ± 18272.495  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Long    thrpt    3   59897.101 ±  7249.268  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Short   thrpt    3   86167.554 ±  8253.232  ops/ms
>> 
>> Please review.
>> 
>> Best Regards,
>> Sandhya
>
> src/hotspot/cpu/x86/x86.ad line 7298:
> 
>> 7296:   predicate(((VM_Version::supports_avx512vl() ||
>> 7297:               Matcher::vector_length_in_bytes(n) == 64)) &&
>> 7298:              is_integral_type(Matcher::vector_element_basic_type(n)));
> 
> Do we need some of these conditions since you have them already in `match_rule_supported_vector()`?

The predicate is not correct for all the types this instruction is now used for: it says that if the vector size is 64 bytes, you don't need AVX512VL support, regardless of the element type. Is that true?
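
To make the concern concrete, here is a minimal standalone sketch of the shape of the quoted predicate. It is not HotSpot code: `ElemType`, `is_integral()` and `quoted_predicate()` are hypothetical stand-ins for `BasicType`, `is_integral_type()` and the `VM_Version`/`Matcher` queries, and the `bool`/`int` parameters replace the real runtime checks.

```cpp
#include <cassert>

// Hypothetical stand-in for HotSpot's BasicType values (illustration only).
enum class ElemType { Byte, Short, Int, Long, Float, Double };

// Assumed model of is_integral_type(): everything except float and double.
constexpr bool is_integral(ElemType t) {
  return t != ElemType::Float && t != ElemType::Double;
}

// Same boolean shape as the quoted predicate:
//   (supports_avx512vl() || vector_length_in_bytes(n) == 64)
//       && is_integral_type(vector_element_basic_type(n))
constexpr bool quoted_predicate(bool has_avx512vl, int vector_bytes, ElemType elem) {
  return (has_avx512vl || vector_bytes == 64) && is_integral(elem);
}

// Exercises the cases the question is about.
void check_quoted_predicate() {
  // Without AVX512VL, a 64-byte vector is accepted for every integral type.
  assert(quoted_predicate(false, 64, ElemType::Int));
  assert(quoted_predicate(false, 64, ElemType::Long));
  assert(quoted_predicate(false, 64, ElemType::Short));
  assert(quoted_predicate(false, 64, ElemType::Byte));
  // Without AVX512VL, a shorter vector is rejected.
  assert(!quoted_predicate(false, 32, ElemType::Long));
}
```

The sketch only shows what the expression accepts; whether the 64-byte case is actually safe for long, short and byte depends on the instructions the rule emits, which is the open question.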

All this is very confusing. I suggest keeping the original `castFtoI_reg_evex()` instruction as it was and using the new `castFtoX_reg_evex()` only for T_LONG and sub-integer types, with a new predicate `(type != T_INT)` and additional conditions if needed.
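
As a rough illustration of that split, the following standalone sketch models only the type test that selects between the two rules. `DstType` and `rule_for()` are hypothetical names, not HotSpot APIs, and the "additional conditions if needed" are deliberately left out because they are not spelled out here.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the integral destination types of the cast.
enum class DstType { Byte, Short, Int, Long };

// Returns the name of the rule that would match under the suggested split:
// T_INT stays with the original rule, every other integral type
// (type != T_INT) goes to the new one.
std::string rule_for(DstType dst) {
  return dst == DstType::Int ? "castFtoI_reg_evex" : "castFtoX_reg_evex";
}

// Exercises the mapping for each destination type.
void check_rule_split() {
  assert(rule_for(DstType::Int) == "castFtoI_reg_evex");
  assert(rule_for(DstType::Long) == "castFtoX_reg_evex");
  assert(rule_for(DstType::Short) == "castFtoX_reg_evex");
  assert(rule_for(DstType::Byte) == "castFtoX_reg_evex");
}
```

Because the two type tests are mutually exclusive, each rule can then carry its own CPU-feature conditions without affecting the other.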

-------------

PR: https://git.openjdk.java.net/jdk/pull/9032