RFR: 8287835: Add support for additional float/double to integral conversion for x86
Vladimir Kozlov
kvn at openjdk.java.net
Mon Jun 6 20:36:03 UTC 2022
On Sun, 5 Jun 2022 01:41:02 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:
>> Currently the C2 JIT supports only float -> int and double -> long conversions on x86.
>> This PR adds support for the following conversions in the C2 JIT:
>> float -> long, short, byte
>> double -> int, short, byte
>>
>> The performance gain is as follows.
>> Before the patch:
>> Benchmark                                       Mode  Cnt       Score       Error  Units
>> VectorFPtoIntCastOperations.microDouble2Byte   thrpt    3   32367.971 ±  6161.118  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Int    thrpt    3   25825.251 ±  5417.104  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Long   thrpt    3   59641.958 ± 17307.177  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Short  thrpt    3   29641.505 ± 12023.015  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Byte    thrpt    3   16271.224 ±  1523.083  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Int     thrpt    3   59199.994 ± 14357.959  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Long    thrpt    3   17169.197 ±  1738.273  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Short   thrpt    3   14934.139 ±  2329.253  ops/ms
>>
>> After the patch:
>> Benchmark                                       Mode  Cnt       Score       Error  Units
>> VectorFPtoIntCastOperations.microDouble2Byte   thrpt    3  115436.659 ± 21282.364  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Int    thrpt    3   87194.395 ±  9443.106  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Long   thrpt    3   59652.356 ±  7240.721  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Short  thrpt    3  110570.719 ± 10401.620  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Byte    thrpt    3  110028.539 ± 11113.137  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Int     thrpt    3   59469.193 ± 18272.495  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Long    thrpt    3   59897.101 ±  7249.268  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Short   thrpt    3   86167.554 ±  8253.232  ops/ms
>>
>> Please review.
>>
>> Best Regards,
>> Sandhya
>
> src/hotspot/cpu/x86/x86.ad line 7298:
>
>> 7296: predicate(((VM_Version::supports_avx512vl() ||
>> 7297: Matcher::vector_length_in_bytes(n) == 64)) &&
>> 7298: is_integral_type(Matcher::vector_element_basic_type(n)));
>
> Do we need some of these conditions since you have them already in `match_rule_supported_vector()`?

The predicate is not correct for all the types this instruction is now used for: it says that when the vector size is 64 bytes, avx512vl support is not needed for any of the types. Is that true?

All of this is very confusing. I suggest keeping the original `castFtoI_reg_evex()` instruction as it was and using the new `castFtoX_reg_evex()` only for T_LONG and the sub-integer types, with a new predicate `(type != T_INT)` and additional conditions if needed.
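
To make the suggested split concrete, here is a minimal standalone C++ sketch of two mutually exclusive predicates. It is not the actual x86.ad code: `ElemType`, `VectorCast`, `matches_castFtoI` and `matches_castFtoX` are hypothetical stand-ins, and the hardware condition is simply copied from the predicate quoted above as a placeholder for whatever conditions each rule really needs.

```cpp
// Hypothetical model of the suggested rule split; not HotSpot code.
enum class ElemType { Byte, Short, Int, Long };

struct VectorCast {
    ElemType to;          // destination element type of the cast
    int length_in_bytes;  // vector size, e.g. 16, 32 or 64
};

// Placeholder hardware condition, copied from the quoted predicate.
// Whether it is right for every destination type is the open question.
constexpr bool hw_ok(VectorCast n, bool has_avx512vl) {
    return has_avx512vl || n.length_in_bytes == 64;
}

// Original rule: unchanged, handles only the T_INT destination.
constexpr bool matches_castFtoI(VectorCast n, bool has_avx512vl) {
    return n.to == ElemType::Int && hw_ok(n, has_avx512vl);
}

// New rule: T_LONG and sub-integer destinations, i.e. (type != T_INT),
// plus any additional per-type conditions that turn out to be needed.
constexpr bool matches_castFtoX(VectorCast n, bool has_avx512vl) {
    return n.to != ElemType::Int && hw_ok(n, has_avx512vl);
}
```

With this shape, a given cast node can match at most one of the two rules, so the conditions for the T_INT case and for the remaining types can be reviewed and adjusted independently.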
-------------
PR: https://git.openjdk.java.net/jdk/pull/9032