RFR: 8287835: Add support for additional float/double to integral conversion for x86
Sandhya Viswanathan
sviswanathan at openjdk.java.net
Mon Jun 6 14:36:44 UTC 2022
On Sun, 5 Jun 2022 01:42:40 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:
>> Currently the C2 JIT only supports float -> int and double -> long conversion for x86.
>> This PR adds support for the following conversions in the C2 JIT:
>> float -> long, short, byte
>> double -> int, short, byte
>>
>> The performance gain is as follows.
>> Before the patch:
>> Benchmark                                       Mode  Cnt       Score       Error  Units
>> VectorFPtoIntCastOperations.microDouble2Byte   thrpt    3   32367.971 ±  6161.118  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Int    thrpt    3   25825.251 ±  5417.104  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Long   thrpt    3   59641.958 ± 17307.177  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Short  thrpt    3   29641.505 ± 12023.015  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Byte    thrpt    3   16271.224 ±  1523.083  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Int     thrpt    3   59199.994 ± 14357.959  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Long    thrpt    3   17169.197 ±  1738.273  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Short   thrpt    3   14934.139 ±  2329.253  ops/ms
>>
>> After the patch:
>> Benchmark                                       Mode  Cnt       Score       Error  Units
>> VectorFPtoIntCastOperations.microDouble2Byte   thrpt    3  115436.659 ± 21282.364  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Int    thrpt    3   87194.395 ±  9443.106  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Long   thrpt    3   59652.356 ±  7240.721  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Short  thrpt    3  110570.719 ± 10401.620  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Byte    thrpt    3  110028.539 ± 11113.137  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Int     thrpt    3   59469.193 ± 18272.495  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Long    thrpt    3   59897.101 ±  7249.268  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Short   thrpt    3   86167.554 ±  8253.232  ops/ms
>>
>> Please review.
>>
>> Best Regards,
>> Sandhya
>
> src/hotspot/cpu/x86/x86.ad line 1877:
>
>> 1875: if (is_integral_type(bt) && !VM_Version::supports_avx512dq()) {
>> 1876: return false;
>> 1877: }
>
> Overlapping conditions for the same types are confusing.

I will add comments and rephrase the checks to make them clearer.

> src/hotspot/cpu/x86/x86.ad line 1889:
>
>> 1887: return false;
>> 1888: }
>> 1889: if ((bt == T_LONG) && !VM_Version::supports_avx512dq()) {
>
> Again overlapping conditions. So T_LONG requires all three: AVX512, avx512vl and avx512dq?
>
> What about T_INT?

T_INT doesn't need AVX512dq. Float to long conversion (T_LONG) uses evcvttps2qq, which needs AVX512dq.
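
As a rough illustration of how the float -> integral checks could be written without overlapping conditions, here is a minimal standalone sketch. It is not the actual x86.ad code: ElemType, CpuFeatures and float_cast_supported are hypothetical names standing in for BasicType and the VM_Version queries. Only the AVX512dq requirement for T_LONG and the fact that T_INT does not need it come from this thread; the base AVX-512 and avx512vl rules are assumptions made for illustration.

```cpp
// Hypothetical stand-ins for HotSpot's BasicType and VM_Version queries.
enum class ElemType { Byte, Short, Int, Long };

struct CpuFeatures {
    bool avx512f;   // base AVX-512 support
    bool avx512vl;  // AVX-512 instructions on vectors narrower than 512 bits
    bool avx512dq;  // provides the float -> long conversion (evcvttps2qq)
};

// Sketch of a float -> integral vector cast check where each requirement
// is tested exactly once, so no type is covered by two overlapping branches.
bool float_cast_supported(ElemType dst, int size_in_bits, const CpuFeatures& cpu) {
    // float -> int is the pre-existing conversion; per the thread it does not
    // need AVX512dq. Assumption: it needs no AVX-512 feature at all.
    if (dst == ElemType::Int) {
        return true;
    }
    // Assumption: every other integral target needs base AVX-512.
    if (!cpu.avx512f) {
        return false;
    }
    // Assumption: vectors narrower than 512 bits additionally need avx512vl.
    if (size_in_bits < 512 && !cpu.avx512vl) {
        return false;
    }
    // From the thread: float -> long uses evcvttps2qq, which needs AVX512dq.
    if (dst == ElemType::Long && !cpu.avx512dq) {
        return false;
    }
    return true;
}
```

With this shape, T_LONG is the only target type that consults avx512dq, and T_INT never reaches any of the AVX-512 checks.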
-------------
PR: https://git.openjdk.java.net/jdk/pull/9032