RFR: 8287835: Add support for additional float/double to integral conversion for x86 [v5]
Jatin Bhateja
jbhateja at openjdk.java.net
Wed Jun 8 13:25:40 UTC 2022
On Mon, 6 Jun 2022 23:27:23 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>> Currently the C2 JIT only supports float -> int and double -> long conversion for x86.
>> This PR adds support for the following conversions in the C2 JIT:
>> float -> long, short, byte
>> double -> int, short, byte
>>
>> The performance gain is as follows.
>> Before the patch:
>> Benchmark                                      Mode   Cnt       Score       Error  Units
>> VectorFPtoIntCastOperations.microDouble2Byte   thrpt    3   32367.971 ±  6161.118  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Int    thrpt    3   25825.251 ±  5417.104  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Long   thrpt    3   59641.958 ± 17307.177  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Short  thrpt    3   29641.505 ± 12023.015  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Byte    thrpt    3   16271.224 ±  1523.083  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Int     thrpt    3   59199.994 ± 14357.959  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Long    thrpt    3   17169.197 ±  1738.273  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Short   thrpt    3   14934.139 ±  2329.253  ops/ms
>>
>> After the patch:
>> Benchmark                                      Mode   Cnt       Score       Error  Units
>> VectorFPtoIntCastOperations.microDouble2Byte   thrpt    3  115436.659 ± 21282.364  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Int    thrpt    3   87194.395 ±  9443.106  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Long   thrpt    3   59652.356 ±  7240.721  ops/ms
>> VectorFPtoIntCastOperations.microDouble2Short  thrpt    3  110570.719 ± 10401.620  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Byte    thrpt    3  110028.539 ± 11113.137  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Int     thrpt    3   59469.193 ± 18272.495  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Long    thrpt    3   59897.101 ±  7249.268  ops/ms
>> VectorFPtoIntCastOperations.microFloat2Short   thrpt    3   86167.554 ±  8253.232  ops/ms
>>
>> Please review.
>>
>> Best Regards,
>> Sandhya
>
> Sandhya Viswanathan has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix extra space
src/hotspot/cpu/x86/x86.ad line 1892:
> 1890: // Conversion to long in addition needs avx512dq
> 1891: // Need avx512vl for size_in_bits < 512
> 1892: if (is_integral_type(bt) && (bt != T_INT)) {
Why is there a special check for bt != T_INT?
src/hotspot/cpu/x86/x86.ad line 7349:
> 7347: assert(to_elem_bt == T_BYTE, "required");
> 7348: __ evpmovdb($dst$$XMMRegister, $dst$$XMMRegister, vlen_enc);
> 7349: }
We already support the F2I cast on AVX2, and that can be extended to sub-word types using
the signed saturated lane packing instructions (PACKSSDW and PACKSSWB).
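As a rough illustration of what signed saturation does to each lane, here is a scalar Java sketch. The class and method names (SaturatingPackModel, packSsDw, packSsWb) are hypothetical and model only the per-lane arithmetic of the two instructions, not the lane layout or any actual HotSpot code. It also prints the result of Java's own narrowing cast next to the saturated value: the two agree for values already in the target range, but the cast wraps for out-of-range values, so any packing sequence would need to be checked against the cast semantics for those lanes.

```java
final class SaturatingPackModel {
    // Scalar model of one PACKSSDW lane: clamp a signed 32-bit value to the signed 16-bit range.
    static short packSsDw(int v) {
        return (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, v));
    }

    // Scalar model of one PACKSSWB lane: clamp a signed 16-bit value to the signed 8-bit range.
    static byte packSsWb(short v) {
        return (byte) Math.max(Byte.MIN_VALUE, Math.min(Byte.MAX_VALUE, v));
    }

    public static void main(String[] args) {
        int inRange = (int) 1234.7f;      // float -> int truncates toward zero
        int outOfRange = (int) 70000.0f;  // fits in int, but not in short

        // Saturated pack next to Java's narrowing cast for the same int value.
        System.out.println(packSsDw(inRange) + " vs " + (short) inRange);
        System.out.println(packSsDw(outOfRange) + " vs " + (short) outOfRange);
    }
}
```

For the in-range input both columns match; for the out-of-range input the saturated value is Short.MAX_VALUE while the cast wraps around.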
src/hotspot/cpu/x86/x86.ad line 7388:
> 7386: case T_BYTE:
> 7387: __ evpmovsqd($dst$$XMMRegister, $dst$$XMMRegister, vlen_enc);
> 7388: __ evpmovdb($dst$$XMMRegister, $dst$$XMMRegister, vlen_enc);
Sub-word handling can be extended to AVX2 using a packing instruction sequence similar to the one VectorStoreMask uses for quadword lanes.
src/hotspot/cpu/x86/x86.ad line 7391:
> 7389: break;
> 7390: default: assert(false, "%s", type2name(to_elem_bt));
> 7391: }
Please move this to a macro assembler routine named vector_castD2X_evex.
test/hotspot/jtreg/compiler/vectorapi/VectorFPtoIntCastTest.java line 45:
> 43: private static final int COUNT = 16;
> 44: private static final VectorSpecies<Float> fspec512 = FloatVector.SPECIES_512;
> 45: private static final VectorSpecies<Double> dspec512 = DoubleVector.SPECIES_512;
Unused declarations.
test/micro/org/openjdk/bench/jdk/incubator/vector/VectorFPtoIntCastOperations.java line 59:
> 57: @Benchmark
> 58: public IntVector microFloat2Int() {
> 59: return (IntVector)fvec512.convertShape(VectorOperators.F2I, IntVector.SPECIES_512, 0);
We can remove the explicit cast by setting the return type to Vector<Integer>.
This applies to all the benchmark methods.
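A minimal sketch of the suggested shape, assuming the incubating Vector API is enabled (compile and run with --add-modules jdk.incubator.vector). The class name CastFreeConvert and the FVEC512 field are hypothetical stand-ins for the benchmark class and its fvec512 field, and the JMH annotations are left out to keep the example self-contained.

```java
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.Vector;
import jdk.incubator.vector.VectorOperators;

final class CastFreeConvert {
    // Hypothetical stand-in for the benchmark's fvec512 field: every lane holds 1.5f.
    static final FloatVector FVEC512 =
            FloatVector.broadcast(FloatVector.SPECIES_512, 1.5f);

    // convertShape is declared to return Vector<F>, so returning Vector<Integer>
    // needs no downcast to IntVector.
    static Vector<Integer> float2Int() {
        return FVEC512.convertShape(VectorOperators.F2I, IntVector.SPECIES_512, 0);
    }

    public static void main(String[] args) {
        // toIntArray() is available on Vector itself, so the result can still be inspected.
        int[] lanes = float2Int().toIntArray();
        System.out.println(lanes.length + " lanes, first lane = " + lanes[0]);
    }
}
```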
-------------
PR: https://git.openjdk.java.net/jdk/pull/9032