[15] RFR (M): 8239008: C2: Simplify Replicate support for sub-word types on x86
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue Mar 10 17:41:57 UTC 2020
Thanks for the review, Vladimir.
Best regards,
Vladimir Ivanov
On 05.03.2020 03:30, Vladimir Kozlov wrote:
> Thank you for all testing.
>
> I agree with your changes.
>
> Thanks,
> Vladimir K
>
> On 2/18/20 1:08 AM, Vladimir Ivanov wrote:
>>
>>>>> I would like to see the compiler/codegen/Test*Vect.java tests
>>>>> verify these instructions in different CPU configurations.
>>>>
>>>> Can you elaborate, please? compiler/codegen/Test*Vect.java and
>>>> compiler/c2/cr6340864/Test*Vect.java tests do exercise Repl* AD
>>>> instructions and I saw them catching bugs while working on the patch.
>>>
>>> For example, you can run Test*Vect.java tests with your changes on
>>> machines which do or don't have AVX512. Or run with -XX:UseAVX=1,
>>> -XX:UseAVX=2 on an AVX512 machine.
>>
>> I ran the aforementioned tests on a SKL host in the following modes:
>> * -XX:UseAVX=3
>> * -XX:UseAVX=3, but BW/DQ/VL are disabled (mimics KNL)
>> * -XX:UseAVX=2
>> * -XX:UseAVX=1
>> * -XX:UseAVX=0
>> * -XX:UseAVX=0 -XX:UseSSE=2
>>
>> Test results are clean.
>>
>> Additionally, I explicitly disabled AVX512VL and observed various
>> assertion failures. All but one of the failures are unrelated
>> to the changes being proposed. So, that particular configuration
>> (+F/+BW/+DQ/.../-VL) is already broken. Filed JDK-8239331 [1].
>>
>> Considering no such hardware exists (SKL and beyond all have VL
>> and KNL doesn't have BW/DQ), it hasn't been tested before.
>>
>> I propose to proceed with the patch as it is now and do the cleanup
>> (JDK-8239331) later.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8239331
>>
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>> [1]
>>>> VEX.128.66.0F38.W0 78 /r
>>>> VPBROADCASTB xmm1, xmm2/m8
>>>> AVX2
>>>>
>>>> VEX.256.66.0F38.W0 78 /r
>>>> VPBROADCASTB ymm1, xmm2/m8
>>>> AVX2
>>>>
>>>> EVEX.128.66.0F38.W0 78 /r
>>>> VPBROADCASTB xmm1{k1}{z}, xmm2/m8
>>>> AVX512VL AVX512BW
>>>>
>>>> EVEX.256.66.0F38.W0 78 /r
>>>> VPBROADCASTB ymm1{k1}{z}, xmm2/m8
>>>> AVX512VL AVX512BW
>>>>
>>>> EVEX.512.66.0F38.W0 78 /r
>>>> VPBROADCASTB zmm1{k1}{z}, xmm2/m8
>>>> AVX512BW
>>>>
>>>> [2]
>>>> http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/cpu/x86/assembler_x86.cpp#l7881
>>>>
>>>>
>>>> [3]
>>>> http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/cpu/x86/assembler_x86.cpp#l7078
>>>>
>>>>
>>>>> On 2/13/20 7:55 AM, Vladimir Ivanov wrote:
>>>>>> http://cr.openjdk.java.net/~vlivanov/8239008/webrev.00/
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8239008
>>>>>>
>>>>>> Simplify Replicate support for sub-word types on x86 based on the
>>>>>> following observations:
>>>>>> * 512-bit vectors of sub-word element types are supported only
>>>>>> on AVX512BW-capable hardware [1];
>>>>>> * VBROADCASTS[SD]/VPBROADCAST[BWDQ] have been available since AVX/AVX2.
>>>>>>
>>>>>> Also, fixed asserts in VBROADCASTS[SD] according to the manual:
>>>>>> * reg-to-reg variants are part of AVX2 (while mem-to-reg variants
>>>>>> were introduced in AVX);
>>>>>> * VBROADCASTSD doesn't have a 128-bit variant.
>>>>>>
>>>>>> Testing: hs-precheckin-comp,hs-tier1,hs-tier2
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Best regards,
>>>>>> Vladimir Ivanov
>>>>>>
>>>>>> [1]
>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/cpu/x86/x86.ad#l1524
>>>>>>
>>>>>>
>>>>>> const int Matcher::vector_width_in_bytes(BasicType bt) {
>>>>>>   ...
>>>>>>   if (UseAVX > 2 && (bt == T_BYTE || bt == T_SHORT || bt == T_CHAR))
>>>>>>     size = (VM_Version::supports_avx512bw()) ? 64 : 32;
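
For readers following the quoted excerpt, the sub-word branch can be modeled in isolation. The following is a minimal standalone sketch, not HotSpot code: ElemType, is_subword and subword_vector_width_in_bytes are hypothetical names introduced here for illustration, and a return value of 0 only marks cases that the excerpt elides, so nothing beyond the quoted condition is modeled.

```cpp
#include <cassert>

// Hypothetical stand-in for HotSpot's BasicType; only the values needed here.
enum class ElemType { Byte, Short, Char, Int };

// True for the element types the quoted condition treats as sub-word.
constexpr bool is_subword(ElemType bt) {
  return bt == ElemType::Byte || bt == ElemType::Short || bt == ElemType::Char;
}

// Models only the quoted branch: with UseAVX > 2, sub-word vectors are
// 64 bytes wide when AVX512BW is present and 32 bytes wide otherwise.
// Returns 0 for every case the excerpt does not show (assumption: 0 is
// just a "not modeled" marker, not a value HotSpot would return).
constexpr int subword_vector_width_in_bytes(int use_avx, ElemType bt,
                                            bool has_avx512bw) {
  return (use_avx > 2 && is_subword(bt)) ? (has_avx512bw ? 64 : 32) : 0;
}

// Compile-time checks of the two outcomes named in the excerpt.
static_assert(subword_vector_width_in_bytes(3, ElemType::Byte, true) == 64,
              "AVX512BW present: 512-bit sub-word vectors");
static_assert(subword_vector_width_in_bytes(3, ElemType::Short, false) == 32,
              "AVX512BW absent: sub-word vectors capped at 256 bits");
```

Under this model, a KNL-like configuration (UseAVX=3 without BW) yields 32 bytes for byte, short and char vectors, which matches the observation that 512-bit sub-word vectors appear only on AVX512BW-capable hardware.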