RFR (XXS) 8234610: MaxVectorSize set wrongly when UseAVX=3 is specified after JDK-8221092
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue Nov 26 16:20:48 UTC 2019
Pushed [1]
Best regards,
Vladimir Ivanov
[1] http://hg.openjdk.java.net/jdk/jdk/rev/dff8053bdb74
On 26.11.2019 19:08, Viswanathan, Sandhya wrote:
> Hi Vladimir,
>
> Please do sponsor the patch. I am not a committer.
>
> Best Regards,
> Sandhya
>
>
> -----Original Message-----
> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
> Sent: Tuesday, November 26, 2019 1:26 AM
> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Vladimir Kozlov <vladimir.kozlov at oracle.com>
> Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: RFR (XXS) 8234610: MaxVectorSize set wrongly when UseAVX=3 is specified after JDK-8221092
>
>
>> http://cr.openjdk.java.net/~sviswanathan/8234610/webrev.02/
>
> Looks good.
>
> Testing results (hs-precheckin-comp,hs-tier1,hs-tier2) are clean.
>
> Best regards,
> Vladimir Ivanov
>
>> -----Original Message-----
>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>> Sent: Monday, November 25, 2019 12:23 PM
>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Vladimir
>> Kozlov <vladimir.kozlov at oracle.com>
>> Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
>> Subject: Re: RFR (XXS) 8234610: MaxVectorSize set wrongly when
>> UseAVX=3 is specified after JDK-8221092
>>
>>
>>> From the source code, it looks like os_supports_avx_vectors() does two different things based on the AVX level.
>>> If AVX=3, it checks if save/restore of zmm worked properly.
>>> If AVX=1 or 2, it checks is save/restore of ymm worked properly.
>>> Since only save/restore of zmm is done conditionally for Skylake, the problem is only with AVX=3. And that is what this patch is trying to fix.
>>
>> Ok, now I see how it works:
>>
>> os_supports_avx_vectors() uses supports_evex() and supports_avx() to dispatch which inspect supported CPU features:
>>
>> static bool supports_avx() { return (_features & CPU_AVX) != 0; }
>>
>> static bool supports_evex() { return (_features & CPU_AVX512F) !=
>> 0; }
>>
>> But VM_Version::get_processor_features() clears detected features depending on UseAVX level.
>>
>> So, as you described os_supports_avx_vectors() goes through different paths between UseAVX=3 and UseAVX=1/2.
>>
>>> The rest we are entering 8221092 discussion.
>>> I think Vivek's intent there was not do any AVX 512 instructions at all if AVX < 3 to overcome performance regressions observed by Scott Oaks.
>>
>> My best guess is it helped analysis by eliminating all problematic instructions.
>>
>> Anyway, I'm fine with leaving that code. (Though it would be nice to
>> simply get rid of it.)
>>
>> It looks like all the code after start_simd_check (and checks to
>> legacy_save_restore) can go under use_evex check. And it'll make the code clearer (I spent too much time reasoning about interactions between use_evex and FLAG_IS_DEFAULT(UseAVX) you added).
>>
>> Otherwise, looks good.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>> -----Original Message-----
>>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>>> Sent: Monday, November 25, 2019 10:15 AM
>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Vladimir
>>> Kozlov <vladimir.kozlov at oracle.com>
>>> Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
>>> Subject: Re: RFR (XXS) 8234610: MaxVectorSize set wrongly when
>>> UseAVX=3 is specified after JDK-8221092
>>>
>>> Thanks for the clarifications, Sandhya!
>>>
>>> Some more questions inlined.
>>>> The bug happens like below:
>>>>
>>>> * User specifies -XX:UseAVX=3 on command line.
>>>> * On Skylake platform due to the following lines:
>>>> __ lea(rsi, Address(rbp, in_bytes(VM_Version::std_cpuid1_offset())));
>>>> __ movl(rax, Address(rsi, 0));
>>>> __ cmpl(rax, 0x50654); // If it is Skylake
>>>> __ jcc(Assembler::equal, legacy_setup);
>>>> The zmm registers are not saved/restored.
>>>> * This results in os_supports_avx_vectors() returning false when called from VM_Version::get_processor_features():
>>>> int max_vector_size = 0;
>>>> if (UseSSE < 2) {
>>>> // Vectors (in XMM) are only supported with SSE2+
>>>> // SSE is always 2 on x64.
>>>> max_vector_size = 0;
>>>> } else if (UseAVX == 0 || !os_supports_avx_vectors()) {
>>>> // 16 byte vectors (in XMM) are supported with SSE2+
>>>> max_vector_size = 16; ====> This is the point where max_vector_size is set to 16
>>>> } else else if (UseAVX == 1 || UseAVX == 2) {
>>>> ...
>>>> }
>>>> * And so we get UseAVX=3 and max_vector_size = 16.
>>>>
>>>> So the fix is to save/restore zmm registers when the flag is not default and user specifies UseAVX > 2.
>>>
>>> Unfortunately, I'm still confused :-(
>>>
>>> If it turns os_supports_avx_vectors() == false, why -XX:UseAVX=1 and
>>> -XX:UseAVX=2 aren't affected on Skylake as well?
>>>
>>>> On your question regarding why 8221092 needs the code you conditionally exclude:
>>>> This was introduced so as not to do any AVX512 execution if not required. ZMM register save/restore uses AVX512 instruction.
>>>
>>> Are you talking about completely avoiding execution of AVX512 instructions on Skylakes if UseAVX < 3?
>>>
>>> Considering it is in generate_get_cpu_info() which is run only once early at startup, what kind of effects you intend to avoid?
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>>>
>>>> Best Regards,
>>>> Sandhya
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>>>> Sent: Friday, November 22, 2019 4:39 AM
>>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; hotspot
>>>> compiler <hotspot-compiler-dev at openjdk.java.net>
>>>> Subject: Re: RFR (XXS) 8234610: MaxVectorSize set wrongly when
>>>> UseAVX=3 is specified after JDK-8221092
>>>>
>>>> Hi Sandhya,
>>>>
>>>>> On Skylake platform the JVM by default sets the UseAVX level to 2 and accordingly sets MaxVectorSize to 32 bytes as per JDK-8221092<https://bugs.openjdk.java.net/browse/JDK-8221092>.
>>>>>
>>>>> When the user explicitly wants to use AVX3 on Skylake platform, they need to invoke the JVM with -XX:UseAVX=3 command line argument.
>>>>> This should automatically result in MaxVectorSize being set to 64 bytes.
>>>>>
>>>>> However post JDK-8221092<https://bugs.openjdk.java.net/browse/JDK-8221092>, when -XX:UseAVX=3 is given on command line, the MaxVectorSize is being wrongly set to 16 bytes.
>>>>
>>>> Please, elaborate how it happens and how legacy_setup affects it?
>>>>
>>>> Why 8221092 needs the code you conditionally exclude.
>>>> Why the following isn't enough?
>>>>
>>>> if (FLAG_IS_DEFAULT(UseAVX)) {
>>>> FLAG_SET_DEFAULT(UseAVX, use_avx_limit);
>>>> + if (is_intel_family_core() && _model == CPU_MODEL_SKYLAKE &&
>>>> _stepping < 5) {
>>>> + FLAG_SET_DEFAULT(UseAVX, 2); //Set UseAVX=2 for Skylake
>>>> + }
>>>> } else if (UseAVX > use_avx_limit) {
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
More information about the hotspot-compiler-dev
mailing list