RFR (XXS) 8234610: MaxVectorSize set wrongly when UseAVX=3 is specified after JDK-8221092

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Tue Nov 26 16:20:48 UTC 2019


Pushed [1]

Best regards,
Vladimir Ivanov

[1] http://hg.openjdk.java.net/jdk/jdk/rev/dff8053bdb74

On 26.11.2019 19:08, Viswanathan, Sandhya wrote:
> Hi Vladimir,
> 
> Please do sponsor the patch. I am not a committer.
> 
> Best Regards,
> Sandhya
> 
> 
> -----Original Message-----
> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
> Sent: Tuesday, November 26, 2019 1:26 AM
> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Vladimir Kozlov <vladimir.kozlov at oracle.com>
> Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: RFR (XXS) 8234610: MaxVectorSize set wrongly when UseAVX=3 is specified after JDK-8221092
> 
> 
>> http://cr.openjdk.java.net/~sviswanathan/8234610/webrev.02/
> 
> Looks good.
> 
> Testing results (hs-precheckin-comp,hs-tier1,hs-tier2) are clean.
> 
> Best regards,
> Vladimir Ivanov
> 
>> -----Original Message-----
>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>> Sent: Monday, November 25, 2019 12:23 PM
>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Vladimir
>> Kozlov <vladimir.kozlov at oracle.com>
>> Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
>> Subject: Re: RFR (XXS) 8234610: MaxVectorSize set wrongly when
>> UseAVX=3 is specified after JDK-8221092
>>
>>
>>>    From the source code, it looks like os_supports_avx_vectors() does two different things based on the AVX level.
>>> If AVX=3, it checks if save/restore of zmm worked properly.
>>> If AVX=1 or 2, it checks is save/restore of ymm worked properly.
>>> Since only save/restore of zmm is done conditionally for Skylake, the problem is only with AVX=3.  And that is what this patch is trying to fix.
>>
>> Ok, now I see how it works:
>>
>> os_supports_avx_vectors() uses supports_evex() and supports_avx() to dispatch which inspect supported CPU features:
>>
>>      static bool supports_avx()      { return (_features & CPU_AVX) != 0; }
>>
>>      static bool supports_evex()     { return (_features & CPU_AVX512F) !=
>> 0; }
>>
>> But VM_Version::get_processor_features() clears detected features depending on UseAVX level.
>>
>> So, as you described os_supports_avx_vectors() goes through different paths between UseAVX=3 and UseAVX=1/2.
>>
>>> The rest we are entering 8221092 discussion.
>>> I think Vivek's intent there was not do any AVX 512 instructions at all if AVX < 3 to overcome performance regressions observed by Scott Oaks.
>>
>> My best guess is it helped analysis by eliminating all problematic instructions.
>>
>> Anyway, I'm fine with leaving that code. (Though it would be nice to
>> simply get rid of it.)
>>
>> It looks like all the code after start_simd_check (and checks to
>> legacy_save_restore) can go under use_evex check. And it'll make the code clearer (I spent too much time reasoning about interactions between use_evex and FLAG_IS_DEFAULT(UseAVX) you added).
>>
>> Otherwise, looks good.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>> -----Original Message-----
>>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>>> Sent: Monday, November 25, 2019 10:15 AM
>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Vladimir
>>> Kozlov <vladimir.kozlov at oracle.com>
>>> Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
>>> Subject: Re: RFR (XXS) 8234610: MaxVectorSize set wrongly when
>>> UseAVX=3 is specified after JDK-8221092
>>>
>>> Thanks for the clarifications, Sandhya!
>>>
>>> Some more questions inlined.
>>>> The bug happens like below:
>>>>
>>>> * User specifies -XX:UseAVX=3 on command line.
>>>> * On Skylake platform due to the following lines:
>>>>         __ lea(rsi, Address(rbp, in_bytes(VM_Version::std_cpuid1_offset())));
>>>>         __ movl(rax, Address(rsi, 0));
>>>>         __ cmpl(rax, 0x50654);              // If it is Skylake
>>>>         __ jcc(Assembler::equal, legacy_setup);
>>>>        The zmm registers are not saved/restored.
>>>> * This results in os_supports_avx_vectors() returning false when called from VM_Version::get_processor_features():
>>>>       int max_vector_size = 0;
>>>>       if (UseSSE < 2) {
>>>>         // Vectors (in XMM) are only supported with SSE2+
>>>>         // SSE is always 2 on x64.
>>>>         max_vector_size = 0;
>>>>       } else if (UseAVX == 0 || !os_supports_avx_vectors()) {
>>>>         // 16 byte vectors (in XMM) are supported with SSE2+
>>>>         max_vector_size = 16;                              ====> This is the point where max_vector_size is set to 16
>>>>       } else else if (UseAVX == 1 || UseAVX == 2) {
>>>>          ...
>>>>       }
>>>> * And so we get UseAVX=3 and max_vector_size = 16.
>>>>
>>>> So the fix is to save/restore zmm registers when the flag is not default and user specifies UseAVX > 2.
>>>
>>> Unfortunately, I'm still confused :-(
>>>
>>> If it turns os_supports_avx_vectors() == false, why -XX:UseAVX=1 and
>>> -XX:UseAVX=2 aren't affected on Skylake as well?
>>>
>>>> On your question regarding why 8221092 needs the code you conditionally exclude:
>>>> This was introduced so as not to do any AVX512 execution if not required.  ZMM register save/restore uses AVX512 instruction.
>>>
>>> Are you talking about completely avoiding execution of AVX512 instructions on Skylakes if UseAVX < 3?
>>>
>>> Considering it is in generate_get_cpu_info() which is run only once early at startup, what kind of effects you intend to avoid?
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>>>
>>>> Best Regards,
>>>> Sandhya
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>>>> Sent: Friday, November 22, 2019 4:39 AM
>>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; hotspot
>>>> compiler <hotspot-compiler-dev at openjdk.java.net>
>>>> Subject: Re: RFR (XXS) 8234610: MaxVectorSize set wrongly when
>>>> UseAVX=3 is specified after JDK-8221092
>>>>
>>>> Hi Sandhya,
>>>>
>>>>> On Skylake platform the JVM by default sets the UseAVX level to 2 and accordingly sets MaxVectorSize to 32 bytes as per JDK-8221092<https://bugs.openjdk.java.net/browse/JDK-8221092>.
>>>>>
>>>>> When the user explicitly wants to use AVX3 on Skylake platform, they need to invoke the JVM with -XX:UseAVX=3 command line argument.
>>>>> This should automatically result in MaxVectorSize being set to 64 bytes.
>>>>>
>>>>> However post JDK-8221092<https://bugs.openjdk.java.net/browse/JDK-8221092>, when -XX:UseAVX=3 is given on command line, the MaxVectorSize is being wrongly set to 16 bytes.
>>>>
>>>> Please, elaborate how it happens and how legacy_setup affects it?
>>>>
>>>> Why 8221092 needs the code you conditionally exclude.
>>>> Why the following isn't enough?
>>>>
>>>>         if (FLAG_IS_DEFAULT(UseAVX)) {
>>>>           FLAG_SET_DEFAULT(UseAVX, use_avx_limit);
>>>> +    if (is_intel_family_core() && _model == CPU_MODEL_SKYLAKE &&
>>>> _stepping < 5) {
>>>> +      FLAG_SET_DEFAULT(UseAVX, 2);  //Set UseAVX=2 for Skylake
>>>> +    }
>>>>         } else if (UseAVX > use_avx_limit) {
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>


More information about the hotspot-compiler-dev mailing list