RFR (XXS) 8234610: MaxVectorSize set wrongly when UseAVX=3 is specified after JDK-8221092

Viswanathan, Sandhya sandhya.viswanathan at intel.com
Mon Nov 25 18:44:48 UTC 2019


Hi Vladimir,

From the source code, it looks like os_supports_avx_vectors() does two different things based on the AVX level. 
If AVX=3, it checks if save/restore of zmm worked properly.
If AVX=1 or 2, it checks is save/restore of ymm worked properly.
Since only save/restore of zmm is done conditionally for Skylake, the problem is only with AVX=3.  And that is what this patch is trying to fix.

The rest we are entering 8221092 discussion. 
I think Vivek's intent there was not do any AVX 512 instructions at all if AVX < 3 to overcome performance regressions observed by Scott Oaks.

Best Regards,
Sandhya

-----Original Message-----
From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com> 
Sent: Monday, November 25, 2019 10:15 AM
To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Vladimir Kozlov <vladimir.kozlov at oracle.com>
Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR (XXS) 8234610: MaxVectorSize set wrongly when UseAVX=3 is specified after JDK-8221092

Thanks for the clarifications, Sandhya!

Some more questions inlined.
> The bug happens like below:
> 
> * User specifies -XX:UseAVX=3 on command line.
> * On Skylake platform due to the following lines:
>      __ lea(rsi, Address(rbp, in_bytes(VM_Version::std_cpuid1_offset())));
>      __ movl(rax, Address(rsi, 0));
>      __ cmpl(rax, 0x50654);              // If it is Skylake
>      __ jcc(Assembler::equal, legacy_setup);
>     The zmm registers are not saved/restored.
> * This results in os_supports_avx_vectors() returning false when called from VM_Version::get_processor_features():
>    int max_vector_size = 0;
>    if (UseSSE < 2) {
>      // Vectors (in XMM) are only supported with SSE2+
>      // SSE is always 2 on x64.
>      max_vector_size = 0;
>    } else if (UseAVX == 0 || !os_supports_avx_vectors()) {
>      // 16 byte vectors (in XMM) are supported with SSE2+
>      max_vector_size = 16;                              ====> This is the point where max_vector_size is set to 16
>    } else else if (UseAVX == 1 || UseAVX == 2) {
>       ...
>    }
> * And so we get UseAVX=3 and max_vector_size = 16.
> 
> So the fix is to save/restore zmm registers when the flag is not default and user specifies UseAVX > 2.

Unfortunately, I'm still confused :-(

If it turns os_supports_avx_vectors() == false, why -XX:UseAVX=1 and
-XX:UseAVX=2 aren't affected on Skylake as well?

> On your question regarding why 8221092 needs the code you conditionally exclude:
> This was introduced so as not to do any AVX512 execution if not required.  ZMM register save/restore uses AVX512 instruction.

Are you talking about completely avoiding execution of AVX512 instructions on Skylakes if UseAVX < 3?

Considering it is in generate_get_cpu_info() which is run only once early at startup, what kind of effects you intend to avoid?

Best regards,
Vladimir Ivanov

> 
> Best Regards,
> Sandhya
> 
> 
> -----Original Message-----
> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
> Sent: Friday, November 22, 2019 4:39 AM
> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; hotspot 
> compiler <hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: RFR (XXS) 8234610: MaxVectorSize set wrongly when 
> UseAVX=3 is specified after JDK-8221092
> 
> Hi Sandhya,
> 
>> On Skylake platform the JVM by default sets the UseAVX level to 2 and accordingly sets MaxVectorSize to 32 bytes as per JDK-8221092<https://bugs.openjdk.java.net/browse/JDK-8221092>.
>>
>> When the user explicitly wants to use AVX3 on Skylake platform, they need to invoke the JVM with -XX:UseAVX=3 command line argument.
>> This should automatically result in MaxVectorSize being set to 64 bytes.
>>
>> However post JDK-8221092<https://bugs.openjdk.java.net/browse/JDK-8221092>, when -XX:UseAVX=3 is given on command line, the MaxVectorSize is being wrongly set to 16 bytes.
> 
> Please, elaborate how it happens and how legacy_setup affects it?
> 
> Why 8221092 needs the code you conditionally exclude.
> Why the following isn't enough?
> 
>      if (FLAG_IS_DEFAULT(UseAVX)) {
>        FLAG_SET_DEFAULT(UseAVX, use_avx_limit);
> +    if (is_intel_family_core() && _model == CPU_MODEL_SKYLAKE &&
> _stepping < 5) {
> +      FLAG_SET_DEFAULT(UseAVX, 2);  //Set UseAVX=2 for Skylake
> +    }
>      } else if (UseAVX > use_avx_limit) {
> 
> Best regards,
> Vladimir Ivanov
> 


More information about the hotspot-compiler-dev mailing list