RFR: Newer AMD 17h (EPYC) Processor family defaults

Rohit Arul Raj rohitarulraj at gmail.com
Fri Sep 1 09:34:52 UTC 2017


On Fri, Sep 1, 2017 at 12:49 PM, Vladimir Kozlov
<vladimir.kozlov at oracle.com> wrote:
> On 8/31/17 10:14 PM, Rohit Arul Raj wrote:
>>
>> Hello Vladimir,
>>
>>> But it also mean that AMD will have to do Java testing for this new
>>> platform
>>> and be responsible for it.
>>
>>
>> Can you please elaborate on this a little more?
>> What all Java test suites would you like us to test from our end?
>
>
> First, I am talking only about testing on your platform. In this case it is
> AMD 17h.
>
> You need to build and use fastdebug JVM for testing: configure
> --with-debug-level=fastdebug
>
> You need to make sure to run hotspot and jdk jtreg tests. At least next set
> of tests:
>
> make test JOBS=1 TEST_JOBS=1 TEST="hotspot_compiler hotspot_gc
> hotspot_runtime hotspot_serviceability hotspot_misc jdk_util jdk_lang"
>
> It will take time. You can try to increase JOBS=1 and TEST_JOBS=1 numbers to
> run tests in parallel but depending on memory and swap sizes it may not
> work.

Yes, We will do that.

> In addition to that would be nice if you track performance changes with
> specjvm2008 and specjbb2015 on your cpu to avoid regression when you apply
> new changes or pull changes from OpenJDK.

We do run SPECjbb2015.
Regarding SPECjvm2008, we tried the base run but the results are
pretty inconsistent. The base throughput varies from run to run
(~30%).

This is the command we use to generate the numbers
(startup.compiler.sunflow & compiler.sunflow have been disabled). Is
there any benchmark option we may be missing?

java -jar SPECjvm2008.jar  startup.helloworld
startup.compiler.compiler  startup.compress startup.crypto.aes
startup.crypto.rsa startup.crypto.signverify startup.mpegaudio
startup.scimark.fft startup.scimark.lu startup.scimark.monte_carlo
startup.scimark.sor  startup.scimark.sparse startup.serial
startup.sunflow startup.xml.transform  startup.xml.validation
compiler.compiler compress crypto.aes crypto.rsa  crypto.signverify
derby  mpegaudio scimark.fft.large scimark.lu.large scimark.sor.large
scimark.sparse.large  scimark.fft.small scimark.lu.small
scimark.sor.small scimark.sparse.small scimark.monte_carlo serial
sunflow xml.transform xml.validation

Regards,
Rohit

> If you have questions, please ask.
>
>>
>>> In a future we may forward this CPU related problems to you to analyze
>>> and
>>> fix.
>>
>>
>> Sure, looking forward to it.
>
>
> Best regards,
> Vladimir
>
>
>>
>> Regards,
>> Rohit
>>
>>> Regards,
>>> Vladimir
>>>
>>>
>>> On 8/31/17 2:31 PM, David Holmes wrote:
>>>>
>>>>
>>>> Hi Rohit,
>>>>
>>>> I think the patch needs updating for jdk10 as I already see a lot of
>>>> logic
>>>> around UseSHA in vm_version_x86.cpp.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 1/09/2017 1:11 AM, Rohit Arul Raj wrote:
>>>>>
>>>>>
>>>>> On Thu, Aug 31, 2017 at 5:59 PM, David Holmes <david.holmes at oracle.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Rohit,
>>>>>>
>>>>>> On 31/08/2017 7:03 PM, Rohit Arul Raj wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I would like an volunteer to review this patch (openJDK9) which sets
>>>>>>> flag/ISA defaults for newer AMD 17h (EPYC) processor and help us with
>>>>>>> the commit process.
>>>>>>>
>>>>>>> Webrev:
>>>>>>>
>>>>>>>
>>>>>>> https://www.dropbox.com/sh/08bsxaxupg8kbam/AADurTXLGIZ6C-tiIAi_Glyka?dl=0
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Unfortunately patches can not be accepted from systems outside the
>>>>>> OpenJDK
>>>>>> infrastructure and ...
>>>>>>
>>>>>>> I have also attached the patch (hg diff -g) for reference.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ... unfortunately patches tend to get stripped by the mail servers. If
>>>>>> the
>>>>>> patch is small please include it inline. Otherwise you will need to
>>>>>> find
>>>>>> an
>>>>>> OpenJDK Author who can host it for you on cr.openjdk.java.net.
>>>>>>
>>>>>
>>>>>>> 3) I have done regression testing using jtreg ($make default) and
>>>>>>> didnt find any regressions.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Sounds good, but until I see the patch it is hard to comment on
>>>>>> testing
>>>>>> requirements.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>
>>>>>
>>>>>
>>>>> Thanks David,
>>>>> Yes, it's a small patch.
>>>>>
>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.cpp
>>>>> b/src/cpu/x86/vm/vm_version_x86.cpp
>>>>> --- a/src/cpu/x86/vm/vm_version_x86.cpp
>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.cpp
>>>>> @@ -1051,6 +1051,22 @@
>>>>>          }
>>>>>          FLAG_SET_DEFAULT(UseSSE42Intrinsics, false);
>>>>>        }
>>>>> +    if (supports_sha()) {
>>>>> +      if (FLAG_IS_DEFAULT(UseSHA)) {
>>>>> +        FLAG_SET_DEFAULT(UseSHA, true);
>>>>> +      }
>>>>> +    } else if (UseSHA || UseSHA1Intrinsics || UseSHA256Intrinsics ||
>>>>> UseSHA512Intrinsics) {
>>>>> +      if (!FLAG_IS_DEFAULT(UseSHA) ||
>>>>> +          !FLAG_IS_DEFAULT(UseSHA1Intrinsics) ||
>>>>> +          !FLAG_IS_DEFAULT(UseSHA256Intrinsics) ||
>>>>> +          !FLAG_IS_DEFAULT(UseSHA512Intrinsics)) {
>>>>> +        warning("SHA instructions are not available on this CPU");
>>>>> +      }
>>>>> +      FLAG_SET_DEFAULT(UseSHA, false);
>>>>> +      FLAG_SET_DEFAULT(UseSHA1Intrinsics, false);
>>>>> +      FLAG_SET_DEFAULT(UseSHA256Intrinsics, false);
>>>>> +      FLAG_SET_DEFAULT(UseSHA512Intrinsics, false);
>>>>> +    }
>>>>>
>>>>>        // some defaults for AMD family 15h
>>>>>        if ( cpu_family() == 0x15 ) {
>>>>> @@ -1072,11 +1088,43 @@
>>>>>        }
>>>>>
>>>>>    #ifdef COMPILER2
>>>>> -    if (MaxVectorSize > 16) {
>>>>> -      // Limit vectors size to 16 bytes on current AMD cpus.
>>>>> +    if (cpu_family() < 0x17 && MaxVectorSize > 16) {
>>>>> +      // Limit vectors size to 16 bytes on AMD cpus < 17h.
>>>>>          FLAG_SET_DEFAULT(MaxVectorSize, 16);
>>>>>        }
>>>>>    #endif // COMPILER2
>>>>> +
>>>>> +    // Some defaults for AMD family 17h
>>>>> +    if ( cpu_family() == 0x17 ) {
>>>>> +      // On family 17h processors use XMM and UnalignedLoadStores for
>>>>> Array Copy
>>>>> +      if (supports_sse2() && FLAG_IS_DEFAULT(UseXMMForArrayCopy)) {
>>>>> +        UseXMMForArrayCopy = true;
>>>>> +      }
>>>>> +      if (supports_sse2() && FLAG_IS_DEFAULT(UseUnalignedLoadStores))
>>>>> {
>>>>> +        UseUnalignedLoadStores = true;
>>>>> +      }
>>>>> +      if (supports_bmi2() && FLAG_IS_DEFAULT(UseBMI2Instructions)) {
>>>>> +        UseBMI2Instructions = true;
>>>>> +      }
>>>>> +      if (MaxVectorSize > 32) {
>>>>> +        FLAG_SET_DEFAULT(MaxVectorSize, 32);
>>>>> +      }
>>>>> +      if (UseSHA) {
>>>>> +        if (FLAG_IS_DEFAULT(UseSHA512Intrinsics)) {
>>>>> +          FLAG_SET_DEFAULT(UseSHA512Intrinsics, false);
>>>>> +        } else if (UseSHA512Intrinsics) {
>>>>> +          warning("Intrinsics for SHA-384 and SHA-512 crypto hash
>>>>> functions not available on this CPU.");
>>>>> +          FLAG_SET_DEFAULT(UseSHA512Intrinsics, false);
>>>>> +        }
>>>>> +      }
>>>>> +#ifdef COMPILER2
>>>>> +      if (supports_sse4_2()) {
>>>>> +        if (FLAG_IS_DEFAULT(UseFPUForSpilling)) {
>>>>> +          FLAG_SET_DEFAULT(UseFPUForSpilling, true);
>>>>> +        }
>>>>> +      }
>>>>> +#endif
>>>>> +    }
>>>>>      }
>>>>>
>>>>>      if( is_intel() ) { // Intel cpus specific settings
>>>>> diff --git a/src/cpu/x86/vm/vm_version_x86.hpp
>>>>> b/src/cpu/x86/vm/vm_version_x86.hpp
>>>>> --- a/src/cpu/x86/vm/vm_version_x86.hpp
>>>>> +++ b/src/cpu/x86/vm/vm_version_x86.hpp
>>>>> @@ -513,6 +513,16 @@
>>>>>            result |= CPU_LZCNT;
>>>>>          if (_cpuid_info.ext_cpuid1_ecx.bits.sse4a != 0)
>>>>>            result |= CPU_SSE4A;
>>>>> +      if(_cpuid_info.sef_cpuid7_ebx.bits.bmi2 != 0)
>>>>> +        result |= CPU_BMI2;
>>>>> +      if(_cpuid_info.std_cpuid1_edx.bits.ht != 0)
>>>>> +        result |= CPU_HT;
>>>>> +      if(_cpuid_info.sef_cpuid7_ebx.bits.adx != 0)
>>>>> +        result |= CPU_ADX;
>>>>> +      if (_cpuid_info.sef_cpuid7_ebx.bits.sha != 0)
>>>>> +        result |= CPU_SHA;
>>>>> +      if (_cpuid_info.std_cpuid1_ecx.bits.fma != 0)
>>>>> +        result |= CPU_FMA;
>>>>>        }
>>>>>        // Intel features.
>>>>>        if(is_intel()) {
>>>>>
>>>>> Regards,
>>>>> Rohit
>>>>>
>>>
>


More information about the hotspot-dev mailing list