Upgrading gcc arch ?
Thomas Stüfe
thomas.stuefe at gmail.com
Fri Oct 20 17:12:49 UTC 2017
On Fri, Oct 20, 2017 at 1:37 PM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
> On 2017-10-20 12:56, Thomas Stüfe wrote:
>
>>
>>
>> On Fri, Oct 20, 2017 at 12:11 PM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>>
>> On 2017-10-20 11:19, David Holmes wrote:
>>
>> bcc'ing the discuss list
>>
>> On 20/10/2017 6:19 PM, Laurent Bourgès wrote:
>>
>> Hi,
>>
>> I wonder if it is time to compile the C/C++ code for a more recent CPU
>> architecture (x86-64 is quite old: only SSE?) to benefit from the
>> performance optimizations offered by recent CPUs and compilers (AVX...).
>>
>>
>> The focus in hotspot is on JIT generated code which does take
>> advantage of such optimizations based on the runtime CPU capabilities.
>>
>> Is there specific C code in the JDK that you think would benefit
>> from them?
>>
>>
>> If there is specific code that performs much better with some newer
>> features, we could use the function multiversioning feature in gcc.
>> E.g.:
>> __attribute__((target_clones("sse4.2","sse3","default")))
>> void stream_function(...) { ... }
>> Negative impact on size, so as David says, benchmark first.
>>
>>
>> But how would this help with gcc-specific optimizations? You can
>> provide your own implementation, but I thought the idea was to let gcc do
>> the optimization work via -mtune. We would still have one global -mtune
>> setting for the compilation unit, right?
>>
>
> target_clones attribute "is used to specify that a function be cloned into
> multiple versions compiled with different target options than specified on
> the command line."
> In the example above gcc generates 3 versions of the function. You can also do:
> __attribute__((target_clones("arch=znver1","arch=skylake", "default")))
>
> So you get:
> [rehn at rehn-lt ~]$ nm a.out | grep stream_function
> 00000000004009e0 T stream_function
> 0000000000400bc0 t stream_function.arch_skylake.1
> 0000000000400b90 t stream_function.arch_znver1.0
> 0000000000400bf0 i stream_function.ifunc
> 0000000000400bf0 W stream_function.resolver
>
> /Robbin
>
>
Very interesting, thanks for the pointer. I did not know that was possible.
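
For anyone who wants to try this, here is a minimal sketch of a multiversioned
function. The function name add_arrays, the MULTIVERSION macro and the choice
of "avx2"/"sse4.2" targets are my own illustration and not taken from the JDK
sources. The preprocessor guard is an assumption about where target_clones is
usable (GCC 6 or later, x86-64, glibc with ifunc support); elsewhere the macro
expands to nothing and a single plain version is built.

```c
#include <stddef.h>
#include <stdlib.h>   /* on glibc this also defines __GLIBC__ */

/* Assumption: target_clones is only enabled for GCC >= 6 on x86-64 with
   glibc, since it relies on ifunc support. Otherwise build one version. */
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ >= 6) && \
    defined(__x86_64__) && defined(__GLIBC__)
#define MULTIVERSION __attribute__((target_clones("avx2", "sse4.2", "default")))
#else
#define MULTIVERSION
#endif

/* Hypothetical stand-in for stream_function: element-wise sum of two
   int arrays. With the attribute active, gcc emits one clone per listed
   target plus a resolver that selects a clone when the program is loaded. */
MULTIVERSION
void add_arrays(int *dst, const int *a, const int *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```

Callers just call add_arrays() as usual. Whether the clones actually differ
depends on the optimization level (the loop is only vectorized when the
vectorizer is enabled), so the code size cost mentioned earlier should be
weighed against benchmark results.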
Best Regards, Thomas
>
>> ..Thomas
>>
>> /Robbin
>>
>>
>>
>> Have you done comparison builds and run any benchmarks?
>>
>> Thanks,
>> David
>>
>> Of course that means such builds would be specific to a CPU class, and
>> that will require build changes to make multiple flavors depending on the
>> CPU classes ...
>>
>> See gcc -mtune argument:
>> https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html
>>
>>
>> "
>> ‘sandybridge’
>>     Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3,
>>     SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL instruction set
>>     support.
>> ‘ivybridge’
>>     Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3,
>>     SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL, FSGSBASE, RDRND and
>>     F16C instruction set support.
>> ‘haswell’
>>     Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3,
>>     SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND,
>>     FMA, BMI, BMI2 and F16C instruction set support.
>> ‘broadwell’
>>     Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2,
>>     SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE,
>>     RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW instruction set
>>     support.
>> ‘skylake’
>>     Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3,
>>     SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND,
>>     FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC and
>>     XSAVES instruction set support.
>> ‘bonnell’
>>     Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3
>>     and SSSE3 instruction set support.
>> ‘silvermont’
>>     Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2,
>>     SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and RDRND instruction
>>     set support.
>> ‘knl’
>>     Intel Knight's Landing CPU with 64-bit extensions, MOVBE, MMX, SSE,
>>     SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL,
>>     FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, AVX512F,
>>     AVX512PF, AVX512ER and AVX512CD instruction set support.
>> ‘skylake-avx512’
>>     Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2,
>>     SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL,
>>     FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW,
>>     CLFLUSHOPT, XSAVEC, XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ and
>>     AVX512CD instruction set support.
>>
>> "
>>
>> Comments are welcome,
>> Laurent
>>
>>
>>