Upgrading gcc arch ?

Robbin Ehn robbin.ehn at oracle.com
Fri Oct 20 11:37:50 UTC 2017


On 2017-10-20 12:56, Thomas Stüfe wrote:
> 
> 
> On Fri, Oct 20, 2017 at 12:11 PM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
> 
>     On 2017-10-20 11:19, David Holmes wrote:
> 
>         bcc'ing the discuss list
> 
>         On 20/10/2017 6:19 PM, Laurent Bourgès wrote:
> 
>             Hi,
> 
>             I wonder if it is time to compile the C/C++ code for a more recent CPU
>             architecture (x86-64 is quite old: only SSE?) to benefit from the
>             performance optimizations offered by recent CPUs and compilers (AVX...).
> 
> 
>         The focus in hotspot is on JIT generated code which does take advantage of such optimizations based on the runtime CPU capabilities.
> 
>         Is there specific C code in the JDK that you think would benefit from them?
> 
> 
>     If there is specific code that performs much better with some newer features, we could use gcc's function multiversioning feature.
>     E.g.:
>     __attribute__((target_clones("sse4.2","sse3","default")))
>     void stream_function(...) {
>         ...
>     }
>     Negative impact on size, so as David says, benchmark first.
> 
> 
> But how would this help with gcc-specific optimizations? You can provide your own implementation, but I thought the idea was to let gcc do the optimization work via -mtune. We would still have one global -mtune setting for the compilation unit, right?

The target_clones attribute "is used to specify that a function be cloned into multiple versions compiled with different target options than specified on the command line."
In the example above, gcc generates three versions of the function. You can also name target architectures:
__attribute__((target_clones("arch=znver1","arch=skylake", "default")))

So you get:
[rehn at rehn-lt ~]$ nm a.out  | grep stream_function
00000000004009e0 T stream_function
0000000000400bc0 t stream_function.arch_skylake.1
0000000000400b90 t stream_function.arch_znver1.0
0000000000400bf0 i stream_function.ifunc
0000000000400bf0 W stream_function.resolver
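
For anyone who wants to try this, here is a minimal sketch of a complete translation unit. The parameter list, the loop body, the MULTIVERSION macro and the chosen target list ("avx2", "sse4.2", "default") are assumptions made for illustration; they are not taken from the JDK sources. The attribute is only applied where it is expected to work (gcc on x86-64 Linux with ifunc support); elsewhere the macro expands to nothing and a single default version is built.

```c
#include <stddef.h>

/* Hypothetical helper macro: enable target_clones only for gcc on
 * x86-64 Linux, where the ifunc-based dispatch is expected to be available.
 * On other toolchains it expands to nothing. */
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__) && defined(__linux__)
#define MULTIVERSION __attribute__((target_clones("avx2", "sse4.2", "default")))
#else
#define MULTIVERSION
#endif

/* Illustrative hot loop: a[i] = b[i] + scalar * c[i].
 * With the attribute active, gcc compiles one clone per listed target and
 * emits a resolver that selects one of them when the program is loaded. */
MULTIVERSION
void stream_function(double *a, const double *b, const double *c,
                     double scalar, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + scalar * c[i];
}
```

Callers just call stream_function(); the dispatch is invisible at the call site. Compiling this with gcc and running nm on the result should show the per-target clones plus the resolver, similar to the listing above (exact symbol names and addresses will differ).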

/Robbin

> 
> ..Thomas
> 
>     /Robbin
> 
> 
> 
>         Have you done comparison builds and run any benchmarks?
> 
>         Thanks,
>         David
> 
>             Of course that means such builds would be specific to a CPU class and that
>             will require build changes to make multiple flavors depending on the CPU
>             classes ...
> 
>             See gcc -mtune argument:
>             https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html
> 
>             "
>             ‘sandybridge’
>                   Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3,
>             SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL instruction set support.
>             ‘ivybridge’
>                   Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3,
>             SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL, FSGSBASE, RDRND and F16C
>             instruction set support.
>             ‘haswell’
>                   Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3,
>             SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND,
>             FMA, BMI, BMI2 and F16C instruction set support.
>             ‘broadwell’
>                   Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2,
>             SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE,
>             RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW instruction set
>             support.
>             ‘skylake’
>                   Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3,
>             SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND,
>             FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC and
>             XSAVES instruction set support.
>             ‘bonnell’
>                   Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3
>             and SSSE3 instruction set support.
>             ‘silvermont’
>                   Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2,
>             SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and RDRND instruction set
>             support.
>             ‘knl’
>                   Intel Knight's Landing CPU with 64-bit extensions, MOVBE, MMX, SSE,
>             SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL,
>             FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, AVX512F,
>             AVX512PF, AVX512ER and AVX512CD instruction set support.
>             ‘skylake-avx512’
>                   Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2,
>             SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE,
>             RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC,
>             XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ and AVX512CD instruction set
>             support.
> 
>             "
> 
>             Comments are welcome,
>             Laurent
> 
> 


More information about the hotspot-dev mailing list