Upgrading gcc arch ?
Robbin Ehn
robbin.ehn at oracle.com
Fri Oct 20 11:37:50 UTC 2017
On 2017-10-20 12:56, Thomas Stüfe wrote:
>
>
> On Fri, Oct 20, 2017 at 12:11 PM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>
> On 2017-10-20 11:19, David Holmes wrote:
>
> bcc'ing the discuss list
>
> On 20/10/2017 6:19 PM, Laurent Bourgès wrote:
>
> Hi,
>
> I wonder if it is time to compile the C/C++ code for a more recent CPU
> architecture (x86-64 is quite old: only SSE?) to benefit from the
> performance optimizations offered by recent CPUs and compilers (AVX...).
>
>
> The focus in hotspot is on JIT generated code which does take advantage of such optimizations based on the runtime CPU capabilities.
>
> Is there specific C code in the JDK that you think would benefit from them?
>
>
> If there is specific code that performs much better with some newer features, we could use gcc's function multiversioning feature.
> E.g.:
> __attribute__((target_clones("sse4.2","sse3","default")))
> void stream_function(...) {
> ...
> }
> Negative impact on size, so as David says, benchmark first.
>
>
> But how would this help with gcc-specific optimizations? You can provide your own implementation, but I thought the idea was to let gcc do the optimization work via -mtune. We would still have one global -mtune setting for the compilation unit, right?
The target_clones attribute "is used to specify that a function be cloned into multiple versions compiled with different target options than specified on the command line."
In the example above gcc generates three versions of the function. You can also select the clones by architecture:
__attribute__((target_clones("arch=znver1","arch=skylake", "default")))
So you get:
[rehn at rehn-lt ~]$ nm a.out | grep stream_function
00000000004009e0 T stream_function
0000000000400bc0 t stream_function.arch_skylake.1
0000000000400b90 t stream_function.arch_znver1.0
0000000000400bf0 i stream_function.ifunc
0000000000400bf0 W stream_function.resolver
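
For reference, here is a minimal sketch of what such a multiversioned function could look like. The function body, the parameter list and the MULTIVERSION macro are made up for illustration (the original stream_function is not shown in this thread), and the guard assumes gcc on x86-64 Linux, since target_clones relies on ifunc support. On other toolchains the macro expands to nothing and you get a single plain version.

```c
#include <stddef.h>

/* Assumption: gcc on x86-64 Linux. target_clones needs ifunc support,
 * so fall back to a single plain version everywhere else. */
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__) && defined(__linux__)
#define MULTIVERSION __attribute__((target_clones("avx2", "sse4.2", "default")))
#else
#define MULTIVERSION
#endif

/* Hypothetical streaming kernel: dst[i] = a[i] + 2 * b[i].
 * gcc compiles one clone per listed target and emits a resolver that
 * picks one of them at load time based on the running CPU. */
MULTIVERSION
void stream_function(float *dst, const float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + 2.0f * b[i];
    }
}
```

Callers just call stream_function() as usual; no dispatch code is needed at the call site. Running nm on the result should show one symbol per clone plus the resolver, similar to the listing above, although the exact symbol names may differ between gcc versions.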
/Robbin
>
> ..Thomas
>
> /Robbin
>
>
>
> Have you done comparison builds and run any benchmarks?
>
> Thanks,
> David
>
> Of course that means such builds would be specific to a CPU class and that
> will require build changes to make multiple flavors depending on the CPU
> classes ...
>
> See gcc -mtune argument:
> https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html
>
> "
> ‘sandybridge’
> Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3,
> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL instruction set support.
> ‘ivybridge’
> Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3,
> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL, FSGSBASE, RDRND and F16C
> instruction set support.
> ‘haswell’
> Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3,
> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND,
> FMA, BMI, BMI2 and F16C instruction set support.
> ‘broadwell’
> Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2,
> SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE,
> RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW instruction set
> support.
> ‘skylake’
> Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3,
> SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND,
> FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC and
> XSAVES instruction set support.
> ‘bonnell’
> Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3
> and SSSE3 instruction set support.
> ‘silvermont’
> Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2,
> SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and RDRND instruction set
> support.
> ‘knl’
> Intel Knight's Landing CPU with 64-bit extensions, MOVBE, MMX, SSE,
> SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL,
> FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, AVX512F,
> AVX512PF, AVX512ER and AVX512CD instruction set support.
> ‘skylake-avx512’
> Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2,
> SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE,
> RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC,
> XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ and AVX512CD instruction set
> support.
>
> "
>
> Comments are welcome,
> Laurent
>
>