Upgrading gcc arch ?

Thomas Stüfe thomas.stuefe at gmail.com
Fri Oct 20 17:12:49 UTC 2017


On Fri, Oct 20, 2017 at 1:37 PM, Robbin Ehn <robbin.ehn at oracle.com> wrote:

> On 2017-10-20 12:56, Thomas Stüfe wrote:
>
>>
>>
>> On Fri, Oct 20, 2017 at 12:11 PM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>>
>>     On 2017-10-20 11:19, David Holmes wrote:
>>
>>         bcc'ing the discuss list
>>
>>         On 20/10/2017 6:19 PM, Laurent Bourgès wrote:
>>
>>             Hi,
>>
>>             I wonder if it is time to compile C/C++ code for a more recent
>>             CPU architecture (x86-64 is quite old: only SSE?) to benefit
>>             from the performance optimizations offered by recent CPUs and
>>             compilers (AVX...).
>>
>>
>>         The focus in hotspot is on JIT generated code which does take
>>         advantage of such optimizations based on the runtime CPU
>>         capabilities.
>>
>>         Is there specific C code in the JDK that you think would benefit
>>         from them?
>>
>>
>>     If there is specific code that performs much better with some newer
>>     features, we could use the function multiversioning feature in gcc.
>>     E.g.:
>>     __attribute__((target_clones("sse4.2","sse3","default")))
>>     void stream_function(...) {
>>       ...
>>     }
>>     This has a negative impact on size, so as David says, benchmark first.
>>
>>
>> But how would this help with gcc specific optimizations ?  You can
>> provide your own implementation, but I thought the idea was to let gcc do
>> the optimization work via -mtune. We still would have one global mtune
>> setting for the compilation unit, right?
>>
>
> target_clones attribute "is used to specify that a function be cloned into
> multiple versions compiled with different target options than specified on
> the command line."
> In the example above, gcc generates 3 functions. You can also do:
> __attribute__((target_clones("arch=znver1","arch=skylake", "default")))
>
> So you get:
> [rehn at rehn-lt ~]$ nm a.out  | grep stream_function
> 00000000004009e0 T stream_function
> 0000000000400bc0 t stream_function.arch_skylake.1
> 0000000000400b90 t stream_function.arch_znver1.0
> 0000000000400bf0 i stream_function.ifunc
> 0000000000400bf0 W stream_function.resolver
>
> /Robbin
>
>
Very interesting, thanks for the pointer. I did not know that was possible.
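
For readers following along, here is a minimal sketch of the function
multiversioning Robbin describes. It assumes GCC 6 or newer on x86-64 Linux
with glibc (target_clones relies on ifunc support). The function name
stream_add, the MULTIVERSIONED macro and the chosen targets are hypothetical
and are not taken from the JDK sources. On other toolchains the macro expands
to nothing and a single default version is built.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: target_clones is only enabled for GCC >= 6 on x86-64 Linux
   with glibc, because the dispatch is done through an ifunc resolver.
   Elsewhere the macro is empty and only one version is compiled. */
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ >= 6) && \
    defined(__x86_64__) && defined(__linux__) && defined(__GLIBC__)
#define MULTIVERSIONED \
    __attribute__((target_clones("avx2", "sse4.2", "default")))
#else
#define MULTIVERSIONED
#endif

/* Hypothetical stand-in for stream_function: a simple loop that the
   compiler is free to vectorize differently for each clone. */
MULTIVERSIONED
void stream_add(float *dst, const float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}

/* Small self-check: callers use the plain name, and the clone (if any)
   is selected at load time based on the CPU. Returns 1 on success. */
int stream_add_selfcheck(void) {
    float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float out[4] = {0.0f, 0.0f, 0.0f, 0.0f};

    stream_add(out, a, b, 4);

    assert(out[0] == 11.0f);
    assert(out[1] == 22.0f);
    assert(out[2] == 33.0f);
    assert(out[3] == 44.0f);
    return 1;
}
```

Regarding the -mtune question: with this attribute each clone is compiled
with its own target options, so the command-line setting only governs the
"default" version. Calling stream_add_selfcheck() from a main() and running
nm on the result should list the clones and the resolver, much like the
output quoted above.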

Best Regards, Thomas


>
>> ..Thomas
>>
>>     /Robbin
>>
>>
>>
>>         Have you done comparison builds and run any benchmarks?
>>
>>         Thanks,
>>         David
>>
>>             Of course that means such builds would be specific to a CPU
>>             class and that will require build changes to make multiple
>>             flavors depending on the CPU classes ...
>>
>>             See gcc -mtune argument:
>>             https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/x86-Options.html
>>
>>
>>             "
>>             ‘sandybridge’
>>                   Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE,
>>                   SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and
>>                   PCLMUL instruction set support.
>>             ‘ivybridge’
>>                   Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE,
>>                   SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES,
>>                   PCLMUL, FSGSBASE, RDRND and F16C instruction set support.
>>             ‘haswell’
>>                   Intel Haswell CPU with 64-bit extensions, MOVBE, MMX,
>>                   SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX,
>>                   AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2 and
>>                   F16C instruction set support.
>>             ‘broadwell’
>>                   Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX,
>>                   SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX,
>>                   AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C,
>>                   RDSEED, ADCX and PREFETCHW instruction set support.
>>             ‘skylake’
>>                   Intel Skylake CPU with 64-bit extensions, MOVBE, MMX,
>>                   SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX,
>>                   AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C,
>>                   RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC and XSAVES
>>                   instruction set support.
>>             ‘bonnell’
>>                   Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX,
>>                   SSE, SSE2, SSE3 and SSSE3 instruction set support.
>>             ‘silvermont’
>>                   Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX,
>>                   SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES,
>>                   PCLMUL and RDRND instruction set support.
>>             ‘knl’
>>                   Intel Knight's Landing CPU with 64-bit extensions, MOVBE,
>>                   MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX,
>>                   AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C,
>>                   RDSEED, ADCX, PREFETCHW, AVX512F, AVX512PF, AVX512ER and
>>                   AVX512CD instruction set support.
>>             ‘skylake-avx512’
>>                   Intel Skylake Server CPU with 64-bit extensions, MOVBE,
>>                   MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU,
>>                   AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2,
>>                   F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES,
>>                   AVX512F, AVX512VL, AVX512BW, AVX512DQ and AVX512CD
>>                   instruction set support.
>>
>>             "
>>
>>             Comments are welcome,
>>             Laurent
>>
>>
>>

