RFR: 8301869: Regression ~14% in J2dBench-bimg_misc-* in 21-b5 only on linux-aarch64 [v2]

Tue Feb 28 20:48:08 UTC 2023

On Tue, 28 Feb 2023 13:57:21 GMT, Erik Joelsson <erikj at openjdk.org> wrote:

>> Thanks @prrace for your inputs.
>> I don't think -fPIC is implied by -fpic. @erikj79 please clarify.
>> 
>> Difference between -fpic and -fPIC:
>> 1) With -fPIC the global offset table (GOT) has unlimited capacity, while with -fpic some platforms limit its size. For aarch64 the limit is 28k, but on x86 the GOT size is unlimited irrespective of -fPIC or -fpic. This is already captured.
>> 2) Also, it looks like using -fpic instead of -fPIC may generate smaller/faster code, as mentioned at http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html. In our case a 28k GOT seems to suffice, and we get smaller/faster code for libawt on aarch64. This in turn looks to be related to the -msmall-data and -mlarge-data flags: -fpic implies -msmall-data and -fPIC implies -mlarge-data when generating code for shared libraries. The -msmall-data description also says data is placed nearer and is more easily accessible than with -mlarge-data (https://man7.org/linux/man-pages/man1/gcc.1.html).
>> 
>> Also, I can make this change specific to aarch64, since we are seeing its effect only on aarch64.
>
> I think you have read more about the differences between -fpic and -fPIC than I have, so nothing I can add here.
> 
> Making this option only on aarch64 seems like a reasonable idea if that's where we see the effect.

> I don't think -fPIC is implied by -fpic. @erikj79 please clarify.

That isn't what I said.
I said they do very similar things, based on the references I read,
with the one specific difference being that lower-case "-fpic" can be limited in size.

Which is why I wrote:
> So far as I can tell the -fpic you are using just limits the size of the global offset table that -fPIC will generate.

My comment
> Since we had an unlimited offset table before, then -fpic won't change anything.

I was trying to point out that if your eval + fix is explained by saying we no longer
have "-fpic", then that's wrong. We never had -fpic.
We didn't change compiler options, but clearly performance changed.

The man page you just linked does explain how the mechanism -fpic uses,
which limits size, could result in better performance:

"When generating code for shared libraries, -fpic implies
           -msmall-data and -fPIC implies -mlarge-data."

"When -msmall-data is used,
           objects 8 bytes long or smaller are placed in a small data
           area (the ".sdata" and ".sbss" sections) and are accessed via
           16-bit relocations off of the $gp register."

So on some architectures it can be faster.
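
One way to check this on a given target, rather than argue from the man page, is to
compile the same small translation unit both ways and diff the generated assembly.
The sketch below is only an illustration: the file name got_demo.c and the names
pixel_count, pixel_sum, add_pixel and average are made up for this example and are
not taken from libawt. It assumes gcc; whether the two outputs differ at all, and
how, depends on the target and the compiler version.

```c
/* got_demo.c: hypothetical file name, used only for this illustration.
 *
 * Generate assembly as position-independent code both ways and compare:
 *   gcc -O2 -fpic -S got_demo.c -o got_demo_fpic.s
 *   gcc -O2 -fPIC -S got_demo.c -o got_demo_fPIC.s
 *   diff got_demo_fpic.s got_demo_fPIC.s
 */

/* Globals with default visibility. In position-independent code their
 * addresses are typically looked up through the global offset table,
 * so these accesses are where -fpic and -fPIC may generate different code. */
int pixel_count = 0;
int pixel_sum = 0;

/* Each call reads and writes both globals. */
void add_pixel(int value) {
    pixel_count += 1;
    pixel_sum += value;
}

/* Integer average of the values added so far, or 0 if none were added. */
int average(void) {
    return pixel_count ? pixel_sum / pixel_count : 0;
}
```

If the diff is empty on a platform, the two flags are equivalent there for this code,
and any performance difference has to come from somewhere else.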

BUT since we were using -fPIC for a long time, I am not seeing how you've
explained WHY we now need to explicitly use -fpic to get the performance back.
In other words, "we mysteriously lost performance, but I've found an option that helps".

So possibly, if you applied -fpic to the JDK before the regression it would get even faster ...

-------------

PR: https://git.openjdk.org/jdk/pull/12761