[aarch64-port-dev ] Caller registers protection inside loop hurts performance
Andrew Haley
aph at redhat.com
Thu Aug 16 10:14:52 UTC 2018
On 08/16/2018 03:25 AM, Patrick Zhang wrote:
> About fewer-registers-protected, here is the assembly code diff: aarch64 has 5 loads + 5 stores, while x86 (same code base) has half as many. The loop count is 4096 by default, managed by the string density bench.
Sure, but don't make too many assumptions on the basis of one
sample. If the compiler is running low on registers then it'll
spill. In general you're better off with more registers, but you can
get lucky.
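
To illustrate the kind of loop where this matters, here is a minimal
sketch; it is not taken from the string density bench. The class
SpillSketch and the helper opaque() are hypothetical names. The loop
keeps several values live across a call on every iteration. If the call
is not inlined, any of those values held in caller-saved registers must
be saved before the call and reloaded after it. Whether a particular
JIT emits those loads and stores depends on its inlining decisions and
register allocator, so treat this as an illustration of the pattern
only.

```java
public class SpillSketch {
    // Hypothetical helper: stands in for a call that the JIT does not inline.
    // (A real JIT may well inline something this small.)
    static int opaque(int x) {
        return x + 1;
    }

    // Six accumulators plus the index and array reference are live across
    // the call in every iteration. Values kept in caller-saved registers
    // would need a store before the call and a load after it.
    static long manyLive(int[] data) {
        long a = 0, b = 0, c = 0, d = 0, e = 0, f = 0;
        for (int i = 0; i < data.length; i++) {
            int v = opaque(data[i]);
            a += v;
            b += 2L * v;
            c += 3L * v;
            d += 4L * v;
            e += 5L * v;
            f += 6L * v;
        }
        return a + b + c + d + e + f;
    }
}
```

Comparing the generated code for a loop like this on two architectures
shows one data point per architecture, which is why a single sample
says little about the register allocator in general.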
--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671