[aarch64-port-dev ] Caller registers protection inside loop hurts performance

Andrew Haley aph at redhat.com
Thu Aug 16 10:14:52 UTC 2018


On 08/16/2018 03:25 AM, Patrick Zhang wrote:
> About fewer-registers-protected, here is the assembly code diff, aarch64 has 5 load + 5 store, while x86 (same code base) has half. The loop count is 4096 by default, managed by string density bench.

Sure, but don't make too many assumptions on the basis of one
sample. If the compiler is running low on registers then it'll
spill. In general you're better off with more registers, but you can
get lucky.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

