[aarch64-port-dev ] Optimized memcpy() for Cortex

Daniel Stewart daniel.stewart at linaro.org
Tue Aug 8 15:54:58 UTC 2017


I have been looking into the copy_memory() routine (which is used by both
disjoint and conjoint copies) to see whether interleaving the loads and
stores (as described in the document you mention) performs better than
issuing all the loads followed by all the stores, as is currently
implemented.
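
To make the two orderings concrete, here is a rough C sketch. It is not the
HotSpot stub code: copy_batched() and copy_interleaved() are hypothetical
names, the 4-word unroll factor is an arbitrary choice, and both assume
non-overlapping buffers and a word count that is a multiple of 4. A C
compiler is also free to reorder these accesses, so this only illustrates
the intended instruction order, not what actually gets emitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only; names and unroll factor are assumptions.
 * Both functions require: words % 4 == 0, and dst/src must not overlap. */

/* "All loads, then all stores": four loads issued back to back,
 * followed by four stores. */
static void copy_batched(uint64_t *dst, const uint64_t *src, size_t words) {
    for (size_t i = 0; i < words; i += 4) {
        uint64_t t0 = src[i];
        uint64_t t1 = src[i + 1];
        uint64_t t2 = src[i + 2];
        uint64_t t3 = src[i + 3];
        dst[i]     = t0;
        dst[i + 1] = t1;
        dst[i + 2] = t2;
        dst[i + 3] = t3;
    }
}

/* "Interleaved": each pair of loads is followed immediately by the
 * matching pair of stores, loosely mirroring an LDP/STP sequence. */
static void copy_interleaved(uint64_t *dst, const uint64_t *src, size_t words) {
    for (size_t i = 0; i < words; i += 4) {
        uint64_t t0 = src[i];
        uint64_t t1 = src[i + 1];
        dst[i]     = t0;
        dst[i + 1] = t1;
        t0 = src[i + 2];
        t1 = src[i + 3];
        dst[i + 2] = t0;
        dst[i + 3] = t1;
    }
}
```

Both versions produce the same result for disjoint buffers; the only
difference is the order in which the memory operations appear. The
interleaved form stores before it has finished loading a block, which is
one reason to start with disjoint copies only.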

However, even when I try this only for disjoint copies, I run into a
problem: any change I make to the stubGenerator_aarch64.cpp file for
JDK 9 or 10 results in compile errors.

Is there some magic that someone new to OpenJDK, like myself, is missing
when trying to modify this file?

Daniel

On Fri, Aug 4, 2017 at 11:29 AM, Andrew Haley <aph at redhat.com> wrote:

> The Cortex®-A57/A72 processor manual contains this gem:
>
> -----------------------------------------------------------------
> The Cortex-A57 processor includes separate load and store pipelines,
> which allow it to execute one load μop and one store μop every
> cycle.
>
> The following example shows a recommended instruction sequence for a
> long memory copy in AArch32 state:
>
> Loop_start:
>     SUBS    r2,r2,#64
>     LDRD    r3,r4,[r1,#0]
>     STRD    r3,r4,[r0,#0]
>     LDRD    r3,r4,[r1,#8]
>     STRD    r3,r4,[r0,#8]
>     LDRD    r3,r4,[r1,#16]
>     STRD    r3,r4,[r0,#16]
>     LDRD    r3,r4,[r1,#24]
>     STRD    r3,r4,[r0,#24]
>     LDRD    r3,r4,[r1,#32]
>     STRD    r3,r4,[r0,#32]
>     LDRD    r3,r4,[r1,#40]
>     STRD    r3,r4,[r0,#40]
>     LDRD    r3,r4,[r1,#48]
>     STRD    r3,r4,[r0,#48]
>     LDRD    r3,r4,[r1,#56]
>     STRD    r3,r4,[r0,#56]
>     ADD     r1,r1,#64
>     ADD     r0,r0,#64
>     BGT     Loop_start
>
> A recommended copy routine for AArch64 would look similar to the
> sequence above, but would use LDP/STP instructions.
> -----------------------------------------------------------------
>
> Our copy routines don't do this.  I don't know if it would help.
>
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>



-- 
Daniel Stewart

