Integrated: 8256488: [aarch64] Use ldpq/stpq instead of ld4/st4 for small copies in StubGenerator::copy_memory

Evgeny Astigeevich github.com+42899633+eastig at openjdk.java.net
Thu Nov 26 16:13:58 UTC 2020


On Wed, 18 Nov 2020 14:10:48 GMT, Evgeny Astigeevich <github.com+42899633+eastig at openjdk.org> wrote:

> This patch fixes performance regressions of 27%-48% in small arraycopies on Graviton2 (Neoverse N1) when UseSIMDForMemoryOps is enabled. For such copies, ldpq/stpq are now used instead of ld4/st4.
> This follows the recommendation in the Arm optimization guides, including the one for Neoverse N1: use discrete, non-writeback forms of load and store instructions while interleaving them.
> 
> The patch passed jtreg tier1-2 and all gtest tests with a linux-aarch64-server-release build and UseSIMDForMemoryOps enabled.
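
For readers who want to observe the effect described above from Java code, here is a rough timing sketch, not the benchmark used for the quoted numbers. The class name, the helper timeCopies, the copy sizes, and the iteration counts are all assumptions chosen for illustration. The message does not say which sizes count as "small", and a plain timing loop is far less reliable than a proper harness such as JMH.

```java
public class SmallArrayCopySketch {
    // Hypothetical helper: copies the first `len` bytes of src into dst
    // `iterations` times and returns the elapsed wall-clock time in nanoseconds.
    static long timeCopies(byte[] src, byte[] dst, int len, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            System.arraycopy(src, 0, dst, 0, len);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        byte[] src = new byte[128];
        byte[] dst = new byte[128];
        for (int i = 0; i < src.length; i++) {
            src[i] = (byte) i;
        }
        // Assumed sizes; the original message does not list the affected lengths.
        int[] sizes = {32, 48, 64, 80, 96};
        for (int len : sizes) {
            timeCopies(src, dst, len, 1_000_000);            // warm-up so the JIT compiles the loop
            long ns = timeCopies(src, dst, len, 10_000_000); // measured run
            System.out.printf("len=%d: %.2f ns/copy%n", len, ns / 10_000_000.0);
        }
    }
}
```

On an AArch64 JVM, the sketch could be run once with -XX:+UseSIMDForMemoryOps and once without, on builds before and after the changeset, to compare the per-copy times. Whether these particular calls reach the StubGenerator::copy_memory path touched by the patch depends on how the JIT compiles them, so the numbers are only indicative.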

This pull request has now been integrated.

Changeset: 6e006223
Author:    Evgeny Astigeevich <eastig at amazon.com>
Committer: Volker Simonis <simonis at openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/6e006223
Stats:     4 lines in 1 file changed: 2 ins; 0 del; 2 mod

8256488: [aarch64] Use ldpq/stpq instead of ld4/st4 for small copies in StubGenerator::copy_memory

Reviewed-by: simonis

-------------

PR: https://git.openjdk.java.net/jdk/pull/1293


More information about the hotspot-compiler-dev mailing list