RFR: 8342601: AArch64: Micro-optimize bit shift in copy_memory

Sat Oct 19 00:03:36 UTC 2024

On Fri, 18 Oct 2024 18:35:02 GMT, Chad Rakoczy <duke at openjdk.org> wrote:

> [JDK-8342601](https://bugs.openjdk.org/browse/JDK-8342601)
> 
> Fix minor inefficiency in `copy_memory` by adding check before doing bit shift to see if we are able to do a move instruction instead. Change is low risk because of the low complexity of the change
> 
> Ran array copy and tier 1 on aarch64 machine
> 
> Test summary
> ==============================
>    TEST                                              TOTAL  PASS  FAIL ERROR   
>    jtreg:test/hotspot/jtreg/compiler/arraycopy          49    49     0     0   
> ==============================
> 
> ==============================
> Test summary
> ==============================
>    TEST                                              TOTAL  PASS  FAIL ERROR   
>    jtreg:test/hotspot/jtreg:tier1                     2591  2591     0     0   
>    jtreg:test/jdk:tier1                               2436  2436     0     0   
>    jtreg:test/langtools:tier1                         4577  4577     0     0   
>    jtreg:test/jaxp:tier1                                 0     0     0     0   
>    jtreg:test/lib-test:tier1                            34    34     0     0   
> ==============================

A general principle for our work here:  Keep it simple.  Simplicity is maintainability.  Only make it more complex if you can demonstrate a practical win.  Otherwise, keep the hands away from the keyboard.

It probably never ever makes a difference, to replace shift-by-0 with move, likewise an add-with-0 or an xor-with-0.  On most RISC machines for the last 4 decades, mov is just a macro for such an ALU operation, and all of them take the same number of cycles and go through the same circuitry (with different mode lines enabled).

In other words, the cost model justifying this supposed improvement is probably about a half century out of date.  Maybe an expert on AARCH64 can correct me on this point?

So I'm against this change as unnecessary, unless there is a performance test that shows a significant benefit.  Changes like this only make the sources harder to read and maintain.

If we think multiple people will have an overwhelming urge to tweak the code, even after the tweak is proven unnecessary, then add a comment to future maintainers explaining why there is no need here for a tweak.  A semi-useless comment is easier to maintain than a useless if/then/else.

Dean's suggestion is somewhat reasonable, to have the assembler desugar shift-by-zero to mov.  The only effect of that would be to make the disassembly slightly easier to read, for less experienced readers of disassembly.  We did some things like that with the SPARC port, but did so under new assembler API points (what used to be called pseudo-instructions).  For example, we had a mov-or-nop instruction, which nopped itself if the source and destination of the move were the same register.  There is huge benefit to having the assembler API do exactly what it says, and not second-guess the user (swapping other instructions behind the user's back).  Using that process, you'd introduce a shift-or-mov instruction, explicitly. Which is obviously not worth it. So, again, I'd say no change is needed here, not even Dean's.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21589#issuecomment-2423387296