RFR: 8320379: C2: Sort spilling/unspilling sequence for better ld/st merging into ldp/stp on AArch64

Fei Gao fgao at openjdk.org
Tue Nov 21 07:22:41 UTC 2023


The macro-assembler on AArch64 can merge adjacent loads or stores into ldp/stp.[[1]](https://github.com/openjdk/jdk/blob/a95062b39a431b4937ab6e9e73de4d2b8ea1ac49/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L2079)

For example, it can merge:

str     w20, [sp, #16]
str     w10, [sp, #20]

into

stp     w20, w10, [sp, #16]
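
As a rough illustration of what "adjacent" means here, the following is a minimal sketch, not the HotSpot code. The `Store` class and `isAdjacent` method are hypothetical names introduced only for this example, and the real check in `macroAssembler_aarch64.cpp` has further conditions that this sketch ignores.

```java
// Hypothetical model, not HotSpot code: one store is (register, size, base, offset).
class StoreMergeCheck {
    static final class Store {
        final String reg;   // e.g. "w20"
        final int size;     // access size in bytes: 4 for w registers, 8 for x registers
        final String base;  // base register, e.g. "sp"
        final int offset;   // byte offset from the base

        Store(String reg, int size, String base, int offset) {
            this.reg = reg;
            this.size = size;
            this.base = base;
            this.offset = offset;
        }
    }

    // Simplified condition (an assumption for illustration): same base,
    // same size, and the second store starts exactly where the first ends.
    static boolean isAdjacent(Store first, Store second) {
        return first.base.equals(second.base)
            && first.size == second.size
            && first.offset + first.size == second.offset;
    }

    public static void main(String[] args) {
        Store a = new Store("w20", 4, "sp", 16);
        Store b = new Store("w10", 4, "sp", 20);
        Store c = new Store("x19", 8, "sp", 24);
        System.out.println(isAdjacent(a, b)); // true: 16 + 4 == 20, same size
        System.out.println(isAdjacent(a, c)); // false: sizes differ (4 vs 8)
    }
}
```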


But C2 may generate a sequence like:

str     x21, [sp, #8]
str     w20, [sp, #16]
str     x19, [sp, #24] <---
str     w10, [sp, #20] <--- Before sorting
str     x11, [sp, #40]
str     w13, [sp, #48]
str     x16, [sp, #56]

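The macro-assembler can't merge loads or stores that are not next to each other in the instruction stream, even when their stack slots are adjacent, as with `w20` at offset 16 and `w10` at offset 20 above.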

The patch sorts the spilling or unspilling sequence by offset during the instruction scheduling and bundling phase. After sorting, we get a new sequence:

str     x21, [sp, #8]
str     w20, [sp, #16]
str     w10, [sp, #20] <---
str     x19, [sp, #24] <--- After sorting
str     x11, [sp, #40]
str     w13, [sp, #48]
str     x16, [sp, #56]
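
The reordering itself can be pictured with a small standalone sketch. This is only an illustration under assumptions: it treats every store as an independent spill to a distinct stack slot (so reordering is safe), and the `Spill` class and `sortByOffset` method are hypothetical names. The patch does this work inside C2's scheduling and bundling phase, not as a separate sort like this one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical model, not C2 code: a spill is a register stored at an sp offset.
class SpillSortSketch {
    static final class Spill {
        final String reg;
        final int offset;

        Spill(String reg, int offset) {
            this.reg = reg;
            this.offset = offset;
        }

        @Override
        public String toString() {
            return "str     " + reg + ", [sp, #" + offset + "]";
        }
    }

    // Returns a copy ordered by ascending stack offset; the input is left untouched.
    static List<Spill> sortByOffset(List<Spill> spills) {
        List<Spill> sorted = new ArrayList<>(spills);
        sorted.sort(Comparator.comparingInt(s -> s.offset));
        return sorted;
    }

    public static void main(String[] args) {
        // The "before sorting" sequence from this mail.
        List<Spill> before = Arrays.asList(
            new Spill("x21", 8), new Spill("w20", 16), new Spill("x19", 24),
            new Spill("w10", 20), new Spill("x11", 40), new Spill("w13", 48),
            new Spill("x16", 56));
        for (Spill s : sortByOffset(before)) {
            System.out.println(s); // w10 (#20) now precedes x19 (#24)
        }
    }
}
```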


Then the macro-assembler can merge the now-adjacent pair:

str     x21, [sp, #8]
stp     w20, w10, [sp, #16] <--- Merged
str     x19, [sp, #24]
str     x11, [sp, #40]
str     w13, [sp, #48]
str     x16, [sp, #56]
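
To see why instruction order matters, here is a minimal sketch of a peephole-style merge that only looks at the immediately preceding store. It is a simplified model under assumptions, not the macro-assembler's logic: all stores use `sp` as base, and `Store` and `emit` are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not HotSpot code: merge a store with the one just before it.
class PairMergeSketch {
    static final class Store {
        final String reg;
        final int size;    // bytes: 4 for w registers, 8 for x registers
        final int offset;  // byte offset from sp

        Store(String reg, int size, int offset) {
            this.reg = reg;
            this.size = size;
            this.offset = offset;
        }
    }

    static String single(Store s) {
        return "str     " + s.reg + ", [sp, #" + s.offset + "]";
    }

    // Emits stp when the current store has the same size as the pending one
    // and starts exactly where the pending one ends; otherwise emits str.
    static List<String> emit(List<Store> stores) {
        List<String> out = new ArrayList<>();
        Store pending = null;
        for (Store cur : stores) {
            if (pending != null && pending.size == cur.size
                    && pending.offset + pending.size == cur.offset) {
                out.add("stp     " + pending.reg + ", " + cur.reg
                        + ", [sp, #" + pending.offset + "]");
                pending = null; // a merged pair is not merged again
            } else {
                if (pending != null) {
                    out.add(single(pending));
                }
                pending = cur;
            }
        }
        if (pending != null) {
            out.add(single(pending));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Store> unsorted = Arrays.asList(
            new Store("x21", 8, 8), new Store("w20", 4, 16), new Store("x19", 8, 24),
            new Store("w10", 4, 20), new Store("x11", 8, 40), new Store("w13", 4, 48),
            new Store("x16", 8, 56));
        List<Store> sorted = Arrays.asList(
            new Store("x21", 8, 8), new Store("w20", 4, 16), new Store("w10", 4, 20),
            new Store("x19", 8, 24), new Store("x11", 8, 40), new Store("w13", 4, 48),
            new Store("x16", 8, 56));
        System.out.println(emit(unsorted).size()); // 7: nothing merged
        System.out.println(emit(sorted).size());   // 6: w20 and w10 become one stp
    }
}
```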


To evaluate the patch, we ran `HelloWorld.java`

public class HelloWorld {
    public static void main(String [] args) {
        System.out.println("Hello World!");
    }
}

with `java -Xcomp -XX:-TieredCompilation HelloWorld`.

Before the patch, the macro-assembler performed 3688 ld/st merges. After the patch, it performs 3871, an increase of 183 merges (~5%).

Tested tier1~3 on x86 and AArch64.

-------------

Commit messages:
 - 8320379: C2: Sort spilling/unspilling sequence for better ld/st merging into ldp/stp on AArch64

Changes: https://git.openjdk.org/jdk/pull/16754/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16754&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8320379
  Stats: 41 lines in 1 file changed: 38 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/16754.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16754/head:pull/16754

PR: https://git.openjdk.org/jdk/pull/16754

