RFR: 8370473: C2: Better Alignment of Vector Spill Slots
Martin Doerr
mdoerr at openjdk.org
Fri Nov 7 11:29:05 UTC 2025
On Fri, 24 Oct 2025 07:36:57 GMT, Richard Reingruber <rrich at openjdk.org> wrote:
> With this change C2 will allocate spill slots for vectors with sp offsets aligned to the size of the vectors. The maximum alignment is StackAlignmentInBytes.
>
> It also updates comments that were never adapted to describe how register allocation works for sizes larger than 64 bit.
>
> The change helps to produce better spill code on AARCH64 and PPC64 where an additional add instruction is emitted if the offset of a vector un-/spill is not aligned.
>
> The change is more of a cleanup than an optimization. In most cases the sp offsets will already be properly aligned.
> Unaligned offsets can only be generated with incoming stack arguments. Even then, alignment padding is only added if vector registers larger than 64 bit are used.
>
> So the costs are effectively zero, especially because the extra padding won't enlarge the frame: only virtual registers are allocated, and these are mapped to the caller frame (see `pad0` in the [diagram](https://github.com/openjdk/jdk/blob/92e380c59c2498b1bc94e26658b07b383deae59a/src/hotspot/cpu/aarch64/aarch64.ad#L3829)).
>
> There's a risk, though, that with the extra virtual registers allocated for `pad0` the limit of registers a `RegMask` can represent is reached (this occurs with excessive spilling). If this happens, the compilation would fail. It could be retried with a smaller alignment for vector spilling, though. I haven't implemented that as I thought the risk is negligible.
>
> Note that the sp offset of the accesses should be aligned rather than the effective address. So it could even be argued that the maximum alignment could be higher than StackAlignmentInBytes.
>
> ##### Testing with fastdebug builds on AARCH64 and PPC64:
>
> hotspot_vector_1
> hotspot_vector_2
> jdk_vector
> jdk_vector_sanity
>
> ##### The change passed our CI testing:
> Tier 1-4 of hotspot and jdk. All of langtools and jaxp. Renaissance Suite and SAP specific tests.
> Testing was done on the main platforms and also on Linux/PPC64le and AIX.
>
> C2 compilation of `jdk.internal.vm.vector.VectorSupport::rearrangeOp` has unaligned spill offsets. It is covered by the following tests:
>
> compiler/vectorapi/VectorRearrangeTest.java
> jdk/incubator/vector/Byte128VectorLoadStoreTests.java
> jdk/incubator/vector/Double256VectorLoadStoreTests.java
> jdk/incubator/vector/Float128VectorTests.java
> jdk/incubator/vector/Long256VectorLoadStoreTests.java
> jdk/incubator/vector/Short128VectorLoadStoreTests.java
> jdk/incubator/vector/Vector64ConversionTests.java
This looks like a nice improvement! Thanks for doing it! I only have minor comments.
src/hotspot/cpu/ppc/ppc.ad line 1800:
> 1798: int src_offset = ra_->reg2offset(src_lo);
> 1799: int dst_offset = ra_->reg2offset(dst_lo);
> 1800: DEBUG_ONLY(int algm = MIN2(RegMask::num_registers(ideal_reg()), (int)Matcher::stack_alignment_in_slots()) * VMRegImpl::stack_slot_size);
This must always be 16, but ok. (Instructions can't encode other offsets.) You can keep it this way.
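For illustration, the alignment expression at line 1800 can be sketched as standalone C++. This is only a sketch under assumptions: the constants stand in for `VMRegImpl::stack_slot_size` and `StackAlignmentInBytes` (assumed here to be 4 and 16), and the helper names `spill_alignment_in_bytes` and `align_up_offset` are hypothetical, not HotSpot functions.

```cpp
#include <algorithm>
#include <cassert>

// Assumed values for illustration only; HotSpot defines these per platform.
constexpr int kStackSlotSize = 4;           // bytes per stack slot
constexpr int kStackAlignmentInBytes = 16;  // assumed StackAlignmentInBytes

// Hypothetical helper: sp-offset alignment (in bytes) for a value that
// occupies num_slots stack slots, capped at the stack alignment.
constexpr int spill_alignment_in_bytes(int num_slots) {
  const int stack_alignment_in_slots = kStackAlignmentInBytes / kStackSlotSize;
  return std::min(num_slots, stack_alignment_in_slots) * kStackSlotSize;
}

// Hypothetical helper: round an sp offset up to the next multiple of
// alignment. The alignment must be a power of two.
constexpr int align_up_offset(int offset, int alignment) {
  return (offset + alignment - 1) & ~(alignment - 1);
}
```

Under these assumptions a 128-bit vector (4 slots) yields 16, which matches the remark that the value is always 16 on PPC64, and wider vectors are capped at 16 as well.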
src/hotspot/cpu/ppc/ppc.ad line 1839:
> 1837: } else {
> 1838: st->print("%-7s %s, R1_SP, %d \t// vector spill copy", "ADDI", Matcher::regName[src_lo], dst_offset);
> 1839: st->print("%-7s %s, [R1_SP] \t// vector spill copy", "STXVD2X", Matcher::regName[src_lo]);
The output looks wrong. We write to [R0], not [R1_SP]. Better use one line?
src/hotspot/cpu/ppc/ppc.ad line 1865:
> 1863: } else {
> 1864: st->print("%-7s %s, R1_SP, %d \t// vector spill copy", "ADDI", Matcher::regName[src_lo], src_offset);
> 1865: st->print("%-7s %s, [R1_SP] \t// vector spill copy", "LXVD2X", Matcher::regName[dst_lo]);
As above.
src/hotspot/share/opto/chaitin.hpp line 146:
> 144: private:
> 145: // Number of registers this live range uses when it colors
> 146: uint16_t _num_regs; // byte size of the value divided by 4
Maybe "divided by slot size which is 4"?
-------------
PR Review: https://git.openjdk.org/jdk/pull/27969#pullrequestreview-3433146180
PR Review Comment: https://git.openjdk.org/jdk/pull/27969#discussion_r2502878330
PR Review Comment: https://git.openjdk.org/jdk/pull/27969#discussion_r2502917723
PR Review Comment: https://git.openjdk.org/jdk/pull/27969#discussion_r2502920655
PR Review Comment: https://git.openjdk.org/jdk/pull/27969#discussion_r2502925601