RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers

Joshua Zhu jzhu at openjdk.org
Fri Feb 23 08:15:13 UTC 2024


Currently, on AArch64 the ZGC C2 load barrier stub spills every live register at its full width, regardless of how much of the register is actually live.
Since the size of an SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits,
even a single live floating-point value may occupy up to 2048 bits (256 bytes) of stack.
Hence I would like to introduce this change on AArch64: take the length of live registers into consideration in the ZGC C2 load barrier stub.
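
To make the stack cost concrete, here is a minimal C++ sketch of the per-register spill size as a function of the live length. It is not code from the patch: the names LiveLength and spill_bytes are hypothetical, and it assumes one register is spilled on its own and padded to the 16-byte stack alignment that AArch64 requires.

```cpp
#include <cstddef>

// Hypothetical classification of how much of a vector register is live.
enum class LiveLength { Scalar64, Neon128, SveFull };

// Bytes of stack needed to spill one register of the given live length,
// rounded up to a multiple of 16 bytes (AArch64 stack alignment).
// sve_vl_bits is the SVE vector length of the machine, in bits.
constexpr std::size_t spill_bytes(LiveLength len, std::size_t sve_vl_bits) {
  std::size_t raw = 0;
  switch (len) {
    case LiveLength::Scalar64: raw = 8; break;                // D register
    case LiveLength::Neon128:  raw = 16; break;               // Q register
    case LiveLength::SveFull:  raw = sve_vl_bits / 8; break;  // Z register
  }
  return (raw + 15) / 16 * 16;
}

// A full 2048-bit Z register needs 256 bytes (0x100); a live double needs 16.
static_assert(spill_bytes(LiveLength::SveFull, 2048) == 256, "full SVE spill");
static_assert(spill_bytes(LiveLength::Scalar64, 2048) == 16, "scalar spill");
```

Under these assumptions, spilling only the live 64 bits on a 2048-bit machine uses 16 bytes of stack instead of 256.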

In a floating-point case on a machine with 2048-bit SVE, the following ZLoadBarrierStubC2


  ......
  0x0000ffff684cfad8:   stp     x15, x18, [sp, #80]
  0x0000ffff684cfadc:   sub     sp, sp, #0x100
  0x0000ffff684cfae0:   str     z16, [sp]
  0x0000ffff684cfae4:   add     x1, x13, #0x10
  0x0000ffff684cfae8:   mov     x0, x16
 ;; 0xFFFF803F5414
  0x0000ffff684cfaec:   mov     x8, #0x5414                     // #21524
  0x0000ffff684cfaf0:   movk    x8, #0x803f, lsl #16
  0x0000ffff684cfaf4:   movk    x8, #0xffff, lsl #32
  0x0000ffff684cfaf8:   blr     x8
  0x0000ffff684cfafc:   mov     x16, x0
  0x0000ffff684cfb00:   ldr     z16, [sp]
  0x0000ffff684cfb04:   add     sp, sp, #0x100
  0x0000ffff684cfb08:   ptrue   p7.b
  0x0000ffff684cfb0c:   ldp     x4, x5, [sp, #16]
  ......


could be optimized into:


  ......  
  0x0000ffff684cfa50:   stp     x15, x18, [sp, #80]
  0x0000ffff684cfa54:   str     d16, [sp, #-16]!                   // extra 8 bytes for 16-byte alignment in push_fp()
  0x0000ffff684cfa58:   add     x1, x13, #0x10
  0x0000ffff684cfa5c:   mov     x0, x16
 ;; 0xFFFF7FA942A8
  0x0000ffff684cfa60:   mov     x8, #0x42a8                     // #17064
  0x0000ffff684cfa64:   movk    x8, #0x7fa9, lsl #16
  0x0000ffff684cfa68:   movk    x8, #0xffff, lsl #32
  0x0000ffff684cfa6c:   blr     x8
  0x0000ffff684cfa70:   mov     x16, x0
  0x0000ffff684cfa74:   ldr     d16, [sp], #16
  0x0000ffff684cfa78:   ptrue   p7.b
  0x0000ffff684cfa7c:   ldp     x4, x5, [sp, #16]
  ......


Besides the above benefit, once we know how much of each register is live,
we can also remove the unnecessary caller save in the ZGC C2 load barrier stub for FP registers that are save-on-entry (SOE, i.e. callee-saved) in the C ABI.
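
As a rough illustration of that second point, here is a C++ sketch of the decision. It is not code from the patch: FpLiveWidth, is_soe_low64 and needs_caller_save are hypothetical names. It relies on the AAPCS64 rule that a callee preserves only the low 64 bits of v8-v15, and it assumes this is the rule the change takes advantage of.

```cpp
// Hypothetical classification of how much of an FP/vector register is live.
enum class FpLiveWidth { D64, Q128, Sve };

// AAPCS64: v8-v15 are callee-saved, but only their low 64 bits.
constexpr bool is_soe_low64(int vreg) {
  return vreg >= 8 && vreg <= 15;
}

// The stub must spill a live FP register around the runtime call unless
// the callee is already required to preserve every live bit of it.
constexpr bool needs_caller_save(int vreg, FpLiveWidth width) {
  return !(is_soe_low64(vreg) && width == FpLiveWidth::D64);
}

static_assert(!needs_caller_save(8, FpLiveWidth::D64), "d8 is kept by callee");
static_assert(needs_caller_save(8, FpLiveWidth::Q128), "upper bits of v8 are not");
static_assert(needs_caller_save(16, FpLiveWidth::D64), "v16 is caller-saved");
```

Without the live width, the stub cannot tell whether the upper bits of v8-v15 matter, so it has to save these registers conservatively.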

jtreg passes with the options "-XX:+UseZGC -XX:+ZGenerational"; no new failures were introduced.

-------------

Commit messages:
 - 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers

Changes: https://git.openjdk.org/jdk/pull/17977/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17977&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8326541
  Stats: 301 lines in 6 files changed: 236 ins; 7 del; 58 mod
  Patch: https://git.openjdk.org/jdk/pull/17977.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17977/head:pull/17977

PR: https://git.openjdk.org/jdk/pull/17977
