RFR: 8345067: C2: enable implicit null checks for ZGC reads [v2]

Roberto Castañeda Lozano rcastanedalo at openjdk.org
Tue May 13 16:03:43 UTC 2025


> Currently, C2 cannot exploit late-expanded GC memory accesses as implicit null checks because of their use of temporary operands (`MachTemp`), which prevents `PhaseCFG::implicit_null_check` from [hoisting the memory accesses to the test basic block](https://github.com/openjdk/jdk/blob/f88c1c6ff86b8f29a71647e46136b6432bb67619/src/hotspot/share/opto/lcm.cpp#L319-L335).
> 
> This changeset extends the scope of the implicit null check optimization so that it can exploit ZGC object loads. It introduces a platform-dependent predicate (`MachNode::is_late_expanded_null_check_candidate`) that marks as candidates those late-expanded instructions whose first emitted instruction is a suitable memory access, and extends the optimization to recognize and hoist candidate memory accesses that use temporary operands:
> 
> ![example](https://github.com/user-attachments/assets/b5f9bbc8-d75d-4cf3-841e-73db3dbae753)
> 
> ZGC object loads are marked as late-expanded null-check candidates unconditionally on all ZGC-supported platforms except on aarch64, where only loads that do not require an initial `lea` instruction (due to [address legitimization](https://github.com/openjdk/jdk/blob/ddd07b107e814ec846579a66d4f2005b7db9bb2f/src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp#L132-L144)) are marked as candidates. Fortunately, most aarch64 loads seen in practice use small offsets and can be marked as candidates.
> 
> Exploiting ZGC loads increases the effectiveness of the implicit null check optimization (the percentage of explicit null checks turned into implicit ones at compile time) by around 10% in the DaCapo23 benchmarks. This results in slight performance improvements (in the 1-2% range) in a few DaCapo and SPECjvm2008 benchmarks and an overall slight improvement across Renaissance benchmarks.
> 
> #### Testing
> - tier1-5, compiler stress test (linux-x64, macosx-x64, windows-x64, linux-aarch64, macosx-aarch64; release and debug mode).
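
As a rough illustration of the aarch64 condition in the description above (this is not code from the changeset), the sketch below checks whether a load offset can be encoded directly in the load instruction, so that no initial `lea` would be needed. The helper names `fits_load_immediate` and `is_candidate` are hypothetical, and the ranges are taken from the A64 `LDR` (unsigned 12-bit immediate scaled by the access size) and `LDUR` (signed 9-bit unscaled immediate) encodings. The exact test applied by HotSpot's address legitimization may differ from this.

```cpp
#include <cstdint>

// Hypothetical helper, not HotSpot code: returns true if `offset` can be
// encoded directly in an AArch64 load of (1 << log2_size) bytes, either as
// an unsigned 12-bit immediate scaled by the access size (LDR) or as a
// signed 9-bit unscaled immediate (LDUR).
constexpr bool fits_load_immediate(int64_t offset, unsigned log2_size) {
  const int64_t size = int64_t{1} << log2_size;
  const bool scaled =
      offset >= 0 && offset % size == 0 && offset / size < 4096;
  const bool unscaled = offset >= -256 && offset <= 255;
  return scaled || unscaled;
}

// Assumption for illustration: a load is treated as a null-check candidate
// only if its first emitted instruction is the memory access itself, that
// is, if no leading address computation (lea) is required.
constexpr bool is_candidate(int64_t offset, unsigned log2_size) {
  return fits_load_immediate(offset, log2_size);
}
```

Under these assumptions, an 8-byte load at offset 16 would qualify (`is_candidate(16, 3)` is true), while one at offset 32768 would not, since 32768 / 8 = 4096 no longer fits in 12 bits. This matches the observation in the description that small offsets, which are the common case, can be marked as candidates.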

Roberto Castañeda Lozano has updated the pull request incrementally with nine additional commits since the last revision:

 - Generalize tests by removing requires annotation and adding local applyIf rules
 - Assert that we do not move control nodes
 - Extend comment about hoisting DecodeN inputs
 - Apply Emanuel's suggestions to ensure_node_is_at_block_or_above
 - Rename auxiliary functions
 - Rename auxiliary functions
 - Clarify scope of move_into
 - Extend comment about MachTemp nodes
 - Extract and reuse legitimize_address test

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/25066/files
  - new: https://git.openjdk.org/jdk/pull/25066/files/dc5aa4fc..6353f42b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=25066&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25066&range=00-01

  Stats: 66 lines in 5 files changed: 21 ins; 19 del; 26 mod
  Patch: https://git.openjdk.org/jdk/pull/25066.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/25066/head:pull/25066

PR: https://git.openjdk.org/jdk/pull/25066

