RFR: 8345067: C2: enable implicit null checks for ZGC reads [v2]
Quan Anh Mai
qamai at openjdk.org
Tue May 13 16:27:00 UTC 2025
On Tue, 13 May 2025 16:03:43 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:
>> Currently, C2 cannot exploit late-expanded GC memory accesses as implicit null checks because of their use of temporary operands (`MachTemp`), which prevents `PhaseCFG::implicit_null_check` from [hoisting the memory accesses to the test basic block](https://github.com/openjdk/jdk/blob/f88c1c6ff86b8f29a71647e46136b6432bb67619/src/hotspot/share/opto/lcm.cpp#L319-L335).
>>
>> This changeset extends the scope of the implicit null check optimization so that it can exploit ZGC object loads. It introduces a platform-dependent predicate (`MachNode::is_late_expanded_null_check_candidate`) to mark as candidates those late-expanded instructions that emit a suitable memory access as their first instruction, and extends the optimization to recognize and hoist candidate memory accesses that use temporary operands.
>>
>> ZGC object loads are marked as late-expanded null-check candidates unconditionally on all ZGC-supported platforms except on aarch64, where only loads that do not require an initial `lea` instruction (due to [address legitimization](https://github.com/openjdk/jdk/blob/ddd07b107e814ec846579a66d4f2005b7db9bb2f/src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp#L132-L144)) are marked as candidates. Fortunately, most aarch64 loads seen in practice use small offsets and can be marked as candidates.
>>
>> Exploiting ZGC loads increases the effectiveness of the implicit null check optimization (percent of explicit null checks turned into implicit ones at compile time) by around 10% in the DaCapo23 benchmarks. This results in slight performance improvements (in the 1-2% range) in a few DaCapo and SPECjvm2008 benchmarks and an overall slight improvement across Renaissance benchmarks.
>>
>> #### Testing
>> - tier1-5, compiler stress test (linux-x64, macosx-x64, windows-x64, linux-aarch64, macosx-aarch64; release and debug mode).
>
> Roberto Castañeda Lozano has updated the pull request incrementally with nine additional commits since the last revision:
>
> - Generalize tests by removing requires annotation and adding local applyIf rules
> - Assert that we do not move control nodes
> - Extend comment about hoisting DecodeN inputs
> - Apply Emanuels suggestions to ensure_node_is_at_block_or_above
> - Rename auxiliary functions
> - Rename auxiliary functions
> - Clarify scope of move_into
> - Extend comment about MachTemp nodes
> - Extract and reuse legitimize_address test
src/hotspot/share/opto/output.cpp line 2020:
> 2018: assert(access->barrier_data() == 0 ||
> 2019: access->is_late_expanded_null_check_candidate(),
> 2020: "Implicit null checks on memory accesses with barriers are only supported on nodes explicitly marked as null-check candidates");
I assume this is why you want the faulting (SIGSEGV) instruction to be the first one. Would it be better to mark the whole region, so that a SIGSEGV from any instruction inside it is mapped to this handler? Alternatively, each `MachNode` could set its SIGSEGV point itself.
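To make the two alternatives concrete, here is a rough, self-contained sketch. It is a toy model only: `ToyFaultTable`, `add_point`, `add_range` and `handler_for` are hypothetical names made up for illustration and are not HotSpot's actual implicit exception table or any existing `MachNode` API. A single-offset entry models "the node reports exactly one faulting instruction", and a range entry models "any fault inside the expanded region maps to the handler".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy model, NOT HotSpot code: maps the code offset of a
// faulting instruction to the offset of the null-check failure path.
struct ToyFaultRange {
  uint32_t begin;    // first covered code offset (inclusive)
  uint32_t end;      // one past the last covered code offset (exclusive)
  uint32_t handler;  // code offset of the handler for this range
};

class ToyFaultTable {
 public:
  // Alternative 1: the node names a single faulting instruction itself.
  void add_point(uint32_t fault_offset, uint32_t handler) {
    add_range(fault_offset, fault_offset + 1, handler);
  }

  // Alternative 2: every instruction in [begin, end) may fault.
  void add_range(uint32_t begin, uint32_t end, uint32_t handler) {
    assert(begin < end && "range must be non-empty");
    ToyFaultRange r = {begin, end, handler};
    ranges_.push_back(r);
  }

  // Returns the handler offset for a fault at fault_offset,
  // or -1 if no entry covers it (i.e. a genuine crash).
  int64_t handler_for(uint32_t fault_offset) const {
    for (const ToyFaultRange& r : ranges_) {
      if (fault_offset >= r.begin && fault_offset < r.end) {
        return static_cast<int64_t>(r.handler);
      }
    }
    return -1;
  }

 private:
  std::vector<ToyFaultRange> ranges_;
};
```

With a point entry the lookup stays exact, but the node has to know which of its emitted instructions can fault. With a range entry the first-instruction restriction goes away, at the cost of also treating an unrelated fault inside the region as a null check.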
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/25066#discussion_r2087211380