RFR: 8345067: C2: enable implicit null checks for ZGC reads [v5]
Martin Doerr
mdoerr at openjdk.org
Fri May 16 09:36:12 UTC 2025
On Fri, 16 May 2025 07:44:53 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:
>> Currently, C2 cannot exploit late-expanded GC memory accesses as implicit null checks because of their use of temporary operands (`MachTemp`), which prevents `PhaseCFG::implicit_null_check` from [hoisting the memory accesses to the test basic block](https://github.com/openjdk/jdk/blob/f88c1c6ff86b8f29a71647e46136b6432bb67619/src/hotspot/share/opto/lcm.cpp#L319-L335).
>>
>> This changeset extends the scope of the implicit null check optimization so that it can exploit ZGC object loads. It introduces a platform-dependent predicate (`MachNode::is_late_expanded_null_check_candidate`) to mark late-expanded instructions that emit a suitable memory access as a first instruction as candidates, and extends the optimization to recognize and hoist candidate memory accesses that use temporary operands:
>>
>> 
>>
>> ZGC object loads are marked as late-expanded null-check candidates unconditionally on all ZGC-supported platforms except on aarch64, where only loads that do not require an initial `lea` instruction (due to [address legitimization](https://github.com/openjdk/jdk/blob/ddd07b107e814ec846579a66d4f2005b7db9bb2f/src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp#L132-L144)) are marked as candidates. Fortunately, most aarch64 loads seen in practice use small offsets and can be marked as candidates.
>>
>> Exploiting ZGC loads increases the effectiveness of the implicit null check optimization (percent of explicit null checks turned into implicit ones at compile time) by around 10% in the DaCapo23 benchmarks. This results in slight performance improvements (in the 1-2% range) in a few DaCapo and SPECjvm2008 benchmarks and an overall slight improvement across Renaissance benchmarks.
>>
>> #### Testing
>> - tier1-5, compiler stress test (linux-x64, macosx-x64, windows-x64, linux-aarch64, macosx-aarch64; release and debug mode).
>
> Roberto Castañeda Lozano has updated the pull request incrementally with one additional commit since the last revision:
>
> Replace control type with PhaseCFG::is_CFG test
Thanks for implementing it and thanks for the ping. It basically works on PPC64, but one IR rule is failing:
Failed IR Rules (1) of Methods (1)
----------------------------------
1) Method "static java.lang.Object compiler.gcbarriers.TestImplicitNullChecks.testLoadVolatile(compiler.gcbarriers.TestImplicitNullChecks$OuterWithVolatileField)" - [Failed IR rules: 1]:
* @IR rule 1: "@compiler.lib.ir_framework.IR(phase={FINAL_CODE}, applyIfPlatformAnd={}, applyIfCPUFeatureOr={}, counts={"_#NULL_CHECK#_", "1"}, applyIfPlatformOr={}, applyIfPlatform={"aarch64", "false"}, failOn={}, applyIfOr={"UseZGC", "true", "UseG1GC", "true"}, applyIfCPUFeatureAnd={}, applyIf={}, applyIfCPUFeature={}, applyIfAnd={}, applyIfNot={})"
> Phase "Final Code":
- counts: Graph contains wrong number of nodes:
* Constraint 1: "(\d+(\s){2}(NullCheck.*)+(\s){2}===.*)"
- Failed comparison: [found] 0 = 1 [given]
- No nodes matched!
This is probably because PPC64 uses a membar_volatile before volatile load, so the graph looks differently:
33 Prolog === [[ ]] [2380000000033]
9 MachProj === 10 [[ 8 ]] #0/unmatched !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:-1 (line 85)
R3 11 MachProj === 10 [[ 8 26 ]] #5 Oop:compiler/gcbarriers/TestImplicitNullChecks$OuterWithVolatileField * !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:-1 (line 85)
12 MachProj === 10 [[ 4 17 ]] #1/unmatched !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:-1 (line 85)
13 MachProj === 10 [[ 4 21 ]] #2/unmatched Memory: @BotPTR *+bot, idx=Bot; !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:-1 (line 85)
R1 14 MachProj === 10 [[ 4 2 17 ]] #3 !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:-1 (line 85)
15 MachProj === 10 [[ 4 17 ]] #4 !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:-1 (line 85)
0 Con === 10 [[ ]] #top
8 zeroCheckP_reg_imm0 === 9 11 [[ 7 22 ]] P=0.000001, C=-1.000000 !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:1 (line 85)
BB#002:
31 Region === 31 22 [[ 31 21 26 ]]
21 membar_volatile === 31 0 13 0 0 [[ 20 23 ]] !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:1 (line 85)
20 MachProj === 21 [[ 19 ]] #0/unmatched !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:1 (line 85)
23 MachProj === 21 [[ 19 26 ]] #2/unmatched Memory: @BotPTR *+bot, idx=Bot; !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:1 (line 85)
R15 26 loadN_ac === 31 23 11 [[ 25 19 ]] #12/0x000000000000000c Volatile!narrowoop: java/lang/Object *
19 unnecessary_membar_acquire === 20 0 23 0 0 |26 0 [[ 18 24 ]] !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:1 (line 85)
18 MachProj === 19 [[ 17 ]] #0/unmatched !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:1 (line 85)
24 MachProj === 19 [[ 17 ]] #2/unmatched Memory: @BotPTR *+bot, idx=Bot; !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:1 (line 85)
R3 25 decodeN_unscaled === _ 26 [[ 17 ]] java/lang/Object * Oop:java/lang/Object * !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:1 (line 85)
34 Epilog === [[ ]] [2380000000034]
17 Ret === 18 12 24 14 15 25 [[ 1 ]]
BB#003:
30 Region === 30 7 [[ 30 4 ]]
R3 16 loadConI16 === 1 [[ 4 ]] #-10/0xfffffff6
6 ConP === 10 [[ 4 ]] #null
4 CallStaticJavaDirect === 30 12 13 14 15 16 0 6 [[ 5 3 32 ]] Static wrapper for: uncommon_trap(reason='null_check' action='maybe_recompile') # void ( int ) C=0.000100 TestImplicitNullChecks::testLoadVolatile @ bci:1 (line 85) reexecute !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:1 (line 85)
5 MachProj === 4 [[ ]] #10005/fat
3 MachProj === 4 [[ 2 ]] #0/unmatched !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:1 (line 85)
R14 32 MachProj === 4 [[ ]] #6/fat
2 ShouldNotReachHere === 3 0 0 14 0 [[ 1 ]] !jvms: TestImplicitNullChecks::testLoadVolatile @ bci:1 (line 85)
I guess it's not worth stepping over the memory barrier. Disabling this rule for PPC64 should be ok, too.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/25066#issuecomment-2886188487
More information about the hotspot-gc-dev
mailing list