RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v12]

Roberto Castañeda Lozano rcastanedalo at openjdk.org
Fri Sep 19 13:12:36 UTC 2025


On Tue, 9 Sep 2025 11:27:50 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> An `Initialize` node for an `Allocate` node is created with a memory
>> `Proj` of adr type raw memory. In order for stores to be captured, the
>> memory state out of the allocation is a `MergeMem` with slices for the
>> various object fields/array element set to the raw memory `Proj` of
>> the `Initialize` node. If `Phi`s need to be created during later
>> transformations from this memory state, The `Phi` for a particular
>> slice gets its adr type from the type of the `Proj` which is raw
>> memory. If during macro expansion, the `Allocate` is found to have no
>> use and so can be removed, the `Proj` out of the `Initialize` is
>> replaced by the memory state on input to the `Allocate`. A `Phi` for
>> some slice for a field of an object will end up with the raw memory
>> state on input to the `Allocate` node. As a result, memory state at
>> the `Phi` is incorrect and incorrect execution can happen.
>> 
>> The fix I propose is, rather than have a single `Proj` for the memory
>> state out of the `Initialize` with adr type raw memory, to use one
>> `Proj` per slice added to the memory state after the `Initalize`. Each
>> of the `Proj` should return the right adr type for its slice. For that
>> I propose having a new type of `Proj`: `NarrowMemProj` that captures
>> the right adr type.
>> 
>> Logic for the construction of the `Allocate`/`Initialize` subgraph is
>> tweaked so the right adr type captured in is own `NarrowMemProj` is
>> added to the memory sugraph. Code that removes an allocation or moves
>> it also has to be changed so it correctly takes the multiple memory
>> projections out of the `Initialize` node into account.
>> 
>> One tricky issue is that when EA split types for a scalar replaceable
>> `Allocate` node:
>> 
>> 1- the adr type captured in the `NarrowMemProj` becomes out of sync
>>   with the type of the slices for the allocation
>>   
>> 2- before EA, the memory state for one particular field out of the
>>   `Initialize` node can be used for a `Store` to the just allocated
>>   object or some other. So we can have a chain of `Store`s, some to
>>   the newly allocated object, some to some other objects, all of them
>>   using the state of `NarrowMemProj` out of the `Initialize`. After
>>   split unique types, the `NarrowMemProj` is for the slice of a
>>   particular allocation. So `Store`s to some other objects shouldn't
>>   use that memory state but the memory state before the `Allocate`.
>>   
>> For that, I added logic to update the adr type of `NarrowMemProj`
>> during split uni...
>
> Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 45 commits:
> 
>  - more
>  - Merge branch 'master' into JDK-8327963
>  - more
>  - more
>  - Merge branch 'master' into JDK-8327963
>  - more
>  - more
>  - lambda return
>  - lambda clean up
>  - Merge branch 'master' into JDK-8327963
>  - ... and 35 more: https://git.openjdk.org/jdk/compare/e16c5100...b701d03e

Changes requested by rcastanedalo (Reviewer).

src/hotspot/share/opto/escape.hpp line 567:

> 565:                         // MemNode       - new memory input for this node
> 566:                         // CheckCastPP   - allocation that this is a cast of
> 567:                         // allocation    - CheckCastPP of the allocation

Please add a new entry here explaining how `_node_map` is used for `NarrowMemProjNode` nodes.

src/hotspot/share/opto/graphKit.cpp line 3645:

> 3643:     assert(minit_out->is_Proj() && minit_out->in(0) == init, "");
> 3644:     int mark_idx = C->get_alias_index(oop_type->add_offset(oopDesc::mark_offset_in_bytes()));
> 3645:     // Add an edge in the MergeMem for the header fields so an access to one of those has correct memory state

Suggestion:

    // Add an edge in the MergeMem for the header fields so an access to one of those has correct memory state.

src/hotspot/share/opto/graphKit.cpp line 3647:

> 3645:     // Add an edge in the MergeMem for the header fields so an access to one of those has correct memory state
> 3646:     // Use one NarrowMemProjNode per slice to properly record the adr type of each slice. The Initialize node will have
> 3647:     // multiple projection as a result.

Suggestion:

    // multiple projections as a result.

src/hotspot/share/opto/macro.cpp line 1606:

> 1604:       // elimination. Simply add the MemBarStoreStore after object
> 1605:       // initialization.
> 1606:       MemBarNode* mb = MemBarNode::make(C, Op_MemBarStoreStore, Compile::AliasIdxRaw);

Does the same argument as below apply for relaxing the scope of this memory barrier? Please clarify in a similar comment for this case (if the same argument applies, a reference to the comment below would be enough).

src/hotspot/share/opto/macro.cpp line 1623:

> 1621:       Node* init_ctrl = init->proj_out_or_null(TypeFunc::Control);
> 1622: 
> 1623:       // What we want is to prevent the compiler and the cpu from re-ordering the stores that initialize this object

Suggestion:

      // What we want is to prevent the compiler and the CPU from re-ordering the stores that initialize this object

src/hotspot/share/opto/macro.cpp line 1628:

> 1626:       // only captures/produces a partial memory state making it complicated to insert such a MemBar. Because
> 1627:       // re-ordering by the compiler can't happen by construction (a later Store that publishes the just allocated
> 1628:       // object reference is indirectly control dependent on the Initialize node), preventing reordering by the cpu is

Suggestion:

      // object reference is indirectly control dependent on the Initialize node), preventing reordering by the CPU is

src/hotspot/share/opto/memnode.hpp line 1383:

> 1381:   bool already_has_narrow_mem_proj_with_adr_type(const TypePtr* adr_type) const;
> 1382: 
> 1383:   MachProjNode* mem_mach_proj() const;

Please add a brief comment above this function, possibly clarifying that we do not expect to find more than one Mach memory projection.

src/hotspot/share/opto/multnode.cpp line 73:

> 71:   };
> 72:   return apply_to_projs(filter, which_proj);
> 73: }

Consider moving this implementation to `multnode.hpp`, perhaps next to that of `MultiNode::apply_to_projs(DUIterator_Fast& imax, DUIterator_Fast& i, Callback callback, uint which_proj)`,  for consistency.

src/hotspot/share/opto/multnode.cpp line 279:

> 277: void NarrowMemProjNode::dump_spec(outputStream *st) const {
> 278:   ProjNode::dump_spec(st);
> 279:   dump_adr_type(st);

Do we need to define a special version of `NarrowMemProjNode::dump_adr_type` or could we just have the same effect calling `MemNode::dump_adr_type(this, _adr_type, st)` here?

src/hotspot/share/opto/multnode.cpp line 284:

> 282: void NarrowMemProjNode::dump_compact_spec(outputStream *st) const {
> 283:   ProjNode::dump_compact_spec(st);
> 284:   dump_adr_type(st);

Same here.

src/hotspot/share/opto/multnode.hpp line 71:

> 69:     }
> 70:     Node* current() {
> 71:       return _node->fast_out(_i);;

Suggestion:

      return _node->fast_out(_i);

src/hotspot/share/opto/multnode.hpp line 90:

> 88:     }
> 89:     Node* current() {
> 90:       return _node->out(_i);;

Suggestion:

      return _node->out(_i);

src/hotspot/share/opto/phaseX.cpp line 2621:

> 2619:       add_users_to_worklist0(proj, worklist);
> 2620:       return MultiNode::CONTINUE;
> 2621:     };

Consider defining `enqueue` only once and reusing it in both cases.

test/hotspot/jtreg/compiler/escapeAnalysis/TestIterativeEA.java line 53:

> 51:     analyzer.shouldContain("++++ Eliminated: 26 Allocate");
> 52:     analyzer.shouldContain("++++ Eliminated: 51 Allocate");
> 53:     analyzer.shouldContain("++++ Eliminated: 84 Allocate");

Did you analyze why there are more allocations removed than before in this test case? I did not expect this changeset to have an effect on the number of removed allocations.

test/hotspot/jtreg/compiler/macronodes/TestEarlyEliminationOfAllocationWithoutUse.java line 1:

> 1: /*

Please add a package declaration (and make the corresponding class names fully qualified in the `@run` directives).

test/hotspot/jtreg/compiler/macronodes/TestEliminationOfAllocationWithoutUse.java line 30:

> 28:  *          Now that array slice depends on the rawslice. And then when the Initialize MemBar gets
> 29:  *          removed in expand_allocate_common, the rawslice sees that it has now no effect, looks
> 30:  *          through the MergeMem and sees the initial stae. That way, also the linked array slice

Suggestion:

 *          through the MergeMem and sees the initial state. That way, also the linked array slice

-------------

PR Review: https://git.openjdk.org/jdk/pull/24570#pullrequestreview-3244667543
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362830370
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362759304
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362760441
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362798596
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362800147
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362800934
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362782847
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362757140
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362743051
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362743403
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362746650
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362750245
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362767659
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362816473
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362810978
PR Review Comment: https://git.openjdk.org/jdk/pull/24570#discussion_r2362745517


More information about the hotspot-compiler-dev mailing list