RFR: 8373495: C2: Aggressively fold loads from objects that have not escaped [v19]
Daniel Lundén
dlunden at openjdk.org
Fri Jan 23 13:30:55 UTC 2026
On Wed, 21 Jan 2026 02:57:17 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:
>> Hi,
>>
>> This patch is an alternative to #28764 but it does the analysis during IGVN instead.
>>
>> ## The current PR:
>>
>> The current escape analysis mechanism is all-or-nothing: either the object does not escape, or it does. If the object escapes, we lose the ability to analyse the values of its fields completely, even if the object only escapes at return.
>>
>> This PR tries to determine the escape status of an object at a load. If the object has not escaped there, we can fold the load aggressively, walking past calls and memory barriers to find the store that the load observes. Implementation-wise, when `find_previous_store` encounters a call or memory barrier during its walk, we look at all nodes that make the allocation escape. If every such node has a control input that is not a transitive control input of the call/barrier we are at, then the allocation has definitely not escaped at that call/barrier, and we can walk past it to find a corresponding store.
>>
>> I do not see a noticeable difference in C2 runtime with and without this patch.
>>
>> ## Future work:
>>
>> 1. Fold a memory `Phi`.
>>
>> This is pretty straightforward. We need to create a value `Phi` for each memory `Phi` so that we can handle loop `Phi`s.
>>
>> 2. Fold a pointer `Phi`.
>>
>> Currently, this PR takes the trivial approach and simply gives up if we don't encounter a store into that `Phi`. However, we can do better. Consider this case:
>>
>> Point p1 = new Point;
>> Point p2 = new Point;
>> p1.x = v1;
>> p2.x = v2;
>> Point p = Phi(p1, p2);
>> int a = p.x;
>>
>> Then, `a` should be able to be folded to `Phi(v1, v2)` if `p1` and `p2` are known not to alias.
>>
>> Another interesting case:
>>
>> Point p = Phi(p1, p2);
>> p.x = v;
>> p1.x = v1;
>> int a = p.x;
>>
>> Then, theoretically, we can fold `a` to `Phi(v1, v)` if `p1` and `p2` are known not to alias.
>>
>> 3. Nested objects
>>
>> If an object is stored into memory that has not escaped, then the object itself can be considered not to have escaped. For example:
>>
>> Point p = new Point;
>> PointHolder h = new PointHolder;
>> h.p = p;
>> int x = p.x;
>> escape(h);
>>
>> Then, `p` can be considered not to have escaped until `escape(h)`. To do this, the computation of `_aliases` in the constructor of `LocalEA` needs to be more comprehensive. See the comments in `LocalEA::check_escape_status`.
>>
>> Please...
>
> Quan Anh Mai has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 22 additional commits since the last revision:
>
> - Remove the TriBool
> - Merge branch 'master' into loadfoldingigvn
> - Fix dead accesses, address reviews
> - Merge branch 'master' into loadfoldingigvn
> - Early return when not a heap access
> - Fix escape at store
> - Fix outdated and unclear comments
> - copyright year, return, comments, whitespace
> - Merge branch 'master' into loadfoldingigvn
> - ea of phis and nested objects
> - ... and 12 more: https://git.openjdk.org/jdk/compare/93ce8b1a...ac82c2ea
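
As a point of reference for the description quoted above, here is a minimal Java sketch of the shape of code the patch appears to target: an object that escapes only at return, with a call between the store and the load. The names `EscapeAtReturn`, `opaque`, `make`, and `sink` are hypothetical and not taken from the PR. Whether C2 actually folds the load is an assumption that depends on the analysis in the changeset.

```java
public class EscapeAtReturn {
    static final class Point {
        int x;
    }

    // Side-effect target so that the call below is not trivially dead.
    static int sink;

    // Hypothetical stand-in for a call C2 does not inline; it never receives p.
    static void opaque(int v) {
        sink += v;
    }

    static Point make(int v) {
        Point p = new Point();
        p.x = v;      // store into a freshly allocated object
        opaque(v);    // call between the store and the load; p has not escaped here
        int a = p.x;  // candidate for folding to v (assumption, see lead-in)
        sink += a;
        return p;     // p escapes only at this point
    }
}
```

The sketch only fixes the program shape; confirming the optimization would need an IR-level check against the compiled graph.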
@anton-seoane, @sarannat, @robcasloz, and I had a joint look at this changeset earlier today. Here are our comments.
Our main proposal is to split this changeset into two: a preparatory one, followed by one that adds the actual local escape analysis. The reason is that, in addition to your functional changes, you have refactored and documented much of the code in `MemNode::find_previous_store`. This is good, of course, but it makes the diff harder to review. Can you first make a separate PR containing only the refactoring and documentation additions for the current mainline `MemNode::find_previous_store`?
src/hotspot/share/opto/memnode.cpp line 1426:
> 1424: // Secondly, from the set of allocations that may alias base, collect all nodes that may alias
> 1425: // them, they may alias base as well. Actually, there may be cases that a may alias b and b may
> 1426: // alias c but a may not alias c, but we are conservative here.
We could not build an intuition for why this downward pass is required in addition to the preceding upward pass. Do you have a test case that illustrates a situation where this is needed?
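
For reference, the non-transitivity mentioned in the quoted comment can be seen with a pointer merge. The Java sketch below (hypothetical names `MayAliasSketch` and `relations`) only illustrates the "may alias" relation itself; it is not claimed to be a case in which the downward pass is required.

```java
public class MayAliasSketch {
    static final class Point {
        int x;
    }

    // Returns {a == b, b == c, a == c} for the given flag.
    static boolean[] relations(boolean flag) {
        Point a = new Point();
        Point c = new Point();
        // b merges the two allocations; in the compiler's graph this is roughly Phi(a, c).
        Point b = flag ? a : c;
        // Statically, b may alias a and b may alias c, yet a never aliases c.
        return new boolean[] { a == b, b == c, a == c };
    }
}
```

Across both values of `flag`, the third entry is always false while the first two each hold on one path.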
src/hotspot/share/opto/memnode.cpp line 3879:
> 3877: if (result == this) {
> 3878: // the store may also apply to zero-bits in an earlier object
> 3879: Node* prev_mem = find_previous_store(phase);
It seems your changes to `find_previous_store` could also improve the analysis here for store nodes. Do you have an example illustrating this (or could you add one)?
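
One program shape that might exercise this path, offered as a sketch only: a store that writes a value already visible in a non-escaped object, with a call in between. The names `RedundantStoreSketch`, `opaque`, `sameValue`, and `zeroValue` are hypothetical, and whether the patch lets C2 remove either second store is an assumption to be verified, for example with an IR test.

```java
public class RedundantStoreSketch {
    static final class Point {
        int x;
    }

    static int sink;

    // Hypothetical stand-in for a call C2 does not inline; it never receives p.
    static void opaque(int v) {
        sink += v;
    }

    static Point sameValue(int v) {
        Point p = new Point();
        p.x = v;
        opaque(v);  // p has not escaped at this call
        p.x = v;    // writes the value the previous store already put there
        return p;
    }

    static Point zeroValue() {
        Point p = new Point();  // fields start out zero-initialized
        opaque(1);              // p has not escaped at this call
        p.x = 0;                // writes zero over the allocation's zero bits
        return p;
    }
}
```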
-------------
PR Review: https://git.openjdk.org/jdk/pull/28812#pullrequestreview-3697453169
PR Review Comment: https://git.openjdk.org/jdk/pull/28812#discussion_r2721175608
PR Review Comment: https://git.openjdk.org/jdk/pull/28812#discussion_r2721185492