RFR: 8373495: C2: Aggressively fold loads from objects that have not escaped
Quan Anh Mai
qamai at openjdk.org
Sat Dec 13 10:44:28 UTC 2025
On Sat, 13 Dec 2025 03:51:32 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:
>> Hi,
>>
>> This patch is an alternative to #28764 but it does the analysis during IGVN instead.
>>
>> Please take a look and leave your thoughts, thanks a lot.
>
> Very nice! I definitely prefer the approach here to #28764.
>
> I see that the unit test stays the same and there's an adjustment in some other test, so I assume this version is functionally more powerful than #28764 version.
>
> Have you had a chance to measure how much it affects compilation speed compared to #28764?
>
> (The code is dense and hard to reason about, so some polishing/refactoring to make it more readable would help. Also, please, think about verification checks.)
@iwanowww Thanks for your comment. I have added many more comments that explain the steps of `MemNode::find_previous_store` in detail. I have also made a small modification: instead of traversing the outputs of the control nodes from the call to the allocation, we now traverse the outputs of the nodes that may alias `base`. This has some benefits:
- It is likely cheaper. There are often few nodes that may alias `base`, while there may be numerous control nodes between the call and the allocation. The number of nodes that directly use a pointer is also usually smaller than the number of nodes that directly use an arbitrary control node.
- It is more conservative. We can restrict the kinds of pointer outputs we accept and be conservative with everything else, whereas exhaustively checking whether an arbitrary use of an arbitrary control node makes `base` escape seems hard.
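As a rough illustration of the idea in the two points above, here is a minimal sketch of walking the uses of the nodes that may alias `base`, accepting only a small set of known-harmless uses and treating everything else as an escape. This is a toy model, not the code in the patch: `EscapeSketch`, `Kind`, `Node`, and `provablyDoesNotEscape` are hypothetical names, and the set of accepted uses is an assumption made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these kinds and classes are hypothetical, not C2's node classes.
final class EscapeSketch {
    enum Kind { ALLOC, CAST, FIELD_LOAD, FIELD_STORE, CALL, OTHER }

    static final class Node {
        final Kind kind;
        final List<Node> inputs = new ArrayList<>();
        final List<Node> outputs = new ArrayList<>();

        Node(Kind kind, Node... ins) {
            this.kind = kind;
            for (Node in : ins) {
                inputs.add(in);
                in.outputs.add(this);
            }
        }
    }

    // Returns true only if every use of every alias of base is a known-harmless one.
    static boolean provablyDoesNotEscape(Node base) {
        Deque<Node> worklist = new ArrayDeque<>();
        Set<Node> aliases = new HashSet<>();
        worklist.push(base);
        aliases.add(base);
        while (!worklist.isEmpty()) {
            Node alias = worklist.pop();
            for (Node use : alias.outputs) {
                switch (use.kind) {
                    case CAST:
                        // A cast yields another pointer to the same object: follow it.
                        if (aliases.add(use)) {
                            worklist.push(use);
                        }
                        break;
                    case FIELD_LOAD:
                        // Reading a field does not publish the pointer.
                        break;
                    case FIELD_STORE:
                        // Convention in this sketch: inputs = (address, value).
                        // Storing into the object is fine; storing the pointer itself escapes.
                        if (use.inputs.get(1) == alias) {
                            return false;
                        }
                        break;
                    default:
                        // CALL, OTHER, and anything unrecognized: assume it escapes.
                        return false;
                }
            }
        }
        return true;
    }
}
```

The walk only ever visits users of pointers to the object, which is why its cost scales with the number of aliases rather than with the number of control nodes, and the `default` branch is what makes it conservative.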
I have also added some verification: if one step determines that `base` does not escape, then the following steps must not determine otherwise.
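The shape of such a check can be sketched as follows. This is only an illustration of the invariant, not the verification code in the patch: `EscapeVerifier` and `record` are hypothetical names, and a real implementation in C2 would presumably use debug-only assertions instead of exceptions.

```java
// Hypothetical helper that enforces: once a step proves "does not escape",
// no later step may report an escape.
final class EscapeVerifier {
    private boolean provenNotEscaped = false;

    // Record the verdict of one analysis step; fail if it contradicts an
    // earlier step that already proved the object does not escape.
    void record(String step, boolean escapes) {
        if (provenNotEscaped && escapes) {
            throw new IllegalStateException(
                "step '" + step + "' contradicts an earlier non-escape result");
        }
        if (!escapes) {
            provenNotEscaped = true;
        }
    }
}
```

Note that the check is one-directional: a step that conservatively reports an escape may still be followed by a step that proves otherwise.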
For the runtime cost, I don't see a noticeable difference compared to master.
For the unit test, compared to the previous PR, I have removed `failOn = LoadI` from the tests that involve loops; I think improving load folding on `Phi` can be a separate PR. The change in `TestZGCEffectiveBarrierElision` is there because I decided to add `Blackhole` to the list of nodes that do not make an object escape, although I am not sure that is necessary. However, I managed to change the test so that the load is not elided.
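For readers without the test at hand, the two code shapes being distinguished look roughly like this. This is an assumed illustration, not the actual unit test: `FoldShape`, `Point`, `opaque`, `straightLine`, and `withLoop` are made-up names, and whether a given build actually folds either load depends on inlining and the rest of the compilation.

```java
// Hypothetical shapes only; the real tests in the PR may look different.
final class FoldShape {
    static final class Point {
        int x;
        Point(int x) { this.x = x; }
    }

    static int sink;

    // Stands in for a call that never receives the Point.
    static void opaque() { sink++; }

    // Straight-line case: p is not passed anywhere before the load,
    // so the load of p.x is a candidate for folding to v.
    static int straightLine(int v) {
        Point p = new Point(v);
        opaque();
        return p.x;
    }

    // Loop case: the memory state reaching the load is merged at the loop
    // head, which is the Phi situation left for a separate change.
    static int withLoop(int v, int n) {
        Point p = new Point(v);
        for (int i = 0; i < n; i++) {
            opaque();
        }
        return p.x;
    }
}
```

Both methods must return `v` regardless of what the compiler does; the difference is only in whether a load remains in the compiled code.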
-------------
PR Comment: https://git.openjdk.org/jdk/pull/28812#issuecomment-3649201130