RFR: 8327963: C2: fix construction of memory graph around Initialize node to prevent incorrect execution if allocation is removed [v5]

Fri May 16 13:09:03 UTC 2025

On Fri, 16 May 2025 09:28:15 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:

> Note that I am not necessarily suggesting disabling "late" elimination of allocations at macro expansion. But it would be good, in light of the above findings, to find actual cases where the seemingly simpler alternative of removing dead allocations early is not sufficient for correctness, to motivate the more complex approach proposed in this PR.

What happens with this bug is that a Phi created sometime after parsing inherits the type of the projection out of the Initialize which is wrong. No issue happens until the allocation is removed though. Only having allocations be removed early one shortens the window where bad things (a new Phi) can happen. But bad things could still happen. After all, we do some loop opts to help EA so maybe a similar issue could happen there. Or maybe, down the road, someone will change the way we do loop opts during EA because it helps EA and the bug will be back but we don't necessarily notice it until it happens at a user's site.

Beyond that, you're suggesting restricting elimination of allocation. What if, down the road, someone notices that it gets in the way of some other optimization? Then that someone we'll have to reconstruct the history here.

There's a history in c2 of fixing issues that are complicated by working around them. Often what happens is that we, later on, realize that the first work around wasn't sufficient and try to pile on more arounds. I also already ran into situations where everything to perform the needed optimization is there but it's disabled for some reason in a particular case and it's unclear why.

I'm in favor of fixing issues once and for all by targeting their root cause and trying to not sacrifice performance whenever possible. Truth is we have a limited view in what people are running though a set of microbenchmarks but we can't be sure something has no performance impact only because that set of microbenchmarks doesn't regress. Beyond that, what can appear as a safer workaround today could be more trouble down the line and in the end cause more confusion and work.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24570#issuecomment-2886673287