Question about the design of FrameState in Graal IR

Thu Feb 8 07:02:37 UTC 2024

Hi, Graal community,

In Section 2 of the CGO'14 paper “Partial Escape Analysis and Scalar Replacement for Java”, there is this sentence:
“The Graal IR keeps the frame states not at the points where the actual deoptimizations take place, but at the points where side effects may occur.”

The paper is great, but I can’t find an explanation of this design decision.

My intuition is that we should generate a FrameState at the point where the deoptimization actually happens. That would be the most accurate, wouldn’t it?
I believe this is also what C2’s IR does. I recently realized that many macro nodes are affected by this decision in C2. C2 collapses a linked list of FrameStates onto a function call.
AllocationNode is a CallNode, so new_instance does this for AllocationNode too. That is to say, an allocation node is bundled with all of its FrameStates. C2 can’t move the allocation
node to another place because FrameStates are position-dependent. In C2, many other macro nodes, such as LockNode and ArrayCopyNode, are in the same awkward situation.
Even if the compiler knows how to move them, it can’t.
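
To make the bundling concrete, here is a toy Java model of what I mean. It is only a sketch of my understanding, not HotSpot code: FrameInfo, CallLikeNode and demo() are names I made up for illustration. The call-like node copies the whole chain of caller states when it is created, so the node is tied to the position where that chain was valid.

```java
import java.util.ArrayList;
import java.util.List;

public class BundledStateSketch {
    /** Hypothetical per-frame state with a link to the caller's state (inlining chain). */
    static final class FrameInfo {
        final String method;
        final int bci;
        final FrameInfo caller; // null for the outermost frame

        FrameInfo(String method, int bci, FrameInfo caller) {
            this.method = method;
            this.bci = bci;
            this.caller = caller;
        }
    }

    /** Hypothetical call-like node that flattens the whole state chain into itself. */
    static final class CallLikeNode {
        final String name;
        final List<String> bundledFrames = new ArrayList<>();

        CallLikeNode(String name, FrameInfo innermost) {
            this.name = name;
            // Walk from the innermost frame out to the outermost caller.
            for (FrameInfo f = innermost; f != null; f = f.caller) {
                bundledFrames.add(f.method + "@" + f.bci);
            }
        }
    }

    /** Builds an allocation inside an inlined callee and returns its bundled frames. */
    static List<String> demo() {
        FrameInfo outer = new FrameInfo("main", 12, null);
        FrameInfo inner = new FrameInfo("makePoint", 3, outer);
        return new CallLikeNode("Allocate", inner).bundledFrames;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [makePoint@3, main@12]
    }
}
```

Once the node carries "makePoint@3, main@12" itself, moving it to another position would leave it with a state that no longer describes that position.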

Obviously, the Graal JIT manages to solve this problem. The PEA phase can move allocations wherever it wants. So far, my understanding is that Graal IR keeps those deoptimization nodes hidden in the high tier.
My realization is that an AllocationNode by itself doesn’t imply deoptimization. It is the slow path, taken after a type-check failure, that triggers deoptimization. This slow path doesn’t emerge until low-level expansion.
In the high tier, Graal optimizations can move high-level nodes freely because the low-level nodes that cause deoptimization don’t exist yet.

Simply put, an allocation node is floating in Graal’s high tier, whereas it is a fixed node in C2.

If I am still on track, the problem boils down to how to generate the correct FrameState for those low-level deoptimization nodes. I guess this is why the Graal IR designers keep FrameStates “at the points where side effects may occur”. The late nodes need anchor points. Whether Graal generates a new deoptimization node or moves an existing one to some point, it is easy to find the closest FrameState by walking the control flow backward, like a magnet.
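
The backward search I have in mind can be sketched in Java as follows. This is a toy model under my own assumptions, not the Graal API: Node, stateAfter, closestState and demo() are hypothetical names, and the control flow is a straight line with no merges or loops. Only nodes with side effects carry a state, and a late node looks it up through its predecessors.

```java
public class FrameStateAnchorSketch {
    /** Hypothetical fixed node: one predecessor, and a state only if it has a side effect. */
    static final class Node {
        final String name;
        final Node predecessor;  // reverse control-flow edge, null for the start node
        final String stateAfter; // null unless this node has a side effect

        Node(String name, Node predecessor, String stateAfter) {
            this.name = name;
            this.predecessor = predecessor;
            this.stateAfter = stateAfter;
        }
    }

    /** Walks backward from the given node's predecessor to the closest node carrying a state. */
    static String closestState(Node from) {
        for (Node n = from.predecessor; n != null; n = n.predecessor) {
            if (n.stateAfter != null) {
                return n.stateAfter;
            }
        }
        throw new IllegalStateException("no frame state reachable from " + from.name);
    }

    /** Start -> StoreField -> Guard -> Deoptimize; only the first two carry states. */
    static String demo() {
        Node start = new Node("Start", null, "bci=0");
        Node store = new Node("StoreField", start, "bci=7");
        Node guard = new Node("Guard", store, null);      // introduced late, no state of its own
        Node deopt = new Node("Deoptimize", guard, null); // needs a state to deoptimize to
        return closestState(deopt);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints bci=7
    }
}
```

In this model the Deoptimize node picks up the state recorded after the store, no matter how many stateless nodes were inserted or moved in between.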

All of the above is my conjecture. Could Graal experts confirm my understanding? Or could someone point me to literature that describes the rationale behind the FrameState design?

Thanks,
--lx
