Question about the design of FrameState in Graal IR

Thu Feb 8 13:15:07 UTC 2024

On 08/02/2024 08:02, Liu, Xin wrote:
> In section 2 of CGO14 “Partial Escape Analysis and Scalar Replacement 
> for Java”, there’s a sentence:
> 
> “The Graal IR keeps the frame states not at the points where the actual 
> deoptimizations take place, but at the points where side effects may 
> occur.”
> 
> The paper is great, but I can’t find the explanation of this design 
> decision.

The following papers provide some background on this: 
https://dl.acm.org/doi/10.1145/2542142.2542143 and especially 
https://dl.acm.org/doi/10.1145/2647508.2647521

> My intuitive idea is that we generate FrameState at the point where the 
> deoptimization arises. This is the most accurate, isn’t it?

Yes, that is the most accurate. But in the Graal IR we are deliberately 
inaccurate: When we deoptimize, we can restart interpreter execution 
from an earlier point than the root cause of the deoptimization. This 
means that the deoptimization slow path can reexecute some code that was 
already executed in compiled code. This is not observable because such 
code cannot contain side effects.

> [...]
> Obviously, Graal JIT manages to solve this problem. The PEA phase can 
> move allocations wherever it wants. So far, my understanding is that 
> Graal IR hides those deoptimization nodes in high-tier.

It doesn't hide deoptimizations, but most of them are represented by 
floating guard nodes that can themselves move around the graph and are 
not constrained by frame states during the high tier.

> [...]
> If I am still on track, the problem boils down to how to generate the 
> correct FrameState for those low-level deoptimization nodes. I guess 
> this is why Graal IR designers save FrameStates “at the points where 
> side-effect may occur”.  The late nodes need anchor points. Whether Graal 
> generates a new deoptimization node or moves an existing one to this 
> point, it is easy to find the closest “FrameState” in reverse control 
> flow, like a magnet.
> 
> All of the above is my conjecture.  Could Graal experts confirm my 
> understanding? Or could someone point me to literature which describes 
> the rationale of FrameState design?

There are a few state transitions within a Graal compilation that 
change what rules apply. Roughly:

- In the high tier, most guards (null checks, bounds checks, etc.) are
   floating nodes without frame states attached; frame states are
   attached to side effects and control flow merges. PEA only looks at
   this level of representation.
- At some point in the mid tier, we find the best places to anchor the
   previously floating guards and expand them there to fixed nodes to
   express "if (condition) deoptimize" operations. Frame states are still
   at side effects. (See GuardLoweringPhase if interested.)
- At some later point in the mid tier, we perform frame state
   assignment (FrameStateAssignmentPhase): Each deoptimization point gets
   a frame state from the closest dominating side effect. Side effects no
   longer have frame states attached. Different deoptimization points may
   share the same frame state, which allows us to share deoptimization
   metadata (see the second paper linked above).
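
The last step can be sketched on a toy model. This is a simplification 
under loud assumptions: the code is a straight-line list, so "closest 
dominating side effect" degenerates to "closest preceding side effect", 
and Node, FrameState, and assign are hypothetical names, not the classes 
or the algorithm actually used by FrameStateAssignmentPhase:

```java
import java.util.List;

// Toy model only: hypothetical names, not the real Graal classes.
final class FrameStateAssignmentSketch {
    static final class FrameState {
        final int bci;
        FrameState(int bci) { this.bci = bci; }
    }

    static final class Node {
        final String name;
        final boolean sideEffect; // e.g. a field store or a call
        final boolean canDeopt;   // e.g. a guard lowered to a fixed node
        FrameState state;         // on side effects before the pass,
                                  // on deoptimization points after it

        Node(String name, boolean sideEffect, boolean canDeopt, FrameState state) {
            this.name = name;
            this.sideEffect = sideEffect;
            this.canDeopt = canDeopt;
            this.state = state;
        }
    }

    static Node sideEffect(String name, int bciAfter) {
        return new Node(name, true, false, new FrameState(bciAfter));
    }

    static Node deoptPoint(String name) {
        return new Node(name, false, true, null);
    }

    // Walk the code in order, handing each deoptimization point the state
    // of the most recent side effect (or the method entry state).
    static void assign(List<Node> code, FrameState entryState) {
        FrameState current = entryState;
        for (Node n : code) {
            if (n.sideEffect) {
                current = n.state; // state after this side effect
                n.state = null;    // side effects no longer carry a state
            } else if (n.canDeopt) {
                n.state = current; // may be shared with other deopt points
            }
        }
    }

    public static void main(String[] args) {
        List<Node> code = List.of(deoptPoint("g0"), sideEffect("store1", 5),
                deoptPoint("g1"), deoptPoint("g2"),
                sideEffect("store2", 9), deoptPoint("g3"));
        assign(code, new FrameState(0));
        for (Node n : code) {
            System.out.println(n.name + " -> "
                    + (n.state == null ? "no state" : "bci " + n.state.bci));
        }
    }
}
```

In this sketch g1 and g2 end up pointing at the very same FrameState 
object, which mirrors the metadata sharing mentioned above.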

I think your understanding is broadly correct. From the point of view of 
(our implementation of) PEA you can mostly ignore these details since 
PEA is designed to only run on the highest level representation.

Hope this helps,
Gergö