RFC: Partial Escape Analysis in HotSpot C2

Thu Oct 6 17:42:19 UTC 2022

Hi,

We would like to pursue PEA in HotSpot. I have spent some time thinking
about how to adapt Stadler's Partial Escape Analysis[1] to C2. I think it
has 3 elements: 1) flow-sensitive escape analysis, 2) lazy code motion
for the allocation and initialization, and 3) on-the-fly scalar
replacement. The most complex part is 3), and C2 already does it. I'd
like to leverage that, so I came up with the idea of focusing only on
escaped objects in the algorithm and delegating the others to the
existing C2 phases. Here is my RFC. Could you spare some time to review it?

https://gist.github.com/navyxliu/62a510a5c6b0245164569745d758935b#rfc-partial-escape-analysis-in-hotspot-c2

The idea is based on the following two observations.

1. Stadler's PEA can cooperate with C2 EA/SR.

If an object is moved to the place where it is about to escape, it won't
affect C2 EA/SR later, because it will be marked as 'GlobalEscape'. C2 EA
won't do anything for it anyway.

If PEA doesn't touch a non-escaped object, it doesn't change its
escapability. PEA can punt it to C2 EA/SR and the result is still the same.
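
To make observation 1 concrete, here is a minimal source-level sketch of a
partially escaping object. The names (PartialEscapeSketch, Point, sink) are
made up for illustration and are not from the RFC, and afterPea is a
hand-written approximation of the effect, not actual compiler output.

```java
final class PartialEscapeSketch {
    // Hypothetical global: storing a reference here makes the object escape.
    static Object sink;

    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // As written: the allocation runs on every call, but the object
    // escapes only on the rare path.
    static int original(int x, int y, boolean rare) {
        Point p = new Point(x, y);
        int sum = p.x + p.y;
        if (rare) {
            sink = p; // the only escape point
        }
        return sum;
    }

    // Roughly what PEA aims for: the allocation is moved to the escape
    // point, and the common path works on the field values directly.
    static int afterPea(int x, int y, boolean rare) {
        int sum = x + y;
        if (rare) {
            sink = new Point(x, y); // stored to a static, so it escapes globally
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(original(1, 2, false) == afterPea(1, 2, false));
        System.out.println(original(1, 2, true) == afterPea(1, 2, true));
    }
}
```

The allocation that remains in afterPea is stored to a static field, so a
later escape analysis has nothing to gain from it either way.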

2. The original AllocationNode is either dead or scalar replaceable
after Stadler's PEA.

Stadler's algorithm virtualizes an allocation node and materializes it
on demand. There are 2 places where it materializes: 1) where the virtual
object is about to escape, and 2) where MergeProcessor needs to merge an
object and at least one of its predecessors has materialized it.
MergeProcessor then has to materialize the virtual object in all other
predecessors ([1] 5.3, Merge nodes).
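
The second materialization site is easier to see with an example. The
following is a minimal sketch, assuming a hypothetical Box class and a
static sink; afterPea is my hand-written reading of what the merge rule
implies at source level, not output from any compiler.

```java
final class MergeSketch {
    // Hypothetical global: storing a reference here makes the object escape.
    static Object sink;

    static final class Box {
        int v;
        Box(int v) { this.v = v; }
    }

    static int original(int v, boolean escapes) {
        Box b = new Box(v);
        if (escapes) {
            sink = b;  // predecessor A: b escapes and must be materialized
        } else {
            b.v += 1;  // predecessor B: b could stay virtual here
        }
        // Merge: b is materialized in A but virtual in B, and it is
        // still used afterwards.
        return b.v;
    }

    static int afterPea(int v, boolean escapes) {
        Box b;
        if (escapes) {
            b = new Box(v);      // site 1: materialized at the escape point
            sink = b;
        } else {
            b = new Box(v + 1);  // site 2: materialized for the merge,
                                 // using the tracked virtual state
        }
        return b.v;
    }

    public static void main(String[] args) {
        System.out.println(original(5, false) == afterPea(5, false));
        System.out.println(original(5, true) == afterPea(5, true));
    }
}
```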

We can prove observation 2 by contradiction. Assume the original
allocation node is neither dead nor scalar replaced after Stadler's PEA,
and the program is still correct.

Then the program must need the original allocation node somewhere. But
the algorithm deleted the original allocation node in the virtualization
step and never brings it back. This contradicts the assumption that the
program is still correct. QED.

If you're convinced, then we can leverage it. In my design, I don't
virtualize the original node but just leave it in place. The C2
MacroExpand phase will take care of the original allocation node as long
as it's either dead or scalar-replaceable. It never gets a chance to expand.
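
At source level, the design can be pictured roughly as follows. This is
only a sketch of my reading of it: the class, method, and field names are
hypothetical, and the real transformation works on C2's IR rather than on
Java source.

```java
final class ProposedDesignSketch {
    // Hypothetical global: storing a reference here makes the object escape.
    static Object sink;

    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Before: p escapes on the rare path, so the allocation cannot be
    // scalar replaced as written.
    static int original(int x, int y, boolean rare) {
        Point p = new Point(x, y);
        int sum = p.x + p.y;
        if (rare) {
            sink = p;
        }
        return sum;
    }

    // After the proposed PEA (assumed shape): the original allocation is
    // left in place, and a copy is materialized at the escape point.
    static int afterProposedPea(int x, int y, boolean rare) {
        Point p = new Point(x, y);       // original allocation, untouched
        int sum = p.x + p.y;
        if (rare) {
            sink = new Point(p.x, p.y);  // materialized copy escapes instead
        }
        return sum;                      // p itself never escapes now
    }

    public static void main(String[] args) {
        System.out.println(original(1, 2, true) == afterProposedPea(1, 2, true));
        System.out.println(original(1, 2, false) == afterProposedPea(1, 2, false));
    }
}
```

In afterProposedPea, p no longer reaches sink, so the intent is that the
existing C2 EA/SR can classify it as non-escaping and remove it.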

If we leave on-the-fly scalar replacement out of Stadler's PEA, we can
delegate it to C2 EA/SR! There are 3 gains:

1) I don't think I can write bug-free Scalar Replacement...
2) This approach automatically picks up future C2 EA/SR improvements,
such as JDK-8289943.
3) If we focus only on 'escaped objects', we don't even need to deal
with deoptimization. Only 'scalar replaceable' objects need to save
object states for deoptimization. Escaped objects don't qualify for that.

[1]: Stadler, Lukas, Thomas Würthinger, and Hanspeter Mössenböck.
"Partial escape analysis and scalar replacement for Java." Proceedings
of Annual IEEE/ACM International Symposium on Code Generation and
Optimization. 2014.

thanks,
--lx