RFC: Partial Escape Analysis in HotSpot C2

Fri Oct 7 00:46:38 UTC 2022

On 10/6/22 5:09 PM, Liu, Xin wrote:
> Hi Igor,
> 
> You are right. Cloning the JVMState of the original Allocation node isn't
> the correct behavior. I need the JVMState right at the point of
> materialization. I think it is available because we are in the parser.
> There are 2 places of materialization:
> 1) We are handling the bytecode that causes the object to escape,
> probably putfield/return/invoke. The current JVMState is the one to use.
> 2) We are in MergeProcessor and need to materialize a virtual object in
> its predecessors. We can extract the exiting JVMState from the
> predecessor Block.
> 
> I just realized that this may be one of the reasons Graal saves a
> 'FrameState' at store nodes. Graal needs to revisit the 'FrameState'
> when its PEA phase does materialization in the high tier.
> 
> Apart from safepoints, there's one corner case bothering me. The JLS says
> that creation of a class instance may throw an OOME
> (https://docs.oracle.com/javase/specs/jls/se19/html/jls-15.html#jls-15.9.4):
> 
> "
> space is allocated for the new class instance. If there is insufficient
> space to allocate the object, evaluation of the class instance creation
> expression completes abruptly by throwing an OutOfMemoryError.
> "
> 
> This is cross-referenced by the 'new' bytecode in the JVMS:
> https://docs.oracle.com/javase/specs/jvms/se19/html/jvms-6.html#jvms-6.5.new
> 
> If we have moved the Allocation node and the JVM happens to run out of
> memory, the first frame of the stack trace will drift a little, right?
> The bci and source line number will be wrong. Does it matter? I can't
> imagine that user programs rely on this information.

This is not new [1]. The C2 EA implementation already has this OOM stack trace "issue". Graal has it too.
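
As a rough illustration of the drift (a sketch only, not taken from [1] or from the C2 sources): the class below does not trigger a real OutOfMemoryError. It uses a hypothetical helper, currentLine(), to record the source line at the original allocation site and at the point where a moved allocation would actually execute. An OOME thrown by the moved allocation would carry the second position in its first frame.

```java
final class OomLineDriftSketch {
    // Hypothetical helper: returns the source line of the caller. It stands in
    // for the bci-to-line mapping that a stack trace frame would report.
    static int currentLine() {
        return new Throwable().getStackTrace()[1].getLineNumber();
    }

    // Returns {line of the original allocation site, line of the escape point}.
    // The second element is -1 when the object never escapes.
    static int[] sites(boolean escape) {
        int allocLine = currentLine();           // where 'new' appears in the source
        int materializeLine = -1;
        if (escape) {
            materializeLine = currentLine();     // where a moved allocation would run
        }
        return new int[] { allocLine, materializeLine };
    }

    public static void main(String[] args) {
        int[] s = sites(true);
        System.out.println("allocation site line:      " + s[0]);
        System.out.println("materialization site line: " + s[1]);
    }
}
```

The two lines differ whenever the escape point is not the allocation site, which is exactly the case where the allocation gets moved. Line numbers are only reported when the class is compiled with line-number debug information, which javac emits by default.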

Thanks,
Vladimir K

[1] https://bugs.openjdk.org/browse/JDK-8063642

> 
> I think it's possible to amend this bci/line number at the JVMState
> level. I will leave it as an open question and revisit it later.
> 
> Do I understand your concern correctly? If this makes sense to you, I
> will update the RFC doc.
> 
> thanks,
> --lx
> 
> 
> 
> 
> On 10/6/22 3:00 PM, Igor Veresov wrote:
>>
>> Hi,
>>
>> You say that when you materialize the clone you plan to have the same JVM state as the original allocation. How is that possible in the general case? There can be arbitrary changes of state between the original allocation point and where the clone materializes.
>>
>> Igor
>>
>>> On Oct 6, 2022, at 10:42 AM, Liu, Xin <xxinliu at amazon.com> wrote:
>>>
>>> Hi,
>>>
>>> We would like to pursue PEA in HotSpot. I spent time thinking about how
>>> to adapt Stadler's Partial Escape Analysis[1] to C2. I think there are 3
>>> elements in it: 1) flow-sensitive escape analysis, 2) lazy code motion
>>> for the allocation and initialization, 3) on-the-fly scalar replacement.
>>> The most complex part is 3), and C2 already does it. I'd like to leverage
>>> that, so I came up with an idea: focus only on escaped objects in the
>>> algorithm and delegate the others to the existing C2 phases. Here is my
>>> RFC. May I ask for some of your time to review it?
>>>
>>> https://gist.github.com/navyxliu/62a510a5c6b0245164569745d758935b#rfc-partial-escape-analysis-in-hotspot-c2
>>>
>>> The idea is based on the following two observations.
>>>
>>> 1. Stadler's PEA can cooperate with C2 EA/SR.
>>>
>>> If an object moves to the place where it is about to escape, it won't
>>> impact C2 EA/SR later, because it will be marked as 'GlobalEscaped'.
>>> C2 EA won't do anything for it anyway.
>>>
>>> If PEA doesn't touch a non-escaped object, it won't change its
>>> escapability. PEA can punt it to C2 EA/SR and the result is still the same.
>>>
>>>
>>> 2. The original AllocationNode is either dead or scalar replaceable
>>> after Stadler's PEA.
>>>
>>> Stadler's algorithm virtualizes an allocation node and materializes it
>>> on demand. There are 2 places to materialize it: 1) the virtual object
>>> is about to escape; 2) MergeProcessor needs to merge an object that has
>>> been materialized in at least one of its predecessors. MergeProcessor
>>> then has to materialize the virtual object in all other predecessors
>>> ([1] 5.3, Merge nodes).
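
The two materialization points can be pictured at the source level. The following is a hand-written sketch, not output of C2 or Graal; the class, the Point type and the SINK list are hypothetical names chosen for illustration. Each "AfterPea" method shows roughly what the corresponding "Original" method would look like once the allocation has been moved to where it is needed.

```java
import java.util.ArrayList;
import java.util.List;

final class PartialEscapeSketch {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Hypothetical global sink: adding an object here makes it escape.
    static final List<Point> SINK = new ArrayList<>();

    // Case 1, source form: p is allocated on every call but escapes only
    // on the rare path.
    static int escapeOriginal(int x, int y, boolean rare) {
        Point p = new Point(x, y);
        if (rare) {
            SINK.add(p);
        }
        return p.x + p.y;
    }

    // Case 1, after PEA: the allocation is materialized right at the escape
    // point; the common path only uses the scalar field values.
    static int escapeAfterPea(int x, int y, boolean rare) {
        if (rare) {
            SINK.add(new Point(x, y));
        }
        return x + y;
    }

    // Case 2, source form: p escapes on one branch and is needed as an
    // object after the control-flow merge.
    static Point mergeOriginal(int x, boolean early) {
        Point p = new Point(x, x);
        if (early) {
            SINK.add(p);
        }
        return p;
    }

    // Case 2, after PEA: one predecessor of the merge has materialized p,
    // so the other predecessor has to materialize it as well.
    static Point mergeAfterPea(int x, boolean early) {
        Point p;
        if (early) {
            p = new Point(x, x);   // materialized because it escapes here
            SINK.add(p);
        } else {
            p = new Point(x, x);   // materialized only for the merge
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(escapeOriginal(1, 2, false) == escapeAfterPea(1, 2, false));
        System.out.println(mergeAfterPea(5, false).x == mergeOriginal(5, false).x);
    }
}
```

In the first case the non-escaping path no longer allocates at all. In the second case no allocation is saved, but every path reaching the merge sees a real object.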
>>>
>>> We can prove observation 2 by contradiction. Assume the original
>>> Allocation node is neither dead nor scalar replaced after Stadler's
>>> PEA, and the program is still correct.
>>>
>>> The program must then need the original allocation node somewhere. But
>>> the algorithm has deleted the original allocation node in the
>>> virtualization step and never brings it back. That contradicts the
>>> assumption that the program is still correct. QED.
>>>
>>>
>>> If you're convinced, then we can leverage it. In my design, I don't
>>> virtualize the original node but just leave it there. The C2 MacroExpand
>>> phase will take care of the original allocation node as long as it's
>>> either dead or scalar-replaceable. It never gets a chance to expand.
>>>
>>> If we leave on-the-fly scalar replacement out of Stadler's PEA, we can
>>> delegate it to C2 EA/SR! There are 3 gains:
>>>
>>> 1) I don't think I can write bug-free Scalar Replacement...
>>> 2) This approach can automatically pick up C2 EA/SR improvements in the
>>> future, such as JDK-8289943.
>>> 3) If we focus only on 'escaped objects', we don't even need to deal
>>> with deoptimization. Only 'scalar replaceable' objects need to save
>>> object states for deoptimization. Escaped objects do not qualify.
>>>
>>> [1]: Stadler, Lukas, Thomas Würthinger, and Hanspeter Mössenböck.
>>> "Partial escape analysis and scalar replacement for Java." Proceedings
>>> of Annual IEEE/ACM International Symposium on Code Generation and
>>> Optimization. 2014.
>>>
>>> thanks,
>>> --lx