RFC: Partial Escape Analysis in HotSpot C2

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Fri Oct 21 00:26:56 UTC 2022


Hi,

> I would like to give an update on this. I managed to get PEA working in
> Vladimir Ivanov's testcase. I put the testcase, assembly and graphs here[1].
> 
> Even though it is a quite simple case, I think it demonstrates that the
> RFC is practical in C2. I proposed 3 major differences from Graal.

Nice! A very similar, but much more common, case should be escape
sites in catch blocks (as reported in JDK-8267532 [1]).
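
To make the catch-block case concrete, here is a minimal sketch. It is a
hypothetical example, not taken from the bug report or from the testcase;
CatchEscapeExample, Point, FAILED and compute are made-up names. The
object is allocated on every call but escapes only on the rarely taken
exception path:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: the only escape site is inside a catch block.
final class CatchEscapeExample {
    static final List<Point> FAILED = new ArrayList<>();

    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static int compute(int x, int y) {
        Point p = new Point(x, y);
        try {
            // Common path: only the fields of p are read, p does not escape.
            return p.x / p.y;
        } catch (ArithmeticException e) {
            // Rare path: p is published to a static list, so it escapes here.
            FAILED.add(p);
            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(compute(6, 3));
        System.out.println(compute(1, 0));
        System.out.println(FAILED.size());
    }
}
```

Ideally the allocation would be needed only when the handler is actually
entered.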

> 1. The algorithm runs in parser instead of optimizer.
> 2. Prefer clone-and-eliminate strategy rather than
> virtualize-and-materialize.
> 3. Refrain from scalar replacement on-the-fly.

I don't understand how you plan to implement it solely during parsing.
You could do some bookkeeping during parsing and capture JVM state, but 
I don't see how to do EA that early.

Also, please elaborate on #3. It's not clear to me what you mean there.

> The test exercises them all. I pasted 3 graphs here[2]. When we
> materialize an object, we just clone it with the right JVMState. It
> shows that C2 IterEA can automatically pick up the obsolete object and
> get rid of it, as we expected.
> 
> It turns out cloning an object isn't as complex as I thought. I mainly
> spent time on adjusting JVMState for the cloned AllocateNode. Besides
> calling sync_jvms(), I also need to 1) kill dead locals, 2) clean the
> stack, and 3) avoid re-executing that bci.
> 
>    JVMState* jvms = parser->sync_jvms();
>    SafePointNode* map = jvms->map();
>    parser->kill_dead_locals();
>    parser->clean_stack(jvms->sp());
>    jvms->set_should_reexecute(false);
> 
> Clearly, the algorithm isn't complete yet. I am still working on
> MergeProcessor, fields of general classes and loop constructs.

There was a previous discussion on PEA for C2 back in 2021 [2] [3]. One 
interesting observation related to your current experiments was:

"4. Escape sites separate the graph into 2 parts: before and after the
instance escapes. In order to preserve identity invariants (and avoid
identity paradoxes), PEA can't just put an allocation at every escape
site. It should respect the order of escape events and ensure that the
very same object is observed when multiple escape events happen.

Dynamic invariant can be formulated as: there should never be more than
1 allocation at runtime per 1 eliminated allocation.

Considering non-escaping operations can force materialization on their
own, it poses additional constraints."

So, when you clone an allocation, you should ensure that only a single 
instance can be observed. And safepoints can be escape points as well 
(rematerialization in case of deoptimization event).
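
A small sketch of the identity constraint, as I read it. This is a
hypothetical example; IdentityInvariantExample, first, second and run are
made-up names. With two independent escape sites, materializing a fresh
clone at each of them would let the program observe two different objects
where the source allocates one:

```java
// Hypothetical illustration: two escape sites must observe the same object.
final class IdentityInvariantExample {
    static Object first;
    static Object second;

    static void run(boolean a, boolean b) {
        Object o = new Object();
        if (a) {
            first = o;   // escape site 1
        }
        if (b) {
            second = o;  // escape site 2: must see the object from site 1, if any
        }
    }

    public static void main(String[] args) {
        run(true, true);
        // Must print true regardless of how the compiler moved the allocation.
        System.out.println(first == second);
    }
}
```

When both branches are taken, `first == second` has to hold, so the second
site has to reuse whatever the first one materialized.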

> I haven't figured out how to test PEA in a reliable way. It is not easy
> for IR framework to capture node movement. If we measure allocation
> rate, it will be subject to CPU capability and also the sampling rate. I
> came up with an idea, a so-called 'Epsilon-Test'. We create a JVM with
> EpsilonGC and a fixed Java heap. Because EpsilonGC never replenishes the
> Java heap, we can count how many iterations a test can run before OOME.
> The fewer allocations a method makes, the more iterations HotSpot can
> execute the method. This isn't perfect either. I found that HotSpot
> can't guarantee to execute the finally block in this case[3]. So far, I
> just measure execution time instead.

It sounds more like a job for benchmarks, but focused on measuring 
allocation rate (per iteration). ("-prof gc" mode in JMH terms.)
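
Outside of JMH, a rough per-iteration figure can also be read from
com.sun.management.ThreadMXBean. The following is only a sketch of that
alternative, not what "-prof gc" does internally. It assumes a JVM where
thread allocation accounting is supported and enabled (it returns -1
otherwise); AllocationProbe, allocatedBytesPerCall and SINK are
hypothetical names:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: average bytes allocated by the current thread per call.
final class AllocationProbe {
    // Volatile sink so that the allocation in main() stays observable.
    static volatile Object SINK;

    static long allocatedBytesPerCall(Runnable body, int iterations) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!(bean instanceof com.sun.management.ThreadMXBean)) {
            return -1; // extension interface not available on this JVM
        }
        com.sun.management.ThreadMXBean ext = (com.sun.management.ThreadMXBean) bean;
        if (!ext.isThreadAllocatedMemorySupported() || !ext.isThreadAllocatedMemoryEnabled()) {
            return -1; // allocation accounting not available
        }
        long tid = Thread.currentThread().getId();
        long before = ext.getThreadAllocatedBytes(tid);
        for (int i = 0; i < iterations; i++) {
            body.run();
        }
        long after = ext.getThreadAllocatedBytes(tid);
        return (after - before) / iterations;
    }

    public static void main(String[] args) {
        long perCall = allocatedBytesPerCall(() -> SINK = new byte[1024], 10_000);
        System.out.println("bytes allocated per call: " + perCall);
    }
}
```

The counter is per thread and does not depend on a sampling rate, but the
measured body still needs warmup before the numbers reflect C2-compiled
code.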

Personally, I very much liked the IR framework-based approach Cesar used 
in the unit test for allocation merges [4]. Do you see any problems with 
that?

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.org/browse/JDK-8267532
[2] https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2021-May/047486.html
[3] https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2021-May/047536.html
[4] https://github.com/openjdk/jdk/pull/9073


> 
> I would appreciate your feedback, especially if you spot any red flags.
> 
> [1] https://gist.github.com/navyxliu/9c325d5c445899c02a0d115c6ca90a79
> 
> [2]
> https://gist.github.com/navyxliu/9c325d5c445899c02a0d115c6ca90a79?permalink_comment_id=4341838#gistcomment-4341838
> 
> [3] https://gist.github.com/navyxliu/9c325d5c445899c02a0d115c6ca90a79#file-example1-java-L43
> 
> thanks,
> --lx
> 
> 
> 
> 
> On 10/12/22 11:17 AM, Vladimir Kozlov wrote:
>>
>> On 10/12/22 7:58 AM, Liu, Xin wrote:
>>> hi, Vladimir,
>>>> You should show that your implementation can rematirealize an object
>>> at any escape site.
>>>
>>> My understanding is that I am supposed to 'materialize' an object at any escape site.
>>
>> Words ;^)
>>
>> Yes, I mistyped and misspelled.
>>
>> Vladimir K
>>
>>>
>>> 'rematerialize' refers to 'creating a scalar-replaced object on the heap'
>>> during deoptimization. It's for the interpreter, as if the object was created in the
>>> first place. It doesn't apply to an escaped object because it's marked
>>> 'GlobalEscaped' in C2 EA.
>>>
>>>
>>> Okay. I will try this idea!
>>>
>>> thanks,
>>> --lx
>>>
>>>
>>>
>>>
>>> On 10/11/22 3:12 PM, Vladimir Kozlov wrote:
>>>> Also in your test there should be no merge at safepoint2 because `obj` is "not alive" (not referenced) anymore.

