Re: 回复:[Patch] 8210853: C2 doesn't skip post barrier for new allocated objects

Kuai Wei kuaiwei.kw at alibaba-inc.com
Fri Oct 19 07:56:11 UTC 2018


Hi Vladimir,

  Thanks for testing the change. It's good to me to move input check from assert to condition.
I also tested with specjbb2015. The result is
                                                        enable         disable 
jbb2015.result.metric.max-jOPS           73234          71531      2.38%
jbb2015.result.metric.critical-jOPS        31256          30293      3.18%
jbb2015.result.SLA-10000-jOPS           16605          14902      11.43%
jbb2015.result.SLA-25000-jOPS            21715          23503      -7.61%
jbb2015.result.SLA-50000-jOPS            38076          34630       9.95%
jbb2015.result.SLA-75000-jOPS            43004          43145      -0.33%
jbb2015.result.SLA-100000-jOPS          50525          48751       3.64%

 The jvm option I used is "-XX:+UseG1GC -XX:+UnlockExperimentalVMOptions -Xmx100g -Xms100g -XX:MaxGCPauseMillis=300 -XX:G1NewSizePercent=8 -XX:G1MaxNewSizePercent=50"
It can improve overall performance. And there's one test suite with performance degradation. I'm not sure why it happen.

Thanks,
Kevin


------------------------------------------------------------------
From:Vladimir Kozlov <vladimir.kozlov at oracle.com>
Send Time:2018年10月17日(星期三) 05:16
To:蒯微(麦庶) <kuaiwei.kw at alibaba-inc.com>; Tobias Hartmann <tobias.hartmann at oracle.com>; hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
Cc:李三红(三红) <sanhong.lsh at alibaba-inc.com>
Subject:Re: 回复:[Patch] 8210853: C2 doesn't skip post barrier for new allocated objects

On 10/16/18 4:56 AM, Kuai Wei wrote:
> Hi Vladimir,
> 
>    About the copy region, I made this fix to check if region node is copy. In my test case, the 
> region is degraded to copy and
> post write barrier is removed. I haven't checked the detail of the transform. I think there's an 
> ideal phase between inlining and
> parsing store oop. So the phi node is removed and region node is degraded to copy.
> 
>    We tested the fix in several online trade systems. We can check cpu usage difference between 
> machines with same load. I will
> test with some benchmark tests and give the score.

Sounds good. Lets see if it helps in your production system.
I will submit our internal testing with your latest fix with small change. I moved req() == 2 check 
from assert to condition to narrow cases (copy Region may have > 2 inputs with only 1 not-null input):

http://cr.openjdk.java.net/~kvn/8210853/webrev.00/

Vladimir

> 
> Thanks,
> Kevin
> 
>     ------------------------------------------------------------------
>     From:Vladimir Kozlov <vladimir.kozlov at oracle.com>
>     Send Time:2018年10月16日(星期二) 08:45
>     To:蒯微(麦庶) <kuaiwei.kw at alibaba-inc.com>; Tobias Hartmann <tobias.hartmann at oracle.com>;
>     hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
>     Cc:李三红(三红) <sanhong.lsh at alibaba-inc.com>
>     Subject:Re: 回复:[Patch] 8210853: C2 doesn't skip post barrier for new allocated objects
> 
>     Hi Kevin,
> 
>     On 10/4/18 8:18 AM, Kuai Wei wrote:
>      > Hi Vladimir,
>      >
>      >    I'm not sure about the phi node here. Is it used for merging allocation in fast path and slow path?
>      > In parsing phase, the allocation node is not expanded, so there's no this phi node. If my understand
>      > is wrong, please correct me. So far as I know, the region node is created while inlining initialize
>      > method
>      > of super type. And phi node is not necessary. But the region node is always created.
> 
>     It is not related to slow and fast path of allocation. It is general rule to have Phi nodes when new
> 
>     Region node is created. Like here:
> 
>     http://hg.openjdk.java.net/jdk/jdk/file/04d4f1e4aff2/src/hotspot/share/opto/parse1.cpp#l772
> 
>      >
>      >    I understand your concern. But write barrier is added in parse phase, we must check it in parse time
>      > to do this optimization. If the region node is a copy, could we assume it will not have additional
>      > input?
> 
>     Yes, I think it is true. Copy Region never goes back to normal (merge) Region.
> 
>      > I checked region node, a copy region node is created when only one input edge and compiler can not
>      > reshape, the region node will be degraded to a copy. I think it will be dead in future phase.
> 
>     Yes. But does your case have a 'copy' Region? I originally thought that you have normal Region with
>     just one input.
> 
>      >
>      >    In our trade system, this optimization could improve 5%+ performance. It's a big improvement and we
>      > don't want to lose it. If there's better solution, we are pleasure to implement it.
> 
>     Yes, I agree that we should fix it.
> 
>     What testing did you perform?
> 
>     Thanks,
>     Vladimir
> 
>      >
>      > Thanks,
>      > Kevin
>      >
>      >     ------------------------------------------------------------------
>      >     From:Vladimir Kozlov <vladimir.kozlov at oracle.com>
>      >     Send Time:2018年10月4日(星期四) 06:50
>      >     To:蒯微(麦庶) <kuaiwei.kw at alibaba-inc.com>; Tobias Hartmann <tobias.hartmann at oracle.com>;
>      >     hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
>      >     Cc:李三红(三红) <sanhong.lsh at alibaba-inc.com>
>      >     Subject:Re: 回复:[Patch] 8210853: C2 doesn't skip post barrier for new allocated objects
>      >
>      >     Sorry, I mean
>      >
>      >     "should exist Phi *associated* with allocation"
>      >
>      >     On 10/3/18 3:48 PM, Vladimir Kozlov wrote:
>      >      > Hi Kevin,
>      >      >
>      >      > You correctly pointed in your analysis that if base of address is pointing to allocation there
>      >      > should not be other control path to this store. But it also means that before there should exist Phi
>      >      > not associated with allocation when Region node was added. If Phi node was removed at some time
>      >      > Region node should be removed too (or not add it in first place). Please, look on that.
>      >      >
>      >      > I share Tobias's concern about skipping Region node in Parse phase when IR graph is still constructed.
>      >      >
>      >      > Thanks,
>      >      > Vladimir
>      >      >
>      >      > On 10/3/18 8:08 AM, Kuai Wei wrote:
>      >      >> Hi Tobias,
>      >      >>
>      >      >>   I made the change to check with RegionNode::is_copy, could you check the new patch?
>      >      >>
>      >      >> diff --git a/src/hotspot/share/opto/graphKit.cpp b/src/hotspot/share/opto/graphKit.cpp
>      >      >> index 068141f..884a76c 100644
>      >      >> --- a/src/hotspot/share/opto/graphKit.cpp
>      >      >> +++ b/src/hotspot/share/opto/graphKit.cpp
>      >      >> @@ -2126,7 +2126,15 @@ void GraphKit::uncommon_trap(int trap_request,
>      >      >>   // We use this to determine if an object is so "fresh" that
>      >      >>   // it does not require card marks.
>      >      >>   Node* GraphKit::just_allocated_object(Node* current_control) {
>      >      >> -  if (C->recent_alloc_ctl() == current_control)
>      >      >> +  Node * ctrl = current_control;
>      >      >> +  // Object::<init> is invoked after allocation, most of invoke nodes
>      >      >> +  // will be reduced, but a region node is kept in parse time, we check
>      >      >> +  // the pattern and skip the region node
>      >      >> +  if (ctrl != NULL && ctrl->is_Region() && ctrl->as_Region()->is_copy()) {
>      >      >> +    assert(ctrl->req() == 2, "copy region has only 2 inputs");
>      >      >> +    ctrl = ctrl->as_Region()->is_copy();
>      >      >> +  }
>      >      >> +  if (C->recent_alloc_ctl() == ctrl)
>      >      >>       return C->recent_alloc_obj();
>      >      >>     return NULL;
>      >      >>   }
>      >      >> Thanks,
>      >      >> Kevin
>      >      >>
>      >      >>     ------------------------------------------------------------------
>      >      >>     From:蒯微(麦庶) <kuaiwei.kw at alibaba-inc.com>
>      >      >>     Send Time:2018年9月25日(星期二) 21:50
>      >      >>     To:Tobias Hartmann <tobias.hartmann at oracle.com>; hotspot compiler
>      >      >>     <hotspot-compiler-dev at openjdk.java.net>
>      >      >>     Cc:李三红(三红) <sanhong.lsh at alibaba-inc.com>
>      >      >>     Subject:回复:回复:
>     [Patch] 8210853: C2 doesn't skip post barrier for new allocated objects
>      >      >>
>      >      >>     Hi Tobias,
>      >      >>
>      >      >>        Thanks for your comments. I will check RegionNode::is_copy to see if it can be used to detect
>      >      >>     unnecessary region node. I will send new review after testing.
>      >      >>
>      >      >>     Best Regards,
>      >      >>     Kevin
>      >      >>
>      >      >>     ------------------------------------------------------------------
>      >      >>     发件人:Tobias Hartmann <tobias.hartmann at oracle.com>
>      >      >>     发送时间:2018年9月24日(星期一) 21:34
>      >      >>     收件人:蒯微(麦庶) <kuaiwei.kw at alibaba-inc.com>; hotspot compiler
>      >      >>     <hotspot-compiler-dev at openjdk.java.net>
>      >      >>     抄 送:李三红(三红) <sanhong.lsh at alibaba-inc.com>
>      >      >>     主 题:Re: 回复:
>     [Patch] 8210853: C2 doesn't skip post barrier for new allocated objects
>      >      >>
>      >      >>     Hi Kevin,
>      >      >>
>      >      >>     On 24.09.2018 08:06, Kuai Wei wrote:
>      >      >>
>      >      >> >   Thanks for your suggestion. I think your point is the region node may have new path in later parse
>      >      >>
>      >      >>      > phase, so we can not make sure the region node will be optimized.
>      >      >>
>      >      >>
>      >      >> Yes, my point is that a new path to the region might be added after your optimization and that path
>      >      >>     might contain stores to the newly allocated object.
>      >      >>
>      >      >>      >   It's a good question and I checked it. Now I think it may not cause trouble. In post barrier
>      >      >>      > reduce, the oop store use allocation node as base pointer. The data graph guarantee control of
>      >      >>
>      >      >> > allocation node should dominate control of store. If allocation node is in pred of region node and
>      >      >>
>      >      >> > there's a new path into region, the graph is bad because we can reach store without allocation.
>      >      >>
>      >      >>     Yes but the new path might be a backedge from a loop that is dominated by the allocation.
>      >      >>
>      >      >>
>      >      >> > If allocation node is in a domination ancestor, the graph shape is a little complicated, so we can not
>      >      >>
>      >      >>      > reach control of allocation by skipping one region.
>      >      >>
>      >      >>
>      >      >> Right, that's basically the implicit assumption of your patch. I'm not sure if it always holds. But
>      >      >>     I think you should at least use RegionNode::is_copy().
>      >      >>
>      >      >>     Let's see what other reviewers think.
>      >      >>
>      >      >>
>      >      >> >   The better solution is we can know the region node is created in exit_map and we will not change
>      >      >>      > it in later. Is there any way to know it in compile time?
>      >      >>
>      >      >>
>      >      >> The region node is created in Parse::build_exits(). I don't think there is a way to keep track of this.
>      >      >>
>      >      >>
>      >      >>     Thanks,
>      >      >>     Tobias
>      >      >>
>      > 
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20181019/a7ab6fea/attachment-0001.html>


More information about the hotspot-compiler-dev mailing list