RFR(M): 7197327: 40% regression on 8 b41 comp 8 b40 on specjvm2008.mpegaudio on oob

Tue Jan 29 15:54:18 PST 2013

I forgot to add that we need a new diagnostic C2 flag for these changes 
(check it in add_expensive_node()) to be able switch it off.

Thanks,
Vladimir

On 1/29/13 3:45 PM, Vladimir Kozlov wrote:
> Thank you, Roland
>
> This looks good.
>
> On 1/29/13 7:48 AM, Roland Westrelin wrote:
>> Here is a new webrev:
>>
>> http://cr.openjdk.java.net/~roland/7197327/webrev.01/
>
> compile.cpp:
>
> Could you use err_msg_res() to print req() and name of nodes in next
> assert?:
>
> +  assert(n1->req() == n2->req(), "can't compare nodes?");
>
> You have several places where you do next check, can you factor it into
> a method?:
>
> (n->outcnt() != 0 && igvn.type(n) != Type::TOP && !n->in(0)->is_top())
>
> Why in _expensive_nodes loops you start from i=1 and use (i-1) index
> instead of i=0 and (i+1)?
>
> Why cleanup_expensive_nodes() is not using cmp_expensive_nodes()? It
> will reduce time spent in should_optimize_expensive_nodes() later. May
> be it should call should_optimize_expensive_nodes() and if it returns
> 'true' do cleanup using cmp_expensive_nodes(). And if it returns 'false'
> emptying whole list (and set NULL control).
>
> +  // control input of nodes for which there's a single copy.
>                                                ^ only
>
> We put declarations on separate lines:
> +  int j = 0, identical = 0, i = 1;
>
> cleanup_expensive_nodes() and should_optimize_expensive_nodes() should
> check _expensive_nodes->length() at the beginning.
>
> Add assert to add_expensive_node() to make sure it is not memory or cfg
> nodes.
>
> loopnode.cpp:
>
> Could you fix codding style (spaces in if() and for()) in
> PhaseIdealLoop::get_early_ctrl) since you touch it?
>
> We need more info if graph is broken. Could you dump_bad_graph() here:
> +  assert(is_dominator(ctl, earliest) || is_dominator(earliest, ctl),
> "broken graph?");
>
> Initialize next=ctrl otherwise some compilers may complain.
> Add checks (ctl->in(1) != NULL && ctl->in(1)->in(0) != NULL) for dead
> loop when ctl->is_Loop(). Also check that loop is not irreducible,
> otherwise you can't do this optimization. May be do it only for
> is_CountedLoop() or when there are not irreducible loops in code.
>
> Assign to local var ctl->in(0) and check for NULL.
>
> Add comment to the next code (what nodes you expect here? Can you add
> assert() there to verify that this path is executed only for expected
> nodes) and fix code style:
>
> +        int nb_ctl_proj = 0;
> +        for( DUIterator_Fast imax, i = ctl->in(0)->fast_outs(imax); i <
> imax; i++ ) {
> +          Node *p = ctl->in(0)->fast_out(i);
> +          if( p->is_Proj() && p->is_CFG() ) {
> +            nb_ctl_proj++;
> +            if (nb_ctl_proj > 1) {
> +              break;
>
>
> process_expensive_nodes():
>
> I think for the case (parent_c1 == parent_c2) you should also push n1,
> n2 on _worklist otherwise they may not be processed by IGVN.
>
>>
>>> What is effect on ref workload?
>>
>> I haven't done a refworkload on this latest version but on a previous
>> one, refworkload shown nothing of interests except the improvement on
>> specjvm2008's mpegaudio. I'll redo refworkload before the push.
>
> OK.
>
>> The 2 tests below cannot work because they don't guarantee that c2 is
>> in the other If branch or that c2 is not on the other branch but
>> inside an imbricated If:
>
> You are right. Your code is correct but I would simplify it:
>
> if (parent_c1->is_Proj() && parent_c1->in(0)->is_If() &&
>      parent_c2->is_Proj() && parent_c1->in(0) == parent_c2->in(0)) {
>
> Thanks,
> Vladimir
>
>>
>>> } else if (c1->is_Proj() && c1->in(0)->is_If() &&
>>>             is_dominator(c1->in(0), c2)) {
>>>    c1 = c2 = c1->in(0);
>>> } else if (c2->is_Proj() && c2->in(0)->is_If() &&
>>>             is_dominator(c2->in(0), c1)) {
>>>    c1 = c2 = c2->in(0);
>>> }
>>> if (n1->in(0) != c1) {
>>>    n1->set_req(0, c1);
>>> }
>>> if (n2->in(0) != c2) {
>>>    n2->set_req(0, c2);
>>> }
>>
>> Roland.
>>