RFR(M): 7197327: 40% regression on 8 b41 comp 8 b40 on specjvm2008.mpegaudio on oob

Vladimir Kozlov vladimir.kozlov at oracle.com
Tue Jan 22 02:54:54 PST 2013


On 1/22/13 12:39 AM, Roland Westrelin wrote:
> Hi Vladimir,
>
> Thanks for reviewing this.
>
>> What is effect on ref workload?
>
> I haven't tried yet.
>
>> I would also use it for trigonometric and math (sqrt, log) intrinsics (as separate RFE after checking performance effect).
>
> Ok. I will file an RFE.
>
>> I like you factoring of sorting code in sources.
>
> It doesn't look like you like how it's used in the new code so I guess I'll file an RFE for the clean-up if it's no longer related to this change.
>
>> Why ciConstantPoolCache::_keys is array of intptr_t and not int type elements? We'r wasting space in 64bit VM and do conversion each time we access elements.
>
> I'll file an RFE for this one as well.
>
>> In growableArray.hpp call f(value, key) only once.
>>
>> In loopnode.cpp you miss 'return;' in second "if (stop_early)".
>>
>> It is very expensive to do build_loop_* when stop_early is true. Could you check before that if you have Expensive nodes with the same data inputs instead of just number of such nodes? Your list sorting should be cheaper. May be instead of flat list you should create list of pairs which have same data inputs, it will simplify processing code.
>
> I don't understand what pairs you are referring to. There can be more than 2 identical expensive nodes in a method, right?

All combinations: 1-2,2-3,1-3. Will need additional check in 
process_expensive_nodes() to do nothing if c1==c2 (got it from 
processing previous pairs).

>
>> Anyway, I think new process_expensive_nodes() method is too complex for cases which are very very rare. I think you should narrow cases for this optimization:
>
> Are you suggesting I should drop all the process_expensive_nodes() and replace it by the code below that would be applied 2 by 2 to pair of nodes with same data inputs?
>
>> c1 = get_control(n1);
>> c2 = get_control(n2);
>>
>> if (is_dominator(c1, c2)) {
>>    c2 = c1;
>> } else if (is_dominator(c2, c1)) {
>>    c1 = c2;
>> } else if (c1->is_Proj() && c1->in(0)->is_If() &&
>>             is_dominator(c1->in(0), c2)) {
>>    c1 = c2 = c1->in(0);
>> } else if (c2->is_Proj() && c2->in(0)->is_If() &&
>>             is_dominator(c2->in(0), c1)) {
>>    c1 = c2 = c2->in(0);
>> }
>> if (n1->in(0) != c1) {
>>    n1->set_req(0, c1);
>> }
>> if (n2->in(0) != c2) {
>>    n2->set_req(0, c2);
>> }
>
>
> My code has 3 steps:
>
> 1- Move the node up the dominator tree along paths with same frequency

In my second mail I suggested to do it in build_loop_early() method 
where you have early_ctrl for data inputs.

> 2- Handle If with same computation in both branches
> 3- Check for computation that dominates another computation
>
> 1- and 2- is done in a loop until no more progress.
>
> You code above does 2- and 3-. Are you suggesting 1- can be dropped? Or that there's no need for 1- and 2- as separate steps in a loop?
>
> I wrote the code in process_expensive_nodes() so that it optimizes this correctly:

Which should collapse to calls only before loop. Right?

Thanks,
Vladimir

>
>      static double m1(double a, double b) {
>          double r1 = 0, r2 = 0, r3 = 0;
>
>          for(int i = 0; i < 10000; i++) {
>              if (a > 0) {
>                  if (b > 0) {
>                      if (b > a) {
>                          r3 = Math.pow(a,b);
>                          r2 = Math.exp(r3);
>                      } else {
>                          r3 = Math.pow(a,b);
>                          if (b > a) {
>                              m2();
>                          }
>                          r2 = Math.exp(r3);
>                      }
>                  } else {
>                      r3 = Math.pow(a,b);
>                      r2 = Math.exp(r3);
>                  }
>              } else {
>                  r3 = Math.pow(a, b);
>                  r2 = Math.exp(r3);
>              }
>          }
>          r3 = Math.pow(a, b);
>          r2 = Math.exp(r3);
>          return r1 + r2 + r3;
>      }
>
>
> Roland.
>


More information about the hotspot-compiler-dev mailing list