(Resend with description) Re: Request for review (XS): 6837146 Should perform unswitch before maximally unroll in loop transformation

Tom Rodriguez Thomas.Rodriguez at Sun.COM
Tue May 5 11:49:18 PDT 2009


I think your test case is going to be dominated by allocation since  
the work is so small.  Don't reallocate the array each time and make  
the unknown into a static field.  Maybe this:

class Test1 {

   static int unknown;

   public static void init(int src[], int dst[]) {
       // initialize the arrays
       for (int i =0; i<4; i++) {
            dst[i] = i;
           if(unknown >= 5)
            src[i] =  i;                  }
   }

   static int[] src = new int[4];
   static int[] dst = new int[4];

   public static void test() {
       init(src, dst);
   }

   public static void main(String[] args) {
       long start = System.currentTimeMillis();
       for (int i=0; i< 2000000; i++) {
           test();
       }
       long end = System.currentTimeMillis();
       System.out.println(end-start);
   }
}

The opto for that is kind of interesting because we end up unswitching  
and unrolling the 0 to 20000000 loop and produce code like this:

0e0   B10: #    B10 B11 &lt;- B9 B10    Loop: B10-B10 inner stride:  
not constant main of N249 Freq: 999988
0e0 +   STW    #0,[R_L4 + #12]
0e4 +   STW    R_L1,[R_L4 + #16]
0e8 +   STW    R_L3,[R_L4 + #20]
0ec +   STW    R_G4,[R_L4 + #24]
0f0 +   STW    R_L1,[R_L4 + #16]
0f4 +   STW    R_L3,[R_L4 + #20]
0f8 +   STW    R_G4,[R_L4 + #24]
....

Surprisingly we don't eliminate the redundant stores though I believe  
we have code does that.  The other unswitched loop is more interesting  
with a similar problem:

300   B32: #    B32 B33 &lt;- B31 B32   Loop: B32-B32 inner stride:  
not constant main of N220 Freq: 0.476827
300 +   STW    #0,[R_L4 + #12]
304 +   STW    #0,[R_L7 + #12]
308 +   STW    R_L1,[R_L4 + #16]
30c +   STW    R_L1,[R_L7 + #16]
310 +   STW    R_L3,[R_L4 + #20]
314 +   STW    R_L3,[R_L7 + #20]
318 +   STW    R_G4,[R_L4 + #24]
31c +   STW    R_G4,[R_L7 + #24]
320 +   STW    #0,[R_L4 + #12]
324 +   STW    #0,[R_L7 + #12]
328 +   STW    R_L1,[R_L4 + #16]
32c +   STW    R_L1,[R_L7 + #16]
330 +   STW    R_L3,[R_L4 + #20]
334 +   STW    R_L3,[R_L7 + #20]
338 +   STW    R_G4,[R_L4 + #24]
33c +   STW    R_G4,[R_L7 + #24]

We can't know that L4 and L7 point to different objects but since  
we're storing the same value we could in fact eliminate the later  
stores.  Neither of these pieces of code are that representative I  
think but they it's surprising we do badly on them.

I think the version of you test below might give you measurable results.

class Test1 {

   static int unknown;

   public static void init(int src[], int dst[]) {
       // initialize the arrays
       for (int i =0; i<4; i++) {
            dst[i] = i;
           if(unknown >= 5)
            src[i] =  i;                  }
   }

   static int[] src = new int[4];
   static int[] dst = new int[4];

   public static void test() {
       init(src, dst);
   }

   public static void main(String[] args) {
       long start = System.currentTimeMillis();
       for (int i=0; i< 2000000; i++) {
           unknown = i % 10;
           test();
       }
       long end = System.currentTimeMillis();
       System.out.println(end-start);
   }
}

tom

On May 5, 2009, at 11:23 AM, Changpeng Fang wrote:

>
> On 05/04/09 16:34, Tom Rodriguez wrote:
>> This seems ok.  Was there an example that motivated this change?
>>
> For the following Test1.java with -XX:CompileOnly=Test1.init, you  
> should see
> the loop is not unswitched. However, I could not observe perform  
> difference
> with my proposed change even though the loop is unswitched after the  
> change.
>
> Thanks,
>
> Changpeng
>
> class Test1 {
>
>   public static void init(int src[], int dst[], int unknown) {
>       // initialize the arrays
>       for (int i =0; i<4; i++) {
>            dst[i] = i;
>           if(unknown >= 5)
>            src[i] =  i;                  }
>   }
>
>   public static void test() {
>       int[] src = new int[4];
>       int[] dst = new int[4];
>
>       init(src, dst, 5);
>   }
>
>   public static void main(String[] args) {
>       long start = System.currentTimeMillis();
>       for (int i=0; i< 2000000; i++) {
>           test();
>       }
>       long end = System.currentTimeMillis();
>       System.out.println(end-start);
>   }
> }
>
>
>
>
>
>
>
>
>> On May 4, 2009, at 3:49 PM, Changpeng Fang wrote:
>>
>>> http://cr.openjdk.java.net/~cfang/6837146/webrev.00/
>>>
>>> Problem Summary:
>>> The concern is the ordering of maximally unroll and unswitch of a  
>>> loop in loop transformation.
>>> Current implementation is that maximally unroll is performed  
>>> before loop unswitch. The problem
>>> is if a loop is maximally unrolled (fully unrolled), it will never  
>>> be unswitched. This will leave
>>> many conditional statements (blocks) in the code which should not  
>>> appear if the loop can be
>>> unswitched.
>>>
>>> Proposed Solution:
>>> Change the ordering of unswitch and maximally_unroll in loop  
>>> transformation. After unswitch,
>>> the loop should still be able to be maximally unrolled.
>>>
>>> Thanks,
>>>
>>> Changpeng
>>>
>>>
>>>
>>> On 05/04/09 15:05, Changpeng Fang wrote:
>>>> http://cr.openjdk.java.net/~cfang/6837146/webrev.00/
>>>>
>>>> Summary: Change the ordering of do_unswitch and  
>>>> do_maximally_unroll in loop transformation.
>>>>
>>>> Thanks,
>>>>
>>>> Changpeng
>>>>
>>>
>>
>




More information about the hotspot-compiler-dev mailing list