RFR: 8302652: [SuperWord] Reduction should happen after loop, when possible [v5]

Emanuel Peter epeter at openjdk.org
Fri May 12 06:52:49 UTC 2023


On Fri, 12 May 2023 06:45:24 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> src/hotspot/share/opto/loopopts.cpp line 4210:
>> 
>>> 4208:           if (use != phi && ctrl_or_self(use) == cl) {
>>> 4209:             DEBUG_ONLY( current->dump(-1); )
>>> 4210:             assert(false, "reduction has use inside loop");
>> 
>> I have been wondering about this. It is right to bail out of the optimization here, but why do we assert? It is perfectly legal (if not very meaningful) to have a scalar use of the last unordered reduction within the loop. This will still auto-vectorize, as the reduction is to a scalar. For example, a slight modification of SumRed_Int.java still auto-vectorizes and has a use of the last unordered reduction within the loop:
>>     public static int sumReductionImplement(
>>             int[] a,
>>             int[] b,
>>             int[] c,
>>             int total) {
>>         int sum = 0;
>>         for (int i = 0; i < a.length; i++) {
>>             total += (a[i] * b[i]) + (a[i] * c[i]) + (b[i] * c[i]);
>>             sum = total + i;
>>         }
>>         return total + sum;
>>     }
>> Do you think this is a valid concern?
>
> I agree, the assert is not strictly necessary, but I'd rather keep one extra assert in there and find out which cases I missed when the fuzzer eventually hits one. If preferred, I can also just remove the assert.
> 
> I wrote this `Test.java`:
> 
> class Test {
>     static final int RANGE = 1024;
>     static final int ITER  = 10_000;
> 
>     static void init(int[] data) {
>         for (int i = 0; i < RANGE; i++) {
>             data[i] = i + 1;
>         }
>     }
> 
>     static int test(int[] data, int sum) {
>         int x = 0;
>         for (int i = 0; i < RANGE; i++) {
>             sum += 11 * data[i];
>             x = sum & i; // what happens with this AndI ?
>         }
>         return sum + x;
>     }
> 
>     public static void main(String[] args) {
>         int[] data = new int[RANGE];
>         init(data);
>         for (int i = 0; i < ITER; i++) {
>             test(data, i);
>         }
>     }
> }
> 
> And ran it like this, with my patch:
> 
> ./java -Xbatch -XX:CompileCommand=compileonly,Test::test  -XX:+TraceNewVectors -XX:+TraceSuperWord Test.java
> 
> 
> Everything vectorized as usual. But what happens with the `AndI`? It is actually moved out of the loop. Its left input is the `AddReductionVI`, and its right input is `(Phi #tripcount) + 63` (so the last `i` is also already computed outside the loop).
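
To illustrate at the source level why it is legal for the `AndI` to end up after the loop: `x` is overwritten in every iteration, so only the value from the last iteration is observable. The following is a hand-written sketch of that equivalence, not what C2 emits. The class name `SinkLastUse` and the method names `inLoop` / `afterLoop` are hypothetical and only serve the illustration.

```java
class SinkLastUse {
    static final int RANGE = 1024;

    // Same shape as Test::test above: x is overwritten in every iteration.
    static int inLoop(int[] data, int sum) {
        int x = 0;
        for (int i = 0; i < RANGE; i++) {
            sum += 11 * data[i];
            x = sum & i;
        }
        return sum + x;
    }

    // Hand-written equivalent (assumes RANGE > 0): only the last x is live
    // after the loop, so it can be computed once from the final sum and the
    // last value of i, which is RANGE - 1.
    static int afterLoop(int[] data, int sum) {
        for (int i = 0; i < RANGE; i++) {
            sum += 11 * data[i];
        }
        int x = sum & (RANGE - 1);
        return sum + x;
    }

    public static void main(String[] args) {
        int[] data = new int[RANGE];
        for (int i = 0; i < RANGE; i++) {
            data[i] = i + 1;
        }
        for (int s = 0; s < 100; s++) {
            if (inLoop(data, s) != afterLoop(data, s)) {
                throw new AssertionError("mismatch for sum=" + s);
            }
        }
        System.out.println("results match");
    }
}
```

With the use outside the loop, the loop body contains only the reduction itself, which matches the trace above where the reduction still vectorizes.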

Note: If I have uses of the reduction in each iteration, then we already refuse to vectorize the reduction, as in this case:

    static int test(int[] data, int sum) {
        int x = 0;
        for (int i = 0; i < RANGE; i++) {
            sum += 11 * data[i];
            x += sum & i;  // vector use of sum in every iteration prevents vectorizing sum's reduction -> whole chain not vectorized
        }
        return sum + x;
    }
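
One way to build intuition for this second case is to emulate a vectorized reduction in plain Java. This is only a sketch under assumptions: the lane count `LANES = 4` and the names `LanePartialSums`, `scalarSum` and `laneSum` are hypothetical, and the code does not reflect how SuperWord actually represents the reduction. It shows that per-lane partial sums give the same final result as the scalar loop, while the running scalar `sum` that `x += sum & i` would need is not available inside the loop.

```java
class LanePartialSums {
    static final int RANGE = 1024;
    static final int LANES = 4; // hypothetical vector width, divides RANGE

    // Scalar reference: sum holds the exact running total in every iteration.
    static int scalarSum(int[] data, int sum) {
        for (int i = 0; i < RANGE; i++) {
            sum += 11 * data[i];
        }
        return sum;
    }

    // Emulated vector reduction: each lane accumulates its own partial sum,
    // and the lanes are combined only once, after the loop.
    static int laneSum(int[] data, int sum) {
        int[] acc = new int[LANES];
        for (int i = 0; i < RANGE; i += LANES) {
            for (int l = 0; l < LANES; l++) {
                acc[l] += 11 * data[i + l];
            }
            // No variable here holds the scalar running total for element i,
            // so a per-iteration use such as "x += sum & i" has nothing to read.
        }
        for (int l = 0; l < LANES; l++) {
            sum += acc[l];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[RANGE];
        for (int i = 0; i < RANGE; i++) {
            data[i] = i + 1;
        }
        if (scalarSum(data, 7) != laneSum(data, 7)) {
            throw new AssertionError("final sums differ");
        }
        System.out.println("final sums match");
    }
}
```

The final values agree because `int` addition wraps around and is therefore associative and commutative, so reordering the additions across lanes does not change the result.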

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13056#discussion_r1191973738


More information about the hotspot-compiler-dev mailing list