RFR: 8302652: [SuperWord] Reduction should happen after loop, when possible [v5]

Emanuel Peter epeter at openjdk.org
Fri May 12 06:47:52 UTC 2023


On Fri, 12 May 2023 01:10:29 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   use is_counted and is_innermost
>
> src/hotspot/share/opto/loopopts.cpp line 4210:
> 
>> 4208:           if (use != phi && ctrl_or_self(use) == cl) {
>> 4209:             DEBUG_ONLY( current->dump(-1); )
>> 4210:             assert(false, "reduction has use inside loop");
> 
> I have been wondering: it is right to bail out of the optimization here, but why do we assert? It is perfectly legal (if not very meaningful) to have a scalar use of the last unordered reduction within the loop. This will still auto-vectorize, as the reduction is to a scalar. For example, a slight modification of SumRed_Int.java still auto-vectorizes and has a use of the last unordered reduction within the loop:
>    public static int sumReductionImplement(
>             int[] a,
>             int[] b,
>             int[] c,
>             int total) {
>         int sum = 0;
>         for (int i = 0; i < a.length; i++) {
>             total += (a[i] * b[i]) + (a[i] * c[i]) + (b[i] * c[i]);
>             sum = total + i;
>         }
>         return total + sum;
>     }
> Do you think this is a valid concern?

I agree, the assert is not strictly necessary, but I would rather have one assert too many in there and find out which cases I missed when the fuzzer eventually hits one. If you prefer, I can also just remove the assert.

I wrote this `Test.java`:

class Test {
    static final int RANGE = 1024;
    static final int ITER  = 10_000;

    static void init(int[] data) {
        for (int i = 0; i < RANGE; i++) {
            data[i] = i + 1;
        }
    }

    static int test(int[] data, int sum) {
        int x = 0;
        for (int i = 0; i < RANGE; i++) {
            sum += 11 * data[i];
            x = sum & i; // what happens with this AndI ?
        }
        return sum + x;
    }

    public static void main(String[] args) {
        int[] data = new int[RANGE];
        init(data);
        for (int i = 0; i < ITER; i++) {
            test(data, i);
        }
    }
}

And ran it like this, with my patch:

./java -Xbatch -XX:CompileCommand=compileonly,Test::test -XX:+TraceNewVectors -XX:+TraceSuperWord Test.java


Everything vectorized as usual. But what happens with the `AndI`? It actually ends up outside the loop. Its left input is the `AddReductionVI`, and its right input is `(Phi #tripcount) + 63`, so the last value of `i` is likewise computed outside the loop.
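
To illustrate why moving the `AndI` out of the loop is safe, here is a rough source-level analogue. It is only a sketch: the class `SinkDemo` and the methods `inLoop` and `sunk` are hypothetical names I made up for this illustration, it uses `data.length` where `Test.java` uses `RANGE`, and it does not show what C2 actually emits. The point is just that only the value of `x` from the final iteration is observable, so it can be computed once after the loop from the final `sum` and the final `i`.

```java
class SinkDemo {
    // Same shape as Test::test: the AndI is evaluated in every iteration,
    // but only the value from the last iteration is ever used.
    static int inLoop(int[] data, int sum) {
        int x = 0;
        for (int i = 0; i < data.length; i++) {
            sum += 11 * data[i];
            x = sum & i;
        }
        return sum + x;
    }

    // Hand-written equivalent: the loop only carries the reduction, and the
    // AndI is computed once afterwards from the final sum and the last i.
    static int sunk(int[] data, int sum) {
        for (int i = 0; i < data.length; i++) {
            sum += 11 * data[i];
        }
        // If the loop never ran, x keeps its initial value 0.
        int x = (data.length == 0) ? 0 : (sum & (data.length - 1));
        return sum + x;
    }

    public static void main(String[] args) {
        int[] data = new int[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        for (int s = 0; s < 100; s++) {
            if (inLoop(data, s) != sunk(data, s)) {
                throw new AssertionError("mismatch for initial sum " + s);
            }
        }
        System.out.println("in-loop and sunk versions agree");
    }
}
```

Both methods wrap on `int` overflow in the same way, so the comparison still holds when `sum` overflows.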

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13056#discussion_r1191971362


More information about the hotspot-compiler-dev mailing list