SuperWordLoopUnrollAnalysis and loop unrolling
Yang Zhang
yang.zhang at linaro.org
Tue Apr 18 06:42:13 UTC 2017
Hi Andrew,

I have run your test case.

For aarch64 C2 with SuperWordLoopUnrollAnalysis=false:
Loop unrolling is still controlled by the body_size comparison. Once the
loop has been unrolled 16 times, body_size is large enough that
unrolling stops.

For aarch64 C2 with SuperWordLoopUnrollAnalysis=true:
The first round of unrolling is controlled by policy_unroll_slp_analysis.
Once the loop has been unrolled 4 times, vectorization is attempted.
If vectorization succeeds, further unrolling is triggered through
set_major_progress until body_size is large enough.
If vectorization fails, loop unrolling stops.
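
To make the sequence above easier to follow, here is a toy model of it in Java. This is not HotSpot code. The class name, the method finalUnroll, the constant BODY_SIZE_LIMIT and the body size of 4 are all made up for illustration; the numbers were picked only so that the model reproduces the 16x and 4x counts I observed. It also ignores the fact that a vectorized body changes size.

```java
public class UnrollPolicyModel {
    // Assumed limit, hypothetical: not the real C2 threshold.
    static final int BODY_SIZE_LIMIT = 64;
    // Unroll count at which vectorization was attempted in my run.
    static final int SLP_UNROLL = 4;

    // Returns the unroll count at which this toy policy stops.
    static int finalUnroll(boolean slpAnalysis, boolean vectorizes, int bodySize) {
        int unroll = 1;
        if (slpAnalysis) {
            // First phase: unroll up to the count chosen for vectorization.
            unroll = SLP_UNROLL;
            if (!vectorizes) {
                // Vectorization failed: no further unrolling is requested.
                return unroll;
            }
        }
        // Second phase: keep doubling while the unrolled body still fits.
        while (bodySize * unroll * 2 <= BODY_SIZE_LIMIT) {
            unroll *= 2;
        }
        return unroll;
    }

    public static void main(String[] args) {
        System.out.println("analysis off:              " + finalUnroll(false, false, 4));
        System.out.println("analysis on, vectorizes:   " + finalUnroll(true, true, 4));
        System.out.println("analysis on, no vectorize: " + finalUnroll(true, false, 4));
    }
}
```

The third case is the one your test hits: the loop is not vectorizable, so in this model it stays at 4x instead of reaching 16x.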

I haven't run a performance test. However, modern CPUs execute
instructions out of order, so I think loop unrolling doesn't always
improve performance.
Regards
Yang
On 15 April 2017 at 00:41, Andrew Haley <aph at redhat.com> wrote:
> On 14/04/17 10:21, Yang Zhang wrote:
>> My test result is just the opposite of your description. Could you
>> provide your test case?
>
> // @Benchmark
> public int[] sameArrayClass(BenchmarkState state) {
>     for (int i = 0; i < INITSIZE; i++) {
>         state.b[0] = state.b[1];
>         state.b[1] = state.b[2];
>         state.b[2] = state.b[3];
>         state.b[3] = state.b[0];
>
>         state.b[0] = state.b[1];
>         state.b[1] = state.b[2];
>         state.b[2] = state.b[3];
>         state.b[3] = state.b[0];
>
>         state.b[0] = state.b[1];
>         state.b[1] = state.b[2];
>         state.b[2] = state.b[3];
>         state.b[3] = state.b[0];
>
>         state.b[0] = state.b[1];
>         state.b[1] = state.b[2];
>         state.b[2] = state.b[3];
>         state.b[3] = state.b[0];
>     }
>     return state.b;
> }
>
> This is not vectorizable, but SuperWordLoopUnrollAnalysis=true disables
> the unrolling which would make it faster.
>
> Andrew.
>