Vectorized Loop Unrolling on x64?

Nils Eliasson nils.eliasson at oracle.com
Tue Oct 24 16:59:44 UTC 2017


Hi,

Array reduction operations are implemented, but they are disabled in 
some settings.

See the excellent blog post by Richard Startin: 
http://richardstartin.uk/tricking-java-into-adding-up-arrays-faster/

https://bugs.openjdk.java.net/browse/JDK-8188313

https://bugs.openjdk.java.net/browse/JDK-8078563
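
For reference, the three loop shapes discussed in this thread can be 
sketched roughly as below. This is only an illustration: the class and 
method names (LoopShapes, addArrays, addScalar, sumLong) are made up 
here and come neither from the thread nor from HotSpot. Whether C2 
vectorizes each loop depends on the JDK version, flags and hardware; 
the comments only repeat what was reported for JDK 9.0.1 further down.

```java
import java.util.Arrays;

// Hypothetical helper class; all names are assumptions for illustration.
final class LoopShapes {

    // a[i] = b[i] + c[i]: element-wise add (reported as vectorized in this thread).
    // Assumes b and c are at least as long as a.
    static void addArrays(int[] a, int[] b, int[] c) {
        for (int i = 0; i < a.length; i++) {
            a[i] = b[i] + c[i];
        }
    }

    // a[i] = a[i] + value: add a scalar to every element (reported as vectorized).
    static void addScalar(int[] a, int value) {
        for (int i = 0; i < a.length; i++) {
            a[i] = a[i] + value;
        }
    }

    // Reduction: sum of int elements into a long accumulator
    // (reported as NOT vectorized in this thread).
    static long sumLong(int[] array) {
        long sum = 0;
        for (int i = 0; i < array.length; i++) {
            sum += array[i]; // each int is widened to long before the add
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] a = new int[4];
        addArrays(a, new int[] {1, 2, 3, 4}, new int[] {10, 20, 30, 40});
        System.out.println(Arrays.toString(a));
        addScalar(a, 5);
        System.out.println(Arrays.toString(a));
        System.out.println(sumLong(a));
    }
}
```

The first two loops write one independent result per iteration, while 
the third carries a dependency on sum from one iteration to the next, 
which is what makes it a reduction.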

Regards,

Nils Eliasson


On 2017-10-24 18:46, Ionut wrote:
> Hello All,
>
>    Meanwhile, I tested two other scenarios:
>
> - a[i] = b[i] + c[i]           // where a, b, c are arrays of ints
> - a[i] = a[i] + <int_value>    // where <int_value> might be a constant, etc.
>
> In both cases the loops were vectorized, but my initial example 
> (iterating through the array of ints and computing the sum of the 
> elements) was not, which makes me think this case is currently not 
> supported by the JIT.
>
> Could you please confirm this?
>
> Regards
> Ionut
>
>
> On Tuesday, October 24, 2017 12:24 PM, Ionut <ionutb83 at yahoo.com> wrote:
>
>
> Hi Nils,
>
> Thanks, it is clear. However, I tried a simple example (just 
> iterating through an array and summing its elements, using JMH) on my 
> x64 Linux machine, and it does not seem to be vectorized. The source 
> code and assembly are below.
> Could you please give me a hint? Am I doing something wrong?
>
> JDK is 9.0.1
>
> Source code:
>
> import java.util.concurrent.TimeUnit;
>
> import org.openjdk.jmh.annotations.*;
> import org.openjdk.jmh.runner.Runner;
> import org.openjdk.jmh.runner.RunnerException;
> import org.openjdk.jmh.runner.options.Options;
> import org.openjdk.jmh.runner.options.OptionsBuilder;
>
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.NANOSECONDS)
> @Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.NANOSECONDS)
> @Fork(value = 3, jvmArgsAppend = { "-XX:-TieredCompilation", "-Xbatch", "-XX:+UseSuperWord" })
> @State(Scope.Benchmark)
> public class Sum1ToNArray {
>     private int[] array;
>
>     public static void main(String[] args) throws RunnerException {
>         Options opt = new OptionsBuilder()
>                 .include(Sum1ToNArray.class.getSimpleName())
>                 .build();
>         new Runner(opt).run();
>     }
>
>     @Setup(Level.Trial)
>     public void setUp() {
>         this.array = new int[100_000_000];
>         for (int i = 0; i < array.length; i++) {
>             array[i] = i + 1;
>         }
>     }
>
>     @Benchmark
>     public long hotMethod() {
>         long sum = 0;
>         for (int i = 0; i < array.length; i++) {
>             sum += array[i];
>         }
>         return sum;
>     }
> }
>
> Assembly:
> ....[Hottest Region 1]..............................................................................
> c2, com.jpt.Sum1ToNArray::hotMethod, version 139 (63 bytes)
>
>                      0x00007f7bf1bff0f9: mov    r8d,r10d
>                      0x00007f7bf1bff0fc: add    r8d,0xfffffff9
>                      0x00007f7bf1bff100: mov    r11d,0x1
>                      0x00007f7bf1bff106: cmp    r8d,0x1
>                 ╭    0x00007f7bf1bff10a: jg     0x00007f7bf1bff114
>                 │    0x00007f7bf1bff10c: mov    rax,rdx
>                 │╭   0x00007f7bf1bff10f: jmp    0x00007f7bf1bff15d
>                 ││↗  0x00007f7bf1bff111: mov    rdx,rax            ;*lload_1 {reexecute=0 rethrow=0 return_oop=0}
>                 │││                                                ; - com.jpt.Sum1ToNArray::hotMethod@13 (line 53)
>                 ↘││  0x00007f7bf1bff114: movsxd rsi,DWORD PTR [r14+r11*4+0x10]
>  11.08%   8.55%  ││  0x00007f7bf1bff119: movsxd rbp,DWORD PTR [r14+r11*4+0x14]
>   0.30%   0.17%  ││  0x00007f7bf1bff11e: movsxd r13,DWORD PTR [r14+r11*4+0x18]
>                  ││  0x00007f7bf1bff123: movsxd rax,DWORD PTR [r14+r11*4+0x2c]
>   8.86%   2.85%  ││  0x00007f7bf1bff128: movsxd r9,DWORD PTR [r14+r11*4+0x28]
>  10.49%  23.29%  ││  0x00007f7bf1bff12d: movsxd rcx,DWORD PTR [r14+r11*4+0x24]
>   0.38%   0.45%  ││  0x00007f7bf1bff132: movsxd rbx,DWORD PTR [r14+r11*4+0x20]
>   0.03%   0.06%  ││  0x00007f7bf1bff137: movsxd rdi,DWORD PTR [r14+r11*4+0x1c]
>   0.23%   0.22%  ││  0x00007f7bf1bff13c: add    rsi,rdx
>  10.58%  18.59%  ││  0x00007f7bf1bff13f: add    rbp,rsi
>   0.32%   0.17%  ││  0x00007f7bf1bff142: add    r13,rbp
>   0.05%   0.04%  ││  0x00007f7bf1bff145: add    rdi,r13
>  26.10%  28.47%  ││  0x00007f7bf1bff148: add    rbx,rdi
>   5.55%   5.48%  ││  0x00007f7bf1bff14b: add    rcx,rbx
>   5.66%   1.32%  ││  0x00007f7bf1bff14e: add    r9,rcx
>   7.85%   3.11%  ││  0x00007f7bf1bff151: add    rax,r9             ;*ladd {reexecute=0 rethrow=0 return_oop=0}
>                  ││                                                ; - com.jpt.Sum1ToNArray::hotMethod@21 (line 53)
>  10.19%   5.67%  ││  0x00007f7bf1bff154: add    r11d,0x8           ;*iinc {reexecute=0 rethrow=0 return_oop=0}
>                  ││                                                ; - com.jpt.Sum1ToNArray::hotMethod@23 (line 52)
>   0.38%   0.12%  ││  0x00007f7bf1bff158: cmp    r11d,r8d
>                  │╰  0x00007f7bf1bff15b: jl     0x00007f7bf1bff111 ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
>                  │                                                 ; - com.jpt.Sum1ToNArray::hotMethod@10 (line 52)
>                  ↘   0x00007f7bf1bff15d: cmp    r11d,r10d
>                      0x00007f7bf1bff160: jge    0x00007f7bf1bff174
>                      0x00007f7bf1bff162: xchg   ax,ax              ;*lload_1 {reexecute=0 rethrow=0 return_oop=0}
>                                                                    ; - com.jpt.Sum1ToNArray::hotMethod@13 (line 53)
>                      0x00007f7bf1bff164: movsxd r8,DWORD PTR [r14+r11*4+0x10]
>                      0x00007f7bf1bff169: add    rax,r8             ;*ladd {reexecute=0 rethrow=0 return_oop=0}
>                                                                    ; - com.jpt.Sum1ToNArray::hotMethod@21 (line 53)
>
> Regards
>
>
> On Tuesday, October 24, 2017 11:22 AM, Nils Eliasson 
> <nils.eliasson at oracle.com> wrote:
>
>
> Hi Ionut,
> In this case x86 refers to both x86_32/ia32 and x86_64/amd64/x64.
> Regards,
> Nils Eliasson
>
> On 2017-10-24 11:05, Ionut wrote:
>> Hello All,
>>
>>   I want to ask you about 
>> https://bugs.openjdk.java.net/browse/JDK-8129920 - Vectorized loop 
>> unrolling, which says it is applicable only for x86 targets. Do 
>> you plan to port this to x64 as well? Or am I missing something here?
>>
>> Regards
>> Ionut
>
>
>
>
>



More information about the hotspot-compiler-dev mailing list