Vectorized Loop Unrolling on x64?

Ionut ionutb83 at yahoo.com
Tue Oct 24 16:46:17 UTC 2017


Hello All,
   Meanwhile I tested two more scenarios:
- a[i] = b[i] + c[i]          // where a, b, c are arrays of ints
- a[i] = a[i] + <int_value>   // where <int_value> might be a constant, etc.
Both cases were vectorized, but my initial example (iterating through an array of ints and computing the sum of its elements) was not, which makes me think this case is currently not supported by the JIT.
Could you please confirm this?
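
For reference, the three loop shapes being compared can be written side by side as below. This is only a minimal sketch: the class name LoopShapes and the method names are hypothetical, and the comments repeat what was observed in this thread on JDK 9.0.1, not what any particular JVM is guaranteed to do.

```java
import java.util.Arrays;

// Hypothetical helper class; names are illustrative and not part of the benchmark.
final class LoopShapes {

    // Case 1: element-wise add of two int arrays (reported as vectorized in this thread).
    // Assumes a, b and c have the same length.
    static void addArrays(int[] a, int[] b, int[] c) {
        for (int i = 0; i < a.length; i++) {
            a[i] = b[i] + c[i];
        }
    }

    // Case 2: add a loop-invariant int to every element (reported as vectorized).
    static void addValue(int[] a, int value) {
        for (int i = 0; i < a.length; i++) {
            a[i] = a[i] + value;
        }
    }

    // Case 3: reduction of int elements into a long accumulator
    // (reported as NOT vectorized in this thread).
    static long sum(int[] array) {
        long sum = 0;
        for (int i = 0; i < array.length; i++) {
            sum += array[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] a = new int[3];
        addArrays(a, new int[] {1, 2, 3}, new int[] {10, 20, 30});
        System.out.println(Arrays.toString(a));
        addValue(a, 5);
        System.out.println(Arrays.toString(a));
        System.out.println(sum(a));
    }
}
```

The first two loops write each result back to an array element independently of the others, while the third carries a single long accumulator from one iteration to the next and widens each int before adding it.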
Regards
Ionut

    On Tuesday, October 24, 2017 12:24 PM, Ionut <ionutb83 at yahoo.com> wrote:
 

 Hi Nils,
  Thanks, that is clear. However, I tried a simple example (iterating through an array and summing its elements, measured with JMH) on my x64 Linux machine, and it does not seem to be vectorized. The source code and assembly are below. Could you give me a hint? Am I doing something wrong?
JDK is 9.0.1
Source code:
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.NANOSECONDS)
@Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.NANOSECONDS)
@Fork(value = 3, jvmArgsAppend = { "-XX:-TieredCompilation", "-Xbatch", "-XX:+UseSuperWord" })
@State(Scope.Benchmark)
public class Sum1ToNArray {
    private int[] array;

    public static void main(String[] args) {
        Options opt =
            new OptionsBuilder()
                .include(Sum1ToNArray.class.getSimpleName())
                .build();
        new Runner(opt).run();
    }

    @Setup(Level.Trial)
    public void setUp() {
        this.array = new int[100_000_000];
        for (int i = 0; i < array.length; i++)
            array[i] = i + 1;
    }

    @Benchmark
    public long hotMethod() {
        long sum = 0;
        for (int i = 0; i < array.length; i++) {
            sum += array[i];
        }
        return sum;
    }
}
Assembly:
....[Hottest Region 1]..............................................................................
c2, com.jpt.Sum1ToNArray::hotMethod, version 139 (63 bytes)

                       0x00007f7bf1bff0f9: mov    r8d,r10d
                       0x00007f7bf1bff0fc: add    r8d,0xfffffff9
                       0x00007f7bf1bff100: mov    r11d,0x1
                       0x00007f7bf1bff106: cmp    r8d,0x1
                  ╭    0x00007f7bf1bff10a: jg     0x00007f7bf1bff114
                  │    0x00007f7bf1bff10c: mov    rax,rdx
                  │╭   0x00007f7bf1bff10f: jmp    0x00007f7bf1bff15d
                  ││↗  0x00007f7bf1bff111: mov    rdx,rax             ;*lload_1 {reexecute=0 rethrow=0 return_oop=0}
                  │││                                                 ; - com.jpt.Sum1ToNArray::hotMethod@13 (line 53)
                  ↘││  0x00007f7bf1bff114: movsxd rsi,DWORD PTR [r14+r11*4+0x10]
11.08%    8.55%    ││  0x00007f7bf1bff119: movsxd rbp,DWORD PTR [r14+r11*4+0x14]
 0.30%    0.17%    ││  0x00007f7bf1bff11e: movsxd r13,DWORD PTR [r14+r11*4+0x18]
                   ││  0x00007f7bf1bff123: movsxd rax,DWORD PTR [r14+r11*4+0x2c]
 8.86%    2.85%    ││  0x00007f7bf1bff128: movsxd r9,DWORD PTR [r14+r11*4+0x28]
10.49%   23.29%    ││  0x00007f7bf1bff12d: movsxd rcx,DWORD PTR [r14+r11*4+0x24]
 0.38%    0.45%    ││  0x00007f7bf1bff132: movsxd rbx,DWORD PTR [r14+r11*4+0x20]
 0.03%    0.06%    ││  0x00007f7bf1bff137: movsxd rdi,DWORD PTR [r14+r11*4+0x1c]
 0.23%    0.22%    ││  0x00007f7bf1bff13c: add    rsi,rdx
10.58%   18.59%    ││  0x00007f7bf1bff13f: add    rbp,rsi
 0.32%    0.17%    ││  0x00007f7bf1bff142: add    r13,rbp
 0.05%    0.04%    ││  0x00007f7bf1bff145: add    rdi,r13
26.10%   28.47%    ││  0x00007f7bf1bff148: add    rbx,rdi
 5.55%    5.48%    ││  0x00007f7bf1bff14b: add    rcx,rbx
 5.66%    1.32%    ││  0x00007f7bf1bff14e: add    r9,rcx
 7.85%    3.11%    ││  0x00007f7bf1bff151: add    rax,r9              ;*ladd {reexecute=0 rethrow=0 return_oop=0}
                   ││                                                 ; - com.jpt.Sum1ToNArray::hotMethod@21 (line 53)
10.19%    5.67%    ││  0x00007f7bf1bff154: add    r11d,0x8            ;*iinc {reexecute=0 rethrow=0 return_oop=0}
                   ││                                                 ; - com.jpt.Sum1ToNArray::hotMethod@23 (line 52)
 0.38%    0.12%    ││  0x00007f7bf1bff158: cmp    r11d,r8d
                   │╰  0x00007f7bf1bff15b: jl     0x00007f7bf1bff111  ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
                   │                                                  ; - com.jpt.Sum1ToNArray::hotMethod@10 (line 52)
                   ↘   0x00007f7bf1bff15d: cmp    r11d,r10d
                       0x00007f7bf1bff160: jge    0x00007f7bf1bff174
                       0x00007f7bf1bff162: xchg   ax,ax               ;*lload_1 {reexecute=0 rethrow=0 return_oop=0}
                                                                      ; - com.jpt.Sum1ToNArray::hotMethod@13 (line 53)
                       0x00007f7bf1bff164: movsxd r8,DWORD PTR [r14+r11*4+0x10]
                       0x00007f7bf1bff169: add    rax,r8              ;*ladd {reexecute=0 rethrow=0 return_oop=0}
                                                                      ; - com.jpt.Sum1ToNArray::hotMethod@21 (line 53)
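
Read back into Java, the hot region above looks roughly like the hand-unrolled loop sketched below: eight sign-extending scalar loads (movsxd) feeding a serial chain of 64-bit adds, an index step of 8, and a scalar loop for the leftover elements. No vector registers appear in the listing. This is an interpretation, not compiler output. The class name UnrolledSum and the method name are hypothetical, and the sketch assumes that r10d holds the array length and that the index starts at 1 because one iteration was peeled off in code that is not shown in the listing.

```java
// Hypothetical hand-written equivalent of the compiled loop shape; not JIT output.
final class UnrolledSum {

    static long sumUnrolled(int[] array) {
        long sum = 0;
        int n = array.length;
        int i = 0;

        // Assumed pre-loop: the listing starts the main loop at index 1,
        // so one iteration is taken to have run before it.
        if (n > 0) {
            sum += array[0];
            i = 1;
        }

        // Main loop: limit = length - 7 (add r8d,0xfffffff9), step 8 (add r11d,0x8).
        // Each int is widened to long before the add, and every add depends
        // on the result of the previous one.
        int limit = n - 7;
        for (; i < limit; i += 8) {
            sum += array[i];
            sum += array[i + 1];
            sum += array[i + 2];
            sum += array[i + 3];
            sum += array[i + 4];
            sum += array[i + 5];
            sum += array[i + 6];
            sum += array[i + 7];
        }

        // Post-loop: remaining elements, one at a time.
        for (; i < n; i++) {
            sum += array[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[20];
        for (int k = 0; k < data.length; k++) {
            data[k] = k + 1;
        }
        System.out.println(sumUnrolled(data));
    }
}
```

In other words, the loop was unrolled but stayed scalar, which matches the observation that the sum reduction was not vectorized in this run.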
Regards 

    On Tuesday, October 24, 2017 11:22 AM, Nils Eliasson <nils.eliasson at oracle.com> wrote:
 

  Hi Ionut,
  In this case x86 refers to both x86_32/ia32 and x86_64/amd64/x64.
  Regards,
  Nils Eliasson
  
 On 2017-10-24 11:05, Ionut wrote:
  
 Hello All, 
      I want to ask about https://bugs.openjdk.java.net/browse/JDK-8129920 (Vectorized loop unrolling), which says it is applicable only to x86 targets. Do you plan to port this to x64 as well, or am I missing something here?
  Regards
  Ionut
 
 

   

   