RFR (14) 8235837: Memory access API refinements

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Tue Jan 21 11:34:47 UTC 2020


I copied and pasted your benchmark and still see vectorization (in both 
the segment and buffer loops):

   0.67%  0x00002b2ea48a8e4e:   vmovdqu %xmm2,0x0(%r13)
  13.55%  0x00002b2ea48a8e54:   vmovdqu %xmm2,0x10(%r13)
   0.40%  0x00002b2ea48a8e5a:   vmovdqu %xmm2,0x20(%r13)
   2.33%  0x00002b2ea48a8e60:   vmovdqu %xmm2,0x30(%r13)
   8.32%  0x00002b2ea48a8e66:   vmovdqu %xmm2,0x40(%r13)
   7.13%  0x00002b2ea48a8e6c:   vmovdqu %xmm2,0x50(%r13)
   0.31%  0x00002b2ea48a8e72:   vmovdqu %xmm2,0x60(%r13)
   1.29%  0x00002b2ea48a8e78:   vmovdqu %xmm2,0x70(%r13)
  11.96%  0x00002b2ea48a8e7e:   vmovdqu %xmm2,0x80(%r13)
   7.36%  0x00002b2ea48a8e87:   vmovdqu %xmm2,0x90(%r13)
   0.77%  0x00002b2ea48a8e90:   vmovdqu %xmm2,0xa0(%r13)
   2.23%  0x00002b2ea48a8e99:   vmovdqu %xmm2,0xb0(%r13)
  12.01%  0x00002b2ea48a8ea2:   vmovdqu %xmm2,0xc0(%r13)
   7.55%  0x00002b2ea48a8eab:   vmovdqu %xmm2,0xd0(%r13)
   0.33%  0x00002b2ea48a8eb4:   vmovdqu %xmm2,0xe0(%r13)
   1.44%  0x00002b2ea48a8ebd:   vmovdqu %xmm2,0xf0(%r13)  ;*invokevirtual putIntUnaligned {reexecute=0 rethrow=0 return_oop=0}
                                                          ; - jdk.internal.misc.Unsafe::putIntUnaligned@10 (line 3693)
                                                          ; - java.lang.invoke.VarHandleMemoryAddressAsInts::set0@38 (line 86)
                                                          ; - java.lang.invoke.VarHandleMemoryAddressAsInts1/0x0000000800bc8440::set@42
                                                          ; - java.lang.invoke.VarHandleGuards::guard_LJI_V@38 (line 952)
                                                          ; - org.openjdk.bench.jdk.incubator.foreign.LoopOverNew::segment_loop@36 (line 81)
   3.23%  0x00002b2ea48a8ec6:   add $0x40,%edx            ;*iinc {reexecute=0 rethrow=0 return_oop=0}
                                                          ; - org.openjdk.bench.jdk.incubator.foreign.LoopOverNew::segment_loop@39 (line 80)

And these are the benchmark results:

Benchmark                 Mode  Cnt  Score   Error  Units
LoopOverNew.buffer_loop   avgt   30  0.202 ± 0.001  ms/op
LoopOverNew.segment_loop  avgt   30  0.202 ± 0.005  ms/op
LoopOverNew.unsafe_loop   avgt   30  0.398 ± 0.005  ms/op
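
For anyone following along without the benchmark source at hand: the loops
being compared are simple fill loops over off-heap memory. A minimal sketch
of the buffer variant is below. It assumes a direct ByteBuffer in native
byte order and a constant store; the class and method names are made up for
illustration, and this is not the actual LoopOverNew code. Whether C2
unrolls and vectorizes a loop like this depends on the JVM build and flags.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for a "buffer loop"; not the real benchmark class.
public class BufferLoopSketch {
    static final int ELEM_SIZE = 4; // bytes per int

    // Stores `value` into every int slot of a freshly allocated direct buffer.
    public static ByteBuffer fill(int elements, int value) {
        ByteBuffer bb = ByteBuffer.allocateDirect(elements * ELEM_SIZE)
                                  .order(ByteOrder.nativeOrder());
        for (int i = 0; i < elements; i++) {
            // Absolute put: one 4-byte store per iteration at offset i * 4.
            bb.putInt(i * ELEM_SIZE, value);
        }
        return bb;
    }

    // Reads the slots back so the result of the stores is actually observed.
    public static long sum(ByteBuffer bb) {
        long total = 0;
        int elements = bb.capacity() / ELEM_SIZE;
        for (int i = 0; i < elements; i++) {
            total += bb.getInt(i * ELEM_SIZE);
        }
        return total;
    }

    public static void main(String[] args) {
        ByteBuffer bb = fill(1_000_000, 4);
        System.out.println(sum(bb));
    }
}
```

In a real measurement the fill loop would sit inside a JMH @Benchmark
method rather than a main method, so that warmup and dead-code elimination
are handled by the harness.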

As you observed previously, I do note that plain Unsafe is failing to 
vectorize, and I get this instead:

   0.04%  0x00002b75b8aa5efb:   movl   $0x4,0x4(%rbx)
   8.47%  0x00002b75b8aa5f02:   movl   $0x4,0x8(%rbx)
  11.11%  0x00002b75b8aa5f09:   movl   $0x4,0xc(%rbx)
   0.29%  0x00002b75b8aa5f10:   movl   $0x4,0x10(%rbx)
   5.05%  0x00002b75b8aa5f17:   movl   $0x4,0x14(%rbx)
   4.41%  0x00002b75b8aa5f1e:   movl   $0x4,0x18(%rbx)
   5.37%  0x00002b75b8aa5f25:   movl   $0x4,0x1c(%rbx)
   1.89%  0x00002b75b8aa5f2c:   movl   $0x4,0x20(%rbx)
   4.93%  0x00002b75b8aa5f33:   movl   $0x4,0x24(%rbx)
  10.53%  0x00002b75b8aa5f3a:   movl   $0x4,0x28(%rbx)
   6.99%  0x00002b75b8aa5f41:   movl   $0x4,0x2c(%rbx)
   2.33%  0x00002b75b8aa5f48:   movl   $0x4,0x30(%rbx)
   4.76%  0x00002b75b8aa5f4f:   movl   $0x4,0x34(%rbx)
   6.39%  0x00002b75b8aa5f56:   movl   $0x4,0x38(%rbx)
   4.49%  0x00002b75b8aa5f5d:   movl   $0x4,0x3c(%rbx)  ;*invokevirtual putInt {reexecute=0 rethrow=0 return_oop=0}
                                                        ; - jdk.internal.misc.Unsafe::putInt@4 (line 370)
                                                        ; - sun.misc.Unsafe::putInt@5 (line 358)
                                                        ; - org.openjdk.bench.jdk.incubator.foreign.LoopOverNew::unsafe_loop@42 (line 71)

I'll defer definitive judgement to one of our C2 gurus :-)

Maurizio

On 21/01/2020 10:06, Andrew Haley wrote:
> On 1/17/20 4:22 PM, Maurizio Cimadamore wrote:
>> Perhaps we tweaked the benchmarks in slightly different ways? Can you
>> please share your modifications, so that at least we can make sure we're
>> running the same thing.
> Sorry, I missed this.
>


More information about the panama-dev mailing list