Observations from a simple JMH benchmark
Paul Sandoz
paul.sandoz at oracle.com
Thu Feb 15 20:32:05 UTC 2018
Better! There are still some artifacts:
0.05% ↗ 0x000000011ee23a72: mov %eax,%r9d ;*checkcast {reexecute=0 rethrow=0 return_oop=0}
│ ; - jdk.incubator.vector.Int128Vector$Int128Species::fromArray at 33 (line 673)
│ ; - jdk.incubator.vector.Int128Vector$Int128Species::fromArray at 3 (line 564)
│ ; - jmh.AddTest::add at 31 (line 53)
│ ; - jmh.generated.AddTest_add_jmhTest::add_avgt_jmhStub at 17 (line 186)
25.26% │ ↗ 0x000000011ee23a75: vmovdqu 0x10(%rdx,%r9,4),%xmm0 ;*invokedynamic {reexecute=0 rethrow=0 return_oop=0}
│ │ ; - jdk.incubator.vector.Int128Vector$Int128Species::fromArray at 25 (line 673)
│ │ ; - jdk.incubator.vector.Int128Vector$Int128Species::fromArray at 3 (line 564)
│ │ ; - jmh.AddTest::add at 19 (line 52)
│ │ ; - jmh.generated.AddTest_add_jmhTest::add_avgt_jmhStub at 17 (line 186)
2.67% │ │ 0x000000011ee23a7c: cmp %r11d,%r9d
╭│ │ 0x000000011ee23a7f: jae 0x000000011ee23b8c ;*invokestatic store {reexecute=0 rethrow=0 return_oop=0}
││ │ ; - jdk.incubator.vector.Int128Vector::intoArray at 30 (line 250)
││ │ ; - jmh.AddTest::add at 45 (line 54)
││ │ ; - jmh.generated.AddTest_add_jmhTest::add_avgt_jmhStub at 17 (line 186)
3.80% ││ │ 0x000000011ee23a85: vpaddd 0x10(%rsi,%r9,4),%xmm0,%xmm0
││ │ ;*invokestatic checkIndex {reexecute=0 rethrow=0 return_oop=0}
││ │ ; - java.util.Objects::checkIndex at 3 (line 372)
││ │ ; - jdk.incubator.vector.VectorIntrinsics::checkIndex at 45 (line 126)
││ │ ; - jdk.incubator.vector.Int128Vector$Int128Species::fromArray at 11 (line 672)
││ │ ; - jdk.incubator.vector.Int128Vector$Int128Species::fromArray at 3 (line 564)
││ │ ; - jmh.AddTest::add at 19 (line 52)
││ │ ; - jmh.generated.AddTest_add_jmhTest::add_avgt_jmhStub at 17 (line 186)
33.52% ││ │ 0x000000011ee23a8c: cmp %ecx,%r9d
││ │ 0x000000011ee23a8f: jae 0x000000011ee23bb2 ;*invokestatic store {reexecute=0 rethrow=0 return_oop=0}
││ │ ; - jdk.incubator.vector.Int128Vector::intoArray at 30 (line 250)
││ │ ; - jmh.AddTest::add at 45 (line 54)
││ │ ; - jmh.generated.AddTest_add_jmhTest::add_avgt_jmhStub at 17 (line 186)
0.10% ││ │ 0x000000011ee23a95: vmovdqu %xmm0,0x10(%rbx,%r9,4) ;*invokestatic linkToStatic {reexecute=0 rethrow=0 return_oop=0}
││ │ ; - java.lang.invoke.DirectMethodHandle$Holder::invokeStatic at 10
││ │ ; - java.lang.invoke.LambdaForm$MH/787867107::linkToTargetMethod at 5
││ │ ; - jdk.incubator.vector.Int128Vector$Int128Species::fromArray at 25 (line 673)
││ │ ; - jdk.incubator.vector.Int128Vector$Int128Species::fromArray at 3 (line 564)
││ │ ; - jmh.AddTest::add at 19 (line 52)
││ │ ; - jmh.generated.AddTest_add_jmhTest::add_avgt_jmhStub at 17 (line 186)
25.34% ││ │ 0x000000011ee23a9c: add $0x4,%eax
3.70% ││ │ 0x000000011ee23a9f: cmp %edi,%eax
│╰ │ 0x000000011ee23aa1: jl 0x000000011ee23a72 ;*synchronization entry
I think you can drop the first index check for case 2. IIUC you are assuming the constraints vlen > 0 and length >= 0, so for:
Objects.checkIndex(ix, length - (vlen - 1));
the check will always fail if length - (vlen - 1) <= 0, that is, if vlen > length, which is the only case the first check guards against.
However, curiously, using just the single check makes no difference: the same code is generated.
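To make the argument about the redundant check concrete, here is a minimal sketch that compares the three variants over a small range of inputs under the stated constraints (vlen > 0, length >= 0). It is not taken from the webrev: the class name OobChecks and the helper names are hypothetical, and only Objects.checkIndex and Objects.checkFromIndexSize are real JDK methods.

```java
import java.util.Objects;

// Hypothetical helper class, for illustration only.
final class OobChecks {

    // Mode 1 in the quoted message: Objects.checkFromIndexSize.
    static boolean passesFromIndexSize(int ix, int vlen, int length) {
        try {
            Objects.checkFromIndexSize(ix, vlen, length);
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    // Mode 2 in the quoted message: two checkIndex calls.
    static boolean passesTwoChecks(int ix, int vlen, int length) {
        try {
            Objects.checkIndex(vlen - 1, length);
            Objects.checkIndex(ix, length - (vlen - 1));
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    // Mode 2 with the first check dropped.
    static boolean passesSingleCheck(int ix, int vlen, int length) {
        try {
            Objects.checkIndex(ix, length - (vlen - 1));
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    // Compares all three variants for vlen > 0 and length >= 0,
    // including negative and too-large values of ix.
    static boolean allAgree() {
        for (int vlen = 1; vlen <= 8; vlen++) {
            for (int length = 0; length <= 16; length++) {
                for (int ix = -2; ix <= 18; ix++) {
                    boolean a = passesFromIndexSize(ix, vlen, length);
                    boolean b = passesTwoChecks(ix, vlen, length);
                    boolean c = passesSingleCheck(ix, vlen, length);
                    if (a != b || b != c) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("all variants agree: " + allAgree());
    }
}
```

This only checks that the variants accept and reject the same inputs; it says nothing about the code C2 generates for each of them.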
Paul.
> On Feb 15, 2018, at 10:44 AM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
>
> Here's a quick prototype of enhanced array OOB checks:
>
> http://cr.openjdk.java.net/~vlivanov/panama/vector.oob/webrev.00
>
> -Djdk.incubator.vector.VECTOR_ACCESS_OOB_CHECK:
>
> =0: no checks
>
> =1: use Objects.checkFromIndexSize (default)
>
> =2: enhanced check
> + // vlen > 0, length >=0
> + //
> + // 0 <= i, (i + vlen) <= length
> + // =>
> + // 0 <= vlen <= length, 0 <= i < (length - vlen);
> + Objects.checkIndex(vlen - 1, length);
> + Objects.checkIndex(ix, length - (vlen - 1));
>
> Best regards,
> Vladimir Ivanov
>
> On 2/15/18 3:58 PM, Vladimir Ivanov wrote:
>>> I have been playing around with a simple benchmark and JMH (separately i can get asm hotspots working on the mac now via dtrace!)
>> ...
>>> Vladimir, i guess this is the kind of thing you were mentioning with regards to bounds checks?
>> Yes.
>>> Perhaps there are general optimization possibilities for such bounds checks. Only Preconditions.checkIndex is currently an intrinsic.
>> I proposed some tweaks earlier [1], but haven't done any experiments yet:
>> 0 <= i
>> (i + vlen) <= length
>> ==>
>> (a) 0 <= i <= (length - vlen)
>> (b) (vlen <= length)
>> a - upper bound is loop invariant
>> b - loop invariant
>> Best regards,
>> Vladimir Ivanov
>> [1] http://mail.openjdk.java.net/pipermail/panama-dev/2017-December/000889.html