IntVector.fromValues is not optimized away ?
Paul Sandoz
paul.sandoz at oracle.com
Mon May 11 19:42:16 UTC 2020
Swings and roundabouts.
Unsurprisingly, a significant proportion of instructions involve shuffling field values into temporary buffers from which vector loads are performed.
The current code and my patch result in a similar set of instructions but my patch is not as efficient because of a less optimal use of a vector instruction:
vpxor %xmm0,%xmm1,%xmm0
vs.
vpxor 0x10(%r10),%xmm0,%xmm0
HotSpot could be smarter about gathering field values and eliding the intermediate varargs arrays for common layouts, e.g. by leveraging the gather functionality. In general, though, vector load instructions prefer values laid out linearly in memory.
My recommendation would be to use fromValues for constant or pre-computed vector values.
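To make the layout point concrete, here is a minimal scalar sketch in plain Java of what a two-lane XOR followed by a lane fold computes when the four values sit contiguously in one array. The class and method names (LaneXorModel, scalarHash, lanewiseHash) are hypothetical, and the loop only models the lanes; it does not use the Vector API. With the real API the two halves would presumably be loaded via IntVector.fromArray at offsets 0 and 2.

```java
public final class LaneXorModel {
    // Reference result: fold all four values with XOR, as in the scalar hashCode.
    public static int scalarHash(int[] v) {
        return v[0] ^ v[1] ^ v[2] ^ v[3];
    }

    // Models a 64-bit species (two int lanes): XOR the first half of the
    // array with the second half lane by lane, then fold the two lanes.
    public static int lanewiseHash(int[] v) {
        int lanes = 2;
        int[] result = new int[lanes];
        for (int lane = 0; lane < lanes; lane++) {
            result[lane] = v[lane] ^ v[lanes + lane];
        }
        return result[0] ^ result[1];
    }

    public static void main(String[] args) {
        int[] values = {0x1234, 0x00FF, 0x0F0F, 0x7001};
        // XOR is associative and commutative, so both groupings agree.
        System.out.println(scalarHash(values) == lanewiseHash(values));
    }
}
```

The pairing of lanes differs from the (i1, i3) / (i2, i4) grouping in the original question, but the result is the same, and a contiguous array avoids shuffling individual fields into a temporary buffer before the load.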
Paul.
> On May 11, 2020, at 11:02 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>
> Hi Remi,
>
> For some reason this method does not defer to the fromArray equivalent.
>
> Can you try with the following patch?
>
> http://cr.openjdk.java.net/~psandoz/panama/vector-from-values-using-from-array/webrev/
>
> I shall also investigate further.
>
> Paul.
>
>> On May 9, 2020, at 11:52 AM, Remi Forax <forax at univ-mlv.fr> wrote:
>>
>> Hi all,
>> this may be obvious, but do we agree that IntVector.fromValues is not optimized and really does create an array, destroying any hope of performance?
>>
>> I'm trying to see the difference between
>>
>> public int hashCode() {
>>   return i1 ^ i2 ^ i3 ^ i4;
>> }
>>
>> and
>>
>> public int hashCode() {
>>   var v1 = IntVector.fromValues(IntVector.SPECIES_64, i1, i3);
>>   var v2 = IntVector.fromValues(IntVector.SPECIES_64, i2, i4);
>>   var result = v1.lanewise(VectorOperators.XOR, v2);
>>   return result.lane(0) ^ result.lane(1);
>> }
>>
>> but looking at the generated assembly (below), the allocations of the two arrays are still there,
>> too bad, because the last 6 instructions are more or less what I was expecting.
>>
>>
>> 0x00007fbb383324dc: mov 0x14(%rsi),%r11d ;*getfield i3 {reexecute=0 rethrow=0 return_oop=0}
>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2@16 (line 14)
>> 0x00007fbb383324e0: mov 0xc(%rsi),%ebp
>> 0x00007fbb383324e3: mov 0x120(%r15),%r8
>> 0x00007fbb383324ea: mov %r8,%r10
>> 0x00007fbb383324ed: add $0x18,%r10
>> 0x00007fbb383324f1: cmp 0x130(%r15),%r10
>> 0x00007fbb383324f8: jae 0x00007fbb383325db
>> 0x00007fbb383324fe: mov %r10,0x120(%r15)
>> 0x00007fbb38332505: prefetchw 0xc0(%r10)
>> 0x00007fbb3833250d: movq $0x1,(%r8)
>> 0x00007fbb38332514: prefetchw 0x100(%r10)
>> 0x00007fbb3833251c: movl $0x70cb1,0x8(%r8) ; {metadata({type array int})}
>> 0x00007fbb38332524: prefetchw 0x140(%r10)
>> 0x00007fbb3833252c: movl $0x2,0xc(%r8)
>> 0x00007fbb38332534: prefetchw 0x180(%r10)
>> 0x00007fbb3833253c: mov %ebp,0x10(%r8)
>> 0x00007fbb38332540: mov %r11d,0x14(%r8) ;*newarray {reexecute=0 rethrow=0 return_oop=0}
>> ; - java.util.Arrays::copyOf@1 (line 3584)
>> ; - jdk.incubator.vector.IntVector::fromValues@19 (line 553)
>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2@20 (line 14)
>> 0x00007fbb38332544: mov 0x18(%rsi),%r9d
>> 0x00007fbb38332548: mov 0x120(%r15),%rax ;*invokestatic extract {reexecute=0 rethrow=0 return_oop=0}
>> ; - jdk.incubator.vector.Int64Vector::laneHelper@16 (line 482)
>> ; - jdk.incubator.vector.Int64Vector::lane@36 (line 476)
>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2@64 (line 17)
>> 0x00007fbb3833254f: mov 0x10(%rsi),%ebp ;*getfield i2 {reexecute=0 rethrow=0 return_oop=0}
>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2@33 (line 15)
>> 0x00007fbb38332552: mov %rax,%r10
>> 0x00007fbb38332555: add $0x18,%r10
>> 0x00007fbb38332559: nopl 0x0(%rax)
>> 0x00007fbb38332560: cmp 0x130(%r15),%r10
>> 0x00007fbb38332567: jae 0x00007fbb3833260d
>> 0x00007fbb3833256d: mov %r10,0x120(%r15)
>> 0x00007fbb38332574: prefetchw 0xc0(%r10)
>> 0x00007fbb3833257c: movq $0x1,(%rax)
>> 0x00007fbb38332583: prefetchw 0x100(%r10)
>> 0x00007fbb3833258b: movl $0x70cb1,0x8(%rax) ; {metadata({type array int})}
>> 0x00007fbb38332592: prefetchw 0x140(%r10)
>> 0x00007fbb3833259a: movl $0x2,0xc(%rax)
>> 0x00007fbb383325a1: prefetchw 0x180(%r10)
>> 0x00007fbb383325a9: mov %ebp,0x10(%rax)
>> 0x00007fbb383325ac: mov %r9d,0x14(%rax) ;*newarray {reexecute=0 rethrow=0 return_oop=0}
>> ; - java.util.Arrays::copyOf@1 (line 3584)
>> ; - jdk.incubator.vector.IntVector::fromValues@19 (line 553)
>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2@44 (line 15)
>> 0x00007fbb383325b0: vmovq 0x10(%rax),%xmm0 ;*invokestatic extract {reexecute=0 rethrow=0 return_oop=0}
>> ; - jdk.incubator.vector.Int64Vector::laneHelper@16 (line 482)
>> ; - jdk.incubator.vector.Int64Vector::lane@36 (line 476)
>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2@64 (line 17)
>> 0x00007fbb383325b5: vpxor 0x10(%r8),%xmm0,%xmm0 ;*invokespecial <init> {reexecute=0 rethrow=0 return_oop=0}
>> ; - jdk.internal.vm.vector.VectorSupport$Vector::<init>@2 (line 104)
>> ; - jdk.incubator.vector.Vector::<init>@2 (line 1122)
>> ; - jdk.incubator.vector.AbstractVector::<init>@2 (line 67)
>> ; - jdk.incubator.vector.IntVector::<init>@2 (line 55)
>> ; - jdk.incubator.vector.Int64Vector::<init>@2 (line 58)
>> ; - jdk.incubator.vector.Int64Vector::vectorFactory@5 (line 169)
>> ; - jdk.incubator.vector.Int64Vector::vectorFactory@2 (line 41)
>> ; - jdk.incubator.vector.IntVector$IntSpecies::vectorFactory@5 (line 3718)
>> ; - jdk.incubator.vector.IntVector::fromValues@22 (line 553)
>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2@44 (line 15)
>> 0x00007fbb383325bb: vpextrd $0x1,%xmm0,%r11d
>> 0x00007fbb383325c1: vmovd %xmm0,%eax
>> 0x00007fbb383325c5: xor %r11d,%eax
>> 0x00007fbb383325c8: vzeroupper
>>
>> regards,
>> Rémi
>