IntVector.fromValues is not optimized away ?
forax at univ-mlv.fr
Mon May 11 19:59:51 UTC 2020
----- Original Message -----
> From: "Paul Sandoz" <paul.sandoz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "panama-dev at openjdk.java.net" <panama-dev at openjdk.java.net>
> Sent: Monday, May 11, 2020 21:42:16
> Subject: Re: IntVector.fromValues is not optimized away ?
> Swings and roundabouts.
>
> Unsurprisingly, a significant proportion of instructions involve shuffling field
> values into temporary buffers from which vector loads are performed.
>
> The current code and my patch result in a similar set of instructions but my
> patch is not as efficient because of a less optimal use of a vector
> instruction:
>
> vpxor %xmm0,%xmm1,%xmm0
>
> vs.
>
> vpxor 0x10(%r10),%xmm0,%xmm0
>
>
> HS could be smarter about gathering field values and eliding the intermediate
> varargs arrays for common layouts, e.g. by leveraging the gather functionality. But,
> in general, the vector load instructions prefer values linearly laid out in
> memory.
>
> My recommendation would be to use fromValues for constant or pre-computed vector
> values.
And what should I use for fields?
Can't pattern-matching rules for that be added to HS?
vmovq + vpxor => vpxor
vpinsrd + vpxor => vpxor,
etc.
>
> Paul.
>
Rémi
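
One way to read Paul's point about linear layout, as a sketch only: if the four values live in a single int[] in lane order, each 64-bit vector could be loaded straight from that array rather than from a temporary copy. The class below is a plain-Java model of that idea. The `Data` holder, the `lanes` field and the slot order are assumptions made for illustration, not code from this thread, and the two scalar XORs stand in for the vector loads named in the comments.

```java
public class ContiguousLayoutSketch {
    /** Hypothetical holder: i1, i3, i2, i4 stored contiguously instead of as four fields. */
    public static final class Data {
        private final int[] lanes;

        public Data(int i1, int i2, int i3, int i4) {
            // Slots 0..1 model v1 = (i1, i3); slots 2..3 model v2 = (i2, i4).
            this.lanes = new int[] { i1, i3, i2, i4 };
        }

        @Override
        public int hashCode() {
            // With the Vector API the two halves would be loaded as
            //   IntVector.fromArray(IntVector.SPECIES_64, lanes, 0)
            //   IntVector.fromArray(IntVector.SPECIES_64, lanes, 2)
            // and combined with lanewise(VectorOperators.XOR, ...).
            int lane0 = lanes[0] ^ lanes[2]; // i1 ^ i2
            int lane1 = lanes[1] ^ lanes[3]; // i3 ^ i4
            return lane0 ^ lane1;            // same value as i1 ^ i2 ^ i3 ^ i4
        }
    }

    public static void main(String[] args) {
        Data d = new Data(0x11, 0x22, 0x44, 0x88);
        System.out.println(Integer.toHexString(d.hashCode())); // prints ff
    }
}
```

Whether the JIT then folds the load into the vpxor, as in the second instruction form Paul quotes, is not something this model can show; it only fixes the data layout and checks that the result matches the scalar hashCode.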
>
>> On May 11, 2020, at 11:02 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>>
>> Hi Remi,
>>
>> For some reason this method does not defer to the fromArray equivalent.
>>
>> Can you try with the following patch?
>>
>> http://cr.openjdk.java.net/~psandoz/panama/vector-from-values-using-from-array/webrev/
>>
>> I shall also investigate further.
>>
>> Paul.
>>
>>> On May 9, 2020, at 11:52 AM, Remi Forax <forax at univ-mlv.fr> wrote:
>>>
>>> Hi all,
>>> this may be obvious, but do we agree that IntVector.fromValues is not optimized away
>>> and thus really creates an array, destroying any hope of performance?
>>>
>>> I'm trying to see the difference between
>>>
>>> public int hashCode() {
>>>   return i1 ^ i2 ^ i3 ^ i4;
>>> }
>>>
>>> and
>>>
>>> public int hashCode() {
>>>   var v1 = IntVector.fromValues(IntVector.SPECIES_64, i1, i3);
>>>   var v2 = IntVector.fromValues(IntVector.SPECIES_64, i2, i4);
>>>   var result = v1.lanewise(VectorOperators.XOR, v2);
>>>   return result.lane(0) ^ result.lane(1);
>>> }
>>>
>>> but looking at the generated assembly (below), the allocations of the two
>>> arrays are still there,
>>> too bad, because the last 6 instructions are more or less what I was expecting.
>>>
>>>
>>> 0x00007fbb383324dc: mov 0x14(%rsi),%r11d ;*getfield i3
>>> {reexecute=0 rethrow=0 return_oop=0}
>>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2 at 16 (line 14)
>>> 0x00007fbb383324e0: mov 0xc(%rsi),%ebp
>>> 0x00007fbb383324e3: mov 0x120(%r15),%r8
>>> 0x00007fbb383324ea: mov %r8,%r10
>>> 0x00007fbb383324ed: add $0x18,%r10
>>> 0x00007fbb383324f1: cmp 0x130(%r15),%r10
>>> 0x00007fbb383324f8: jae 0x00007fbb383325db
>>> 0x00007fbb383324fe: mov %r10,0x120(%r15)
>>> 0x00007fbb38332505: prefetchw 0xc0(%r10)
>>> 0x00007fbb3833250d: movq $0x1,(%r8)
>>> 0x00007fbb38332514: prefetchw 0x100(%r10)
>>> 0x00007fbb3833251c: movl $0x70cb1,0x8(%r8) ; {metadata({type array int})}
>>> 0x00007fbb38332524: prefetchw 0x140(%r10)
>>> 0x00007fbb3833252c: movl $0x2,0xc(%r8)
>>> 0x00007fbb38332534: prefetchw 0x180(%r10)
>>> 0x00007fbb3833253c: mov %ebp,0x10(%r8)
>>> 0x00007fbb38332540: mov %r11d,0x14(%r8) ;*newarray
>>> {reexecute=0 rethrow=0 return_oop=0}
>>> ; - java.util.Arrays::copyOf at 1 (line 3584)
>>> ; - jdk.incubator.vector.IntVector::fromValues at 19 (line 553)
>>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2 at 20 (line 14)
>>> 0x00007fbb38332544: mov 0x18(%rsi),%r9d
>>> 0x00007fbb38332548: mov 0x120(%r15),%rax ;*invokestatic extract
>>> {reexecute=0 rethrow=0 return_oop=0}
>>> ; - jdk.incubator.vector.Int64Vector::laneHelper at 16 (line 482)
>>> ; - jdk.incubator.vector.Int64Vector::lane at 36 (line 476)
>>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2 at 64 (line 17)
>>> 0x00007fbb3833254f: mov 0x10(%rsi),%ebp ;*getfield i2
>>> {reexecute=0 rethrow=0 return_oop=0}
>>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2 at 33 (line 15)
>>> 0x00007fbb38332552: mov %rax,%r10
>>> 0x00007fbb38332555: add $0x18,%r10
>>> 0x00007fbb38332559: nopl 0x0(%rax)
>>> 0x00007fbb38332560: cmp 0x130(%r15),%r10
>>> 0x00007fbb38332567: jae 0x00007fbb3833260d
>>> 0x00007fbb3833256d: mov %r10,0x120(%r15)
>>> 0x00007fbb38332574: prefetchw 0xc0(%r10)
>>> 0x00007fbb3833257c: movq $0x1,(%rax)
>>> 0x00007fbb38332583: prefetchw 0x100(%r10)
>>> 0x00007fbb3833258b: movl $0x70cb1,0x8(%rax) ; {metadata({type array int})}
>>> 0x00007fbb38332592: prefetchw 0x140(%r10)
>>> 0x00007fbb3833259a: movl $0x2,0xc(%rax)
>>> 0x00007fbb383325a1: prefetchw 0x180(%r10)
>>> 0x00007fbb383325a9: mov %ebp,0x10(%rax)
>>> 0x00007fbb383325ac: mov %r9d,0x14(%rax) ;*newarray
>>> {reexecute=0 rethrow=0 return_oop=0}
>>> ; - java.util.Arrays::copyOf at 1 (line 3584)
>>> ; - jdk.incubator.vector.IntVector::fromValues at 19 (line 553)
>>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2 at 44 (line 15)
>>> 0x00007fbb383325b0: vmovq 0x10(%rax),%xmm0 ;*invokestatic extract
>>> {reexecute=0 rethrow=0 return_oop=0}
>>> ; - jdk.incubator.vector.Int64Vector::laneHelper at 16 (line 482)
>>> ; - jdk.incubator.vector.Int64Vector::lane at 36 (line 476)
>>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2 at 64 (line 17)
>>> 0x00007fbb383325b5: vpxor 0x10(%r8),%xmm0,%xmm0 ;*invokespecial <init>
>>> {reexecute=0 rethrow=0 return_oop=0}
>>> ; - jdk.internal.vm.vector.VectorSupport$Vector::<init>@2 (line 104)
>>> ; - jdk.incubator.vector.Vector::<init>@2 (line 1122)
>>> ; - jdk.incubator.vector.AbstractVector::<init>@2 (line 67)
>>> ; - jdk.incubator.vector.IntVector::<init>@2 (line 55)
>>> ; - jdk.incubator.vector.Int64Vector::<init>@2 (line 58)
>>> ; - jdk.incubator.vector.Int64Vector::vectorFactory at 5 (line 169)
>>> ; - jdk.incubator.vector.Int64Vector::vectorFactory at 2 (line 41)
>>> ; - jdk.incubator.vector.IntVector$IntSpecies::vectorFactory at 5 (line 3718)
>>> ; - jdk.incubator.vector.IntVector::fromValues at 22 (line 553)
>>> ; - fr.umlv.vector.VectorizedHashCode$Data::hashCode2 at 44 (line 15)
>>> 0x00007fbb383325bb: vpextrd $0x1,%xmm0,%r11d
>>> 0x00007fbb383325c1: vmovd %xmm0,%eax
>>> 0x00007fbb383325c5: xor %r11d,%eax
>>> 0x00007fbb383325c8: vzeroupper
>>>
>>> regards,
>>> Rémi
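
For readers following the assembly above, the two repeated blocks that end at each `;*newarray` comment look like inlined bump-pointer allocations of a two-element int[]: load the current allocation pointer, add 0x18, compare against a limit, then write the header and the two elements. The sketch below only models that arithmetic. The header layout (8-byte mark word, 4-byte class pointer, 4-byte length) and the reading of 0x120(%r15) and 0x130(%r15) as the thread-local allocation top and end are assumptions inferred from the listing, not something stated in the thread; the class and method names are hypothetical.

```java
public class BumpAllocationSketch {
    // Assumed header layout, read off the stores in the listing:
    // mark word at +0x0, class pointer at +0x8, array length at +0xc.
    static final int MARK_BYTES = 8;
    static final int KLASS_BYTES = 4;
    static final int LENGTH_BYTES = 4;

    /** Size in bytes of an int[] of the given length, rounded up to 8-byte alignment. */
    public static long intArraySize(int length) {
        long raw = MARK_BYTES + KLASS_BYTES + LENGTH_BYTES + 4L * length;
        return (raw + 7) & ~7L;
    }

    /** Toy model of a thread-local allocation buffer. */
    public static final class Buffer {
        long top;
        final long end;

        public Buffer(long top, long end) {
            this.top = top;
            this.end = end;
        }

        /** Returns the new object's address, or -1 where the compiled code would jump to its slow path. */
        public long allocate(long size) {
            long newTop = top + size;   // add $0x18,%r10
            if (newTop >= end) {        // cmp 0x130(%r15),%r10 ; jae <slow path>
                return -1;
            }
            long address = top;
            top = newTop;               // mov %r10,0x120(%r15)
            return address;
        }
    }

    public static void main(String[] args) {
        long size = intArraySize(2);
        System.out.println("0x" + Long.toHexString(size)); // prints 0x18

        Buffer buffer = new Buffer(0x1000, 0x1020);
        System.out.println(buffer.allocate(size)); // prints 4096, the old top
        System.out.println(buffer.allocate(size)); // prints -1, the buffer is exhausted
    }
}
```

Under these assumptions the 0x18 in the listing is 16 header bytes plus two 4-byte ints, which also matches the element stores at offsets 0x10 and 0x14.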