[ a newby question about vectorized shuffling / permutation ]
Paul Sandoz
paul.sandoz at oracle.com
Tue Jun 22 22:47:46 UTC 2021
Hi Denis,
Thanks for sharing, perhaps you can share the complete JMH benchmark?
In your current approach there may be no benefit to using a vector for the right side, since you unpack the lane elements, just use the array elements directly as input to the broadcast eq.
I would be interested in seeing the shuffle version, if you have it. Some Shuffle/Mask stuff is still work in progress, but it could still be a good test case.
Paul.
> On Jun 20, 2021, at 2:26 PM, Denis Gabaydulin <gabaden at gmail.com> wrote:
>
> If somebody is interested in the result, here's my solution (which is
> twice faster on SPECIES_128).
>
> https://gist.github.com/sherman/6d8a4103ba5bb3d2911684b8c842647f
>
> Benchmark with the intersection in the middle of arrays.
>
> Benchmark Mode Cnt Score Error Units
> IntersectionBenchmark.hasIntersectionScalar thrpt 10 3.785 ± 0.095 ops/us
> IntersectionBenchmark.hasIntersectionVector thrpt 10 7.008 ± 0.795 ops/us
>
> What I do:
> Changed the shuffling op by the broadcast op.
>
> Here's the most significant part in the asm:
>
> https://gist.github.com/sherman/7788e37b667940f31531ce74cb53b4e5
>
> Would be very appreciated any suggestions and comments on how it can
> be improved in terms of throughput?
>
> On Wed, Jun 2, 2021 at 2:04 PM Denis Gabaydulin <gabaden at gmail.com> wrote:
>>
>> Hi, folks!
>>
>> Have a newbie question about the API.
>> I'm trying to explore a new and shiny vector API with simple but useful examples.
>>
>> One of that is to build a SIMD version to find any intersection of two sorted lists.
>>
>> E.g.
>> hasIntersection([1, 3, 4, 10], [2, 6, 6, 11]) => false
>> hasIntersection([1, 3, 4, 10], [2, 3, 6, 11]) => true
>>
>> A mental model of this algorithm which I have:
>>
>> 1). to get a part of an array one
>> 2). compare it n-times which corresponding part of an array two.
>> 3). each time we need to compare two parts of arrays we need to shuffle (permutate) a left part by a mask. E.g. 1, 3, 4, 10 -> 10, 1, 3, 4
>> 4). if one of a number is the same, return true, otherwise, move.
>> In SIMD instructions looks like I need two major things.
>>
>> 1). vectorized comparison instruction
>> 2). vectorized shuffle instruction.
>>
>> The first one, I can see in the printed asm.
>> compare in API -> vpcmpeqd
>>
>> The question, which function of the vector API should I use to getting something like _mm_shuffle_epi32 ?
>>
>>
>>
>>
>
> On Wed, Jun 2, 2021 at 2:04 PM Denis Gabaydulin <gabaden at gmail.com> wrote:
>>
>> Hi, folks!
>>
>> Have a newbie question about the API.
>> I'm trying to explore a new and shiny vector API with simple but useful examples.
>>
>> One of that is to build a SIMD version to find any intersection of two sorted lists.
>>
>> E.g.
>> hasIntersection([1, 3, 4, 10], [2, 6, 6, 11]) => false
>> hasIntersection([1, 3, 4, 10], [2, 3, 6, 11]) => true
>>
>> A mental model of this algorithm which I have:
>>
>> 1). to get a part of an array one
>> 2). compare it n-times which corresponding part of an array two.
>> 3). each time we need to compare two parts of arrays we need to shuffle (permutate) a left part by a mask. E.g. 1, 3, 4, 10 -> 10, 1, 3, 4
>> 4). if one of a number is the same, return true, otherwise, move.
>> In SIMD instructions looks like I need two major things.
>>
>> 1). vectorized comparison instruction
>> 2). vectorized shuffle instruction.
>>
>> The first one, I can see in the printed asm.
>> compare in API -> vpcmpeqd
>>
>> The question, which function of the vector API should I use to getting something like _mm_shuffle_epi32 ?
>>
>>
>>
>>
More information about the panama-dev
mailing list