VectorAPI: SubAll intrinsics for byte, short, float and double
Lupusoru, Razvan A
razvan.a.lupusoru at intel.com
Tue May 1 23:46:30 UTC 2018
Just to chime in - I completely agree that for FP we could choose parallel instead of sequential operations. In fact, I imagine that in the near future we will definitely have a data parallel implementation, so it would be good to think about how the API would allow that to happen (either by default or explicitly).
There are a couple of reasons we have not done the data parallel approach yet:
- We are reusing the reduction nodes that the vectorizer uses, and those are required to be sequential. We would have to add metadata to the reduction nodes to keep track of how the reduction is intended to be done.
- Since our priority right now is to have some vector implementation for each method in the API, it is easier as a first step to use the sequential order for FP (there are existing examples in the code that we can follow).
Thanks again for your thorough review!
--Razvan
-----Original Message-----
From: panama-dev [mailto:panama-dev-bounces at openjdk.java.net] On Behalf Of Paul Sandoz
Sent: Tuesday, May 01, 2018 9:30 AM
To: Halimi, Jean-Philippe <jean-philippe.halimi at intel.com>
Cc: panama-dev at openjdk.java.net
Subject: Re: VectorAPI: SubAll intrinsics for byte, short, float and double
> On May 1, 2018, at 4:33 AM, Halimi, Jean-Philippe <jean-philippe.halimi at intel.com> wrote:
>
> Hi Paul,
>
> From my understanding, it is possible to reduce subAll() into addAll().neg() in the case of integral types. Horizontal arithmetic is allowed in this case because the order of operation does not matter. In fact, the add reduction we transform subAll into does that.
Excellent!
The second patch (webrev_subAll_BS_v1.0) looks good.
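The integral case can be checked with a small scalar model. This is only a sketch: the class and method names are hypothetical, and it assumes subAll means "start from zero and subtract each lane in order", which is not spelled out in this thread. Because byte arithmetic wraps, the sequential subtraction and the negated sum agree for every input.

```java
public class SubAllIntegral {
    // Hypothetical scalar model of subAll (assumption: the accumulator starts at 0
    // and each lane is subtracted in order).
    static byte subAllSequential(byte[] lanes) {
        byte acc = 0;
        for (byte x : lanes) {
            acc = (byte) (acc - x); // wraps modulo 256
        }
        return acc;
    }

    // The same result computed as an add reduction followed by a negation.
    static byte addAllThenNeg(byte[] lanes) {
        byte acc = 0;
        for (byte x : lanes) {
            acc = (byte) (acc + x); // wraps modulo 256
        }
        return (byte) -acc;
    }

    public static void main(String[] args) {
        byte[] lanes = {100, 100, 100, -128}; // chosen so the sum overflows a byte
        System.out.println(subAllSequential(lanes));
        System.out.println(addAllThenNeg(lanes));
    }
}
```

Both calls return the same value even though the intermediate sums overflow, which is why the order of operations does not matter for integral lanes.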
> For FP, you are right saying we need to keep the order because of the limited precision.
Right, but we also have some wiggle room to change that order, trading predictable results (compared to the scalar operation) for speed. It all depends on how we specify the behavior and whether we require some configuration for loosening or controlling precision.
> In this case, I am not aware of a data parallel approach we could use to speed up the computation.
>
e.g.
v1 = a b c d
v2 = shuffle v1 = c d c d
v3 = v1 + v2 = (a + c) (b + d) … …
v4 = shuffle v3 = (b + d) (b + d) … …
v5 = v4 + v3 = ((a + c) + (b + d)) … … …
For FP we may need to retain sequential and data parallel approaches.
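To illustrate why both orders may be worth keeping, here is a scalar sketch of the two reduction orders. It is not Vector API code; the class and method names are hypothetical, and the pairwise version assumes the lane count is a power of two. Each round adds the upper half of the lanes onto the lower half, mirroring the shuffle-and-add steps above.

```java
public class ReductionOrder {
    // Sequential order: (((0 + a) + b) + c) + d, matching a scalar loop.
    static float sumSequential(float[] v) {
        float acc = 0.0f;
        for (float x : v) {
            acc += x;
        }
        return acc;
    }

    // Pairwise order: (a + c) + (b + d) for four lanes.
    // Assumption: v.length is a power of two.
    static float sumPairwise(float[] v) {
        float[] lanes = v.clone();
        for (int half = lanes.length / 2; half >= 1; half /= 2) {
            for (int i = 0; i < half; i++) {
                lanes[i] += lanes[i + half]; // models "shuffle down, then add"
            }
        }
        return lanes[0];
    }

    public static void main(String[] args) {
        float[] v = {1.0e8f, 1.0f, -1.0e8f, 1.0f};
        System.out.println(sumSequential(v)); // 1.0
        System.out.println(sumPairwise(v));   // 2.0
    }
}
```

With this input the sequential order loses the first 1.0f when it is added to 1.0e8f, while the pairwise order happens to give the exact answer. Neither order is more accurate in general; the point is only that the results can differ.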
There is probably a rich vein of academic literature on this topic, perhaps including vectorized Kahan summation (something that, if important, we should add as a separate operation IMHO to return two values, the sum and sum with compensation).
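For reference, a scalar sketch of Kahan (compensated) summation follows. The names are hypothetical, and returning the pair {sum, compensation} is only one possible reading of the "two values" idea; a vectorized form would presumably keep one sum and one compensation per lane, which is not shown here.

```java
public class KahanSum {
    // Returns {sum, compensation}. The compensation holds the rounding error
    // of the most recent addition, which is fed back into the next step.
    static float[] kahanSum(float[] v) {
        float sum = 0.0f;
        float c = 0.0f;
        for (float x : v) {
            float y = x - c;   // apply the error carried over from the last step
            float t = sum + y; // low-order bits of y may be lost here
            c = (t - sum) - y; // recover what was lost (as a negated error)
            sum = t;
        }
        return new float[] {sum, c};
    }

    // Plain left-to-right summation for comparison.
    static float naiveSum(float[] v) {
        float sum = 0.0f;
        for (float x : v) {
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] v = new float[9];
        v[0] = 1.0e8f;
        java.util.Arrays.fill(v, 1, 9, 1.0f); // eight ones after one large value
        System.out.println(naiveSum(v));
        System.out.println(kahanSum(v)[0]);
    }
}
```

For this input the naive sum stays at 1.0e8f because each 1.0f is below half the float spacing at that magnitude, while the compensated sum reaches 100000008.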
> Let me know if I missed your point.
>
You got it.
Thanks,
Paul.
> Thanks,
>
> Jp
>
> -----Original Message-----
> From: Paul Sandoz [mailto:paul.sandoz at oracle.com]
> Sent: Monday, April 30, 2018 4:51 PM
> To: Halimi, Jean-Philippe <jean-philippe.halimi at intel.com>
> Cc: panama-dev at openjdk.java.net
> Subject: Re: VectorAPI: SubAll intrinsics for byte, short, float and double
>
> Hi Jp,
>
> Looks ok. Can we derive subAll from addAll().neg()? The additional negation might be an acceptable cost, but I am uncertain of the FP behavior.
>
> IIUC, for reductive addition or subtraction, the accumulated value is kept in the first lane of the destination register and the src lane element to subtract is shuffled down for each iteration. In effect it preserves the sequential order, but I wonder if there are faster data parallel approaches if we are relaxed about rounding producing different results?
>
> Thanks,
> Paul.
>
>> On Apr 30, 2018, at 10:10 AM, Halimi, Jean-Philippe <jean-philippe.halimi at intel.com> wrote:
>>
>> Hi all,
>>
>>
>>
>> I would like to share a patch adding support for the subAll intrinsic for byte, short, long, float and double types in VectorAPI.
>>
>>
>>
>> Could you please review the two following patches?
>>
>> http://cr.openjdk.java.net/~jphalimi/webrev_subAll_FP_v1.1/
>>
>> http://cr.openjdk.java.net/~jphalimi/webrev_subAll_BS_v1.0/
>>
>>
>>
>> Thank you,
>>
>>
>>
>> Jp
>>
>