Vector API - Request to expose horizontal adds
Paul Sandoz
paul.sandoz at oracle.com
Mon Apr 26 23:00:29 UTC 2021
Hi Stephane,
[Apologies for the delay in replying. Emails from non-members were queued up for approval and we forgot to approve ‘em]
Right now there is no direct support for such a hardware instruction. IIRC it is used indirectly to perform reductions when composed with itself or other vector instructions.
It's possible to emulate it using rearrange/shuffle operations [*], but that will result in many vector hardware instructions (and optimization of such operations is still a work in progress). In principle C2 could detect a pattern of vector operations and replace it with a horizontal add, but that is likely fragile and perhaps too specialized to be worth the cost.
It’s an unusual instruction in some respects, not quite fitting in with the API design of lanewise operations, nor reduction operations. Such an addition would require some special API treatment, perhaps a new special form of VectorOperators.Operator.
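To illustrate the remark about reductions, here is a minimal scalar sketch in plain Java (no Vector API involved). It models a 4-lane horizontal add with the lane layout of _mm_hadd_epi32 and applies it to itself twice, after which every lane holds the sum of the original lanes. The class and method names (HaddReduction, hadd128, reduceAdd) are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;

public class HaddReduction {
    // Scalar model of a 4-lane horizontal add (layout of _mm_hadd_epi32):
    // result = [a0+a1, a2+a3, b0+b1, b2+b3]
    public static int[] hadd128(int[] a, int[] b) {
        return new int[] {
            a[0] + a[1], a[2] + a[3],
            b[0] + b[1], b[2] + b[3]
        };
    }

    // Composing the horizontal add with itself: two steps for four lanes.
    public static int[] reduceAdd(int[] v) {
        int[] step1 = hadd128(v, v);      // [v0+v1, v2+v3, v0+v1, v2+v3]
        return hadd128(step1, step1);     // every lane = v0+v1+v2+v3
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(reduceAdd(new int[] {1, 2, 3, 4})));
        // prints [10, 10, 10, 10]
    }
}
```

In general a vector of N lanes needs log2(N) such steps, which is why the instruction tends to show up inside reductions rather than as an operation in its own right.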
I am inclined to sit and wait to see how important this is, then look more broadly at other similar kinds of hardware instructions, to see if a common pattern emerges that we can capture, before committing to a special case.
Paul.
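[*]
// Assumes: import jdk.incubator.vector.*; and --add-modules jdk.incubator.vector
// Assumed species: 8 int lanes, matching the 256-bit intrinsic emulated below
VectorSpecies<Integer> SPECIES = IntVector.SPECIES_256;
IntVector v1 = IntVector.fromArray(SPECIES, new int[]{10, 11, 12, 13, 14, 15, 16, 17}, 0);
IntVector v2 = IntVector.fromArray(SPECIES, new int[]{20, 21, 22, 23, 24, 25, 26, 27}, 0);
// Negative (exceptional) indexes select lanes from the second vector, v2
// s1 picks the first element of each adjacent pair, s2 the second
VectorShuffle<Integer> s1 = SPECIES.shuffleFromArray(new int[]{0, 2, -8, -6, 4, 6, -4, -2}, 0);
VectorShuffle<Integer> s2 = SPECIES.shuffleFromArray(new int[]{1, 3, -7, -5, 5, 7, -3, -1}, 0);
// emulate _mm256_hadd_epi32
IntVector v3 = v1.rearrange(s1, v2);
IntVector v4 = v1.rearrange(s2, v2);
// v5 = [a0+a1, a2+a3, b0+b1, b2+b3, a4+a5, a6+a7, b4+b5, b6+b7], with a = v1, b = v2
IntVector v5 = v3.add(v4);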
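
When checking an emulation like the one in [*], a scalar reference for the expected lane layout can be useful. The following is a minimal sketch in plain Java, assuming the documented behaviour of _mm256_hadd_epi32: adjacent pairs are summed within each 128-bit half, with the sums from the first input placed before those from the second. The names HaddReference and hadd256 are hypothetical, not part of any API.

```java
import java.util.Arrays;

public class HaddReference {
    // Scalar model of _mm256_hadd_epi32 for two 8-lane int inputs.
    public static int[] hadd256(int[] a, int[] b) {
        int[] r = new int[8];
        // Each 128-bit half (4 int lanes) is processed independently.
        for (int half = 0; half < 2; half++) {
            int base = half * 4;
            r[base]     = a[base]     + a[base + 1];
            r[base + 1] = a[base + 2] + a[base + 3];
            r[base + 2] = b[base]     + b[base + 1];
            r[base + 3] = b[base + 2] + b[base + 3];
        }
        return r;
    }

    public static void main(String[] args) {
        int[] a = {10, 11, 12, 13, 14, 15, 16, 17};
        int[] b = {20, 21, 22, 23, 24, 25, 26, 27};
        System.out.println(Arrays.toString(hadd256(a, b)));
        // prints [21, 25, 41, 45, 29, 33, 49, 53]
    }
}
```

The inputs are the same values loaded into v1 and v2 in [*], so the printed array is what v5 would be expected to contain if the shuffle-based emulation matches the intrinsic.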
> On Mar 23, 2021, at 4:14 PM, Stephane Dubedat <stephane.dubedat2 at mail.dcu.ie> wrote:
>
> Hi, this is my first message on an openjdk mailing list, I hope this is
> appropriate.
> I am currently translating a C# SIMD codebase to the Java Vector API and
> the only operation that I don't know how to translate is Horizontal Add.
> In C# it is Sse3.HorizontalAdd(sum, sum); and corresponds to the following
> intrinsic __m128d _mm_hadd_pd (__m128d a, __m128d b)
> So my question is: do you plan to expose this operation? And in the
> meantime how can I efficiently achieve this?
>
> Thanks,
> respectfully,
> Stephane Dubedat
>