[vectorIntrinsics] Processing interleaved data formats

Paul Sandoz paul.sandoz at oracle.com
Thu Apr 15 22:19:28 UTC 2021


Hi Peter,

If you sent your code as an attachment, then it got removed by the email server. Could you share it as a gist?

A shuffle can be used to rearrange elements in a vector. A shuffle redirects lane elements.

In the case of a matrix of complex elements it should be possible to use this method:

  Vector<E> rearrange(VectorShuffle<E> s, Vector<E> v);

e.g. load two vectors from the matrix buffer, then rearrange them to produce one vector containing only the real elements. However, the rearrange has some cost.
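
To make the lane redirection concrete, here is a rough sketch in plain Java, with arrays standing in for vectors. It is only a model of the idea, not Vector API code: RearrangeModel, rearrange and everyOther are made-up names, and the way a real VectorShuffle encodes indexes that refer to the second vector is deliberately not modelled.

```java
/** Plain-array model of a two-input rearrange; not Vector API code. */
final class RearrangeModel {

    /**
     * Lane i of the result takes element index[i] of the logical
     * concatenation [a, b]. Assumes a and b have the same length and
     * every index is in the range [0, 2 * a.length).
     */
    static double[] rearrange(double[] a, double[] b, int[] index) {
        int n = a.length;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            int src = index[i];
            out[i] = src < n ? a[src] : b[src - n];
        }
        return out;
    }

    /** Index pattern {first, first + 2, ...}: first = 0 picks reals, first = 1 picks imaginaries. */
    static int[] everyOther(int lanes, int first) {
        int[] index = new int[lanes];
        for (int i = 0; i < lanes; i++) {
            index[i] = first + 2 * i;
        }
        return index;
    }
}
```

With a = {R1, I1, R2, I2} and b = {R3, I3, R4, I4}, rearrange(a, b, everyOther(4, 0)) yields {R1, R2, R3, R4}; a second rearrange with everyOther(4, 1) yields the imaginary parts.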

There are also load/store operations that accept an indexMap e.g.:

  static IntVector fromArray(VectorSpecies<Integer> species,
                             int[] a, int offset,
                             int[] indexMap, int mapOffset);

These may be more appropriate for your needs, although again there is a cost to them. We don’t have loads/stores for an iota-like pattern, which might be more efficient for linear structures.
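
As a rough illustration of the indexMap idea for interleaved data, the sketch below builds a stride-2 index map and models the load in plain Java. It assumes the documented rule that lane N reads a[offset + indexMap[mapOffset + N]]; StridedIndexMap, strided and gather are hypothetical names, and the arrays stand in for vectors.

```java
/** Plain-array model of an index-mapped load; not Vector API code. */
final class StridedIndexMap {

    /** Builds {0, stride, 2 * stride, ...} with one entry per lane. */
    static int[] strided(int lanes, int stride) {
        int[] map = new int[lanes];
        for (int i = 0; i < lanes; i++) {
            map[i] = i * stride;
        }
        return map;
    }

    /** Models the load: lane i reads a[offset + indexMap[mapOffset + i]]. */
    static double[] gather(double[] a, int offset,
                           int[] indexMap, int mapOffset, int lanes) {
        double[] out = new double[lanes];
        for (int i = 0; i < lanes; i++) {
            out[i] = a[offset + indexMap[mapOffset + i]];
        }
        return out;
    }
}
```

With map = strided(lanes, 2), an offset of 2 * k gathers the real parts starting at complex element k, and an offset of 2 * k + 1 gathers the matching imaginary parts. With the real API the same map would be passed to the double flavour of the load above, i.e. DoubleVector.fromArray(species, data, offset, map, 0).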

It would be interesting to look at some C/asm vector code for inspiration. There is probably some clever way to do this with a combination of shuffle, rotate and blend (at the price of some redundant calculations) [*].

Paul.

[*]

Off the top of my head:

    R1 I1 R2 I2 R3 I3 R4 I4
              *
    r1 i1 r2 i2 r3 i3 r4 i4
              =
    R1*r1 I1*i1 R2*r2 I2*i2 R3*r3 I3*i3 R4*r4 I4*i4
              -
    I1*i1 R2*r2 I2*i2 R3*r3 I3*i3 R4*r4 I4*i4 R1*r1 // rotate by -1
              =
             v1

    R1 I1 R2 I2 R3 I3 R4 I4
              *
    i1 r1 i2 r2 i3 r3 i4 r4  // swap r and i using a shuffle
              =
    R1*i1 I1*r1 R2*i2 I2*r2 R3*i3 I3*r3 R4*i4 I4*r4
              +
    I4*r4 R1*i1 I1*r1 R2*i2 I2*r2 R3*i3 I3*r3 R4*i4 // rotate by 1
              =
             v2

    v = v1.blend(v2, mask(01010101))
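
The scheme above can be checked with a scalar model in plain Java, with arrays standing in for vector lanes. This is only a sketch of the lane arithmetic, not Vector API code; InterleavedComplexModel and its methods are made-up names, and rotate follows the sign convention used above (rotate by -1 moves lane 1 into lane 0).

```java
/** Scalar model of the multiply/rotate/blend scheme; arrays stand in for vector lanes. */
final class InterleavedComplexModel {

    /** Lane-wise product. */
    static double[] mul(double[] x, double[] y) {
        double[] r = new double[x.length];
        for (int i = 0; i < r.length; i++) r[i] = x[i] * y[i];
        return r;
    }

    /** Rotate by k: lane i takes lane (i - k) mod n, so k = -1 pulls lane 1 into lane 0. */
    static double[] rotate(double[] x, int k) {
        int n = x.length;
        double[] r = new double[n];
        for (int i = 0; i < n; i++) r[i] = x[Math.floorMod(i - k, n)];
        return r;
    }

    /** Swap the two lanes of each (real, imaginary) pair. */
    static double[] swapPairs(double[] x) {
        double[] r = new double[x.length];
        for (int i = 0; i < r.length; i++) r[i] = x[i ^ 1];
        return r;
    }

    /** Element-wise complex product of two interleaved arrays of equal, even length. */
    static double[] multiply(double[] a, double[] b) {
        double[] p = mul(a, b);              // R*r, I*i, ...
        double[] pRot = rotate(p, -1);
        double[] q = mul(a, swapPairs(b));   // R*i, I*r, ...
        double[] qRot = rotate(q, 1);
        double[] v = new double[a.length];
        for (int i = 0; i < v.length; i++) {
            // blend: even lanes from v1 = p - pRot, odd lanes from v2 = q + qRot
            v[i] = (i & 1) == 0 ? p[i] - pRot[i] : q[i] + qRot[i];
        }
        return v;
    }
}
```

For the scalar-times-vector case in the question below, the first operand would be the pattern realA, imagA repeated across all lanes. The odd lanes of v1 and the even lanes of v2 hold the redundant results that the blend discards.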

> On Apr 14, 2021, at 6:40 PM, Peter A <peter.abeles at gmail.com> wrote:
> 
> I'm attempting to vectorize interleaved data formats. For example, complex
> matrices are often stored in an array where elements alternate between real
> and imaginary values. This also comes up in low level image processing,
> e.g. YUV to RGB and debayer.
> 
> Here's what the actual math looks like for complex multiplication of a
> scalar value against a vector:
> 
> double realA = ...
> double imagA = ...
> 
> while (indexB < end) {
>    double realB = B.data[indexB++];
>    double imagB = B.data[indexB++];
> 
>    C.data[indexC++] = realA * realB - imagA * imagB;
>    C.data[indexC++] = realA * imagB + imagA * realB;
> }
> 
> My attempts so far have failed since at some point I need to address the
> memory not being "continuous" and I resort to non vectorized code. I did a
> quick search and it seems like I need to use a "shuffle" command. I did
> find a shuffle in the Vector API but it wasn't obvious how to apply it
> here, to me at least.
> 
> I suspect that some others here know exactly how to approach this problem.
> 
> Thanks,
> - Peter
> 
> P.S.  I'm sharing this code so others can learn from it too.
> -- 
> "Now, now my good man, this is no time for making enemies."    — Voltaire
> (1694-1778), on his deathbed in response to a priest asking that he
> renounce Satan.



More information about the panama-dev mailing list