Using the Vector API to access SIMD instructions

Fri Oct 8 16:50:18 UTC 2021

----- Original Message -----
> From: "raffaello giulietti" <raffaello.giulietti at gmail.com>
> To: "panama-dev at openjdk.java.net" <panama-dev at openjdk.java.net>
> Sent: Friday, October 8, 2021 17:16:41
> Subject: Using the Vector API to access SIMD instructions

> Hello,
> 
> I'm implementing two decimal floating-point formats, as defined by the
> IEEE 754 spec. I'm anticipating that they will become primitive classes
> (JEP 401) if I can make them perform reasonably well.
> 
> In Decimal128 I find myself coding something like
> 
>     int c0, c1, c2, c3;
>     long m;
>     int f, ph;
> 
>     ...
> 
>     int d0 = (int) ((m * c0) >>> f);
>     int d1 = (int) ((m * c1) >>> f);
>     int d2 = (int) ((m * c2) >>> f);
>     int d3 = (int) ((m * c3) >>> f);
> 
>     int e0 = c0 - ph * d0;
>     int e1 = c1 - ph * d1;
>     int e2 = c2 - ph * d2;
>     int e3 = c3 - ph * d3;
> 
> I would like to code these repetitive 4 + 4 lines as SIMD operations
> using the Vector API.
> 
> However, it seems to me that I would have to re-code them 3 times,
> depending on the preferred size of the underlying SIMD registers and
> hoping that the preferred size can be constant folded and dead code be
> eliminated by C2.

You can use IntVector.SPECIES_128, a fixed 128-bit shape (four int lanes),
instead of the preferred species:
  https://docs.oracle.com/en/java/javase/17/docs/api/jdk.incubator.vector/jdk/incubator/vector/IntVector.html#SPECIES_128
That way you write the code once and run it everywhere :)
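
As a rough illustration only, here is one way the eight scalar lines might look
with a fixed-shape species. This is a sketch under assumptions, not a tuned
implementation: the class and method names (DecimalLanes, quotients,
remainders) are made up for the example, the ci values are passed in an int[4],
and because m * ci is a 64-bit product the first step widens to
LongVector.SPECIES_256, which goes beyond what is suggested above. It needs
--add-modules jdk.incubator.vector on JDK 17, and whether C2 maps it to good
SIMD code on a given CPU has to be measured.

```java
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.LongVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical helper class, named only for this example.
final class DecimalLanes {
    // Fixed shapes: four int lanes (128 bits) and four long lanes (256 bits).
    private static final VectorSpecies<Integer> I128 = IntVector.SPECIES_128;
    private static final VectorSpecies<Long> L256 = LongVector.SPECIES_256;

    // d_i = (int) ((m * c_i) >>> f) for i = 0..3
    static int[] quotients(int[] c, long m, int f) {
        IntVector vc = IntVector.fromArray(I128, c, 0);
        // Sign-extend each int lane to a long lane, like the implicit
        // int-to-long promotion in the scalar expression m * c_i.
        LongVector wide = (LongVector) vc.convertShape(VectorOperators.I2L, L256, 0);
        LongVector shifted = wide.mul(m).lanewise(VectorOperators.LSHR, f);
        // Truncate each long lane back to int, like the scalar (int) cast.
        return ((IntVector) shifted.convertShape(VectorOperators.L2I, I128, 0)).toArray();
    }

    // e_i = c_i - ph * d_i for i = 0..3
    static int[] remainders(int[] c, int[] d, int ph) {
        IntVector vc = IntVector.fromArray(I128, c, 0);
        IntVector vd = IntVector.fromArray(I128, d, 0);
        return vc.sub(vd.mul(ph)).toArray();
    }

    // Scalar versions of the same formulas, for cross-checking the lanes.
    static int[] quotientsScalar(int[] c, long m, int f) {
        int[] d = new int[4];
        for (int i = 0; i < 4; i++) d[i] = (int) ((m * c[i]) >>> f);
        return d;
    }

    static int[] remaindersScalar(int[] c, int[] d, int ph) {
        int[] e = new int[4];
        for (int i = 0; i < 4; i++) e[i] = c[i] - ph * d[i];
        return e;
    }
}
```

The species are fixed, so the same source is used whatever the preferred vector
size of the machine is; on hardware without 256-bit registers the long step may
be slower than the scalar code.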

> Also, I would need to first fill a long[4] array with the scalar ci
> values, hoping that the array allocation on the heap is elided by C2.
> 
> Is my understanding correct?
> Does it make sense to use the Vector API in the first place for such
> small "fixed size" usages?
> 
> Thanks for any suggestion
> 
> 
> Greetings
> Raffaello

Rémi