_mm256_round_ps operation

Thu May 30 16:38:35 UTC 2024

Hi,

(I meant to refer to Math.rint rather than Math.round.)

Thank you for sharing the use case, very helpful. A colleague and I searched a little and we think we found some of your code, which helped us understand more.

https://github.com/babylonml/babylonml/blob/main/src/main/java/com/tornadoml/cpu/MatrixOperations.java#L107

Presumably this is more beneficial for lower-precision floating-point types, such as float32 or even float16?

I checked how the scalar implementations of Math.rint, Math.floor, and Math.ceil are intrinsified by the C2 compiler. On my AVX2 machine they compile down to a vroundsd instruction with a rounding mode of 0, 1, and 2, respectively, and they also auto-vectorize in loops. I think it would be relatively easy to add support for vector float/double for all three rounding modes as three unary operations.
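
For reference, here is a minimal sketch of the kind of scalar loops I mean. The class and method names (RoundingLoops, rint, floor, ceil) are made up for illustration, and whether C2 actually vectorizes such loops depends on the JDK build and the CPU, so treat this as an assumption to check rather than a guarantee. Note that Math.rint rounds ties to even and returns a double, whereas Math.round rounds ties toward positive infinity and returns an integral type.

```java
// Hypothetical helper class, for illustration only.
public final class RoundingLoops {
    private RoundingLoops() {}

    // Round each element to the nearest integral value, ties to even
    // (Math.rint semantics).
    static void rint(double[] src, double[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Math.rint(src[i]);
        }
    }

    // Round each element toward negative infinity.
    static void floor(double[] src, double[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Math.floor(src[i]);
        }
    }

    // Round each element toward positive infinity.
    static void ceil(double[] src, double[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Math.ceil(src[i]);
        }
    }
}
```

Each method expects dst to be at least as long as src. For example, rint maps 0.5, 1.5, and 2.5 to 0.0, 2.0, and 2.0, while Math.round would give 1, 2, and 3.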

I logged this issue: https://bugs.openjdk.org/browse/JDK-8333293

Paul.

> On May 29, 2024, at 12:58 AM, Andrii Lomakin <andrii0lomakin at gmail.com> wrote:
> 
> Hi Paul.
> Thank you for getting back to me.
> 
> Let me explain why I need it. You may find the request reasonable.
> I need it to calculate SoftMax in a two-phase fashion.
> 
> A straightforward calculation of e^x overflows even over a short range of inputs, so a typical trick is to normalize the exponent by finding the maximum value and subtracting it from each exponent, which leads to a three-phase calculation of SoftMax.
> However, it is possible to limit the exponent of e^x to a relatively short range using this formula:
> <New Bitmap image.bmp>
> That leads to the two-phase algorithm for the calculation of SoftMax.
> I suppose that other ML guys will find a vector round operation helpful in such cases.
> 
> I have implemented it with sign extraction, using shift and conversion operations, but I would prefer to use intrinsics instead.
> 
> 
> On Wed, May 29, 2024 at 1:20 AM Paul Sandoz <paul.sandoz at oracle.com> wrote:
> Hi Andrii,
> 
> You were looking in the right place, and you could not find it because it does not exist :-)
> 
> So far we have not encountered any demand, but it should be possible to add such support as long as we can be compatible with the behavior of Math.round. (We could also consider including compatible unary vector operations for Math.ceil and Math.floor.)
> 
> Paul.
> 
> > On May 25, 2024, at 11:59 PM, Andrii Lomakin <andrii0lomakin at gmail.com> wrote:
> > 
> > Hi guys.
> > 
> > I suppose I missed something, but I cannot find a roundp operation in the FloatVector API.
> > I have searched the usages of the unary() method but still cannot find it, though I did find neg and abs.
> > 
> > Could you point me in the right direction?
>