[aarch64-port-dev ] RFR: 8189107 - AARCH64: create intrinsic for pow

Andrew Haley aph at redhat.com
Wed Aug 22 13:43:25 UTC 2018


On 08/22/2018 11:04 AM, Andrew Dinn wrote:
> Thank you for the revised webrev and new test results. I am now working
> through them. 

I wonder about the validity of

     L1X+ x *(L2X+ x *(L3X+  x   *  (L4X+ x *(L5X+ x *L6X)))) is calculated as:

     L1X+ x *(L2X+ x *L3X)+  x^3 *  (L4X+ x *(L5X+ x *L6X)),

where L1X+ x *(L2X+ x *L3X)
      L4X+ x *(L5X+ x *L6X) are calculated simultaneously in vector (fmlavs)

      (On the range [0,0.1716])


This transformation looks like a variant of Estrin's scheme, but it's
not quite the same.  I can see no convincing reason why it should be
invalid, but its rounding and underflow behaviour will be different
from Horner's scheme.  Having said that, the use of fmla should mean
that the error is less than that of the original code, which didn't
use fused multiply-add at all.
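
To make the comparison concrete, here is a minimal sketch of the two
evaluation orders. The coefficient values are hypothetical placeholders
(the real L1X..L6X constants are not reproduced here), and the function
names are mine. It uses plain double arithmetic with no fused
multiply-add, so it only illustrates that the two forms agree exactly as
polynomials and may differ by rounding in floating point; it says nothing
about what fmla does to the error.

```python
from fractions import Fraction

# Hypothetical placeholder coefficients standing in for L1X..L6X.
# They are NOT the constants used in the pow intrinsic.
COEFFS = [1 / 3, 1 / 5, 1 / 7, 1 / 9, 1 / 11, 1 / 13]


def horner(x, c):
    """L1X + x*(L2X + x*(L3X + x*(L4X + x*(L5X + x*L6X))))"""
    acc = c[5]
    for k in (4, 3, 2, 1, 0):
        acc = c[k] + x * acc
    return acc


def split(x, c):
    """L1X + x*(L2X + x*L3X) + x^3*(L4X + x*(L5X + x*L6X))

    The two cubic-free halves are independent of each other, which is
    what allows them to be evaluated side by side in vector lanes.
    """
    low = c[0] + x * (c[1] + x * c[2])
    high = c[3] + x * (c[4] + x * c[5])
    return low + (x * x * x) * high


def demo():
    exact_coeffs = [Fraction(v) for v in COEFFS]
    for x in (0.0, 0.05, 0.1, 0.1716):
        # Exact rational evaluation of the same double inputs.
        exact = horner(Fraction(x), exact_coeffs)
        err_horner = abs(Fraction(horner(x, COEFFS)) - exact)
        err_split = abs(Fraction(split(x, COEFFS)) - exact)
        print(x, float(err_horner), float(err_split))


if __name__ == "__main__":
    demo()
```

Running demo() prints, for a few points in [0, 0.1716], the absolute
error of each ordering against an exact rational evaluation of the same
inputs. Underflow is not exercised by these placeholder values.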

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

