[aarch64-port-dev ] RFR: 8189107 - AARCH64: create intrinsic for pow
Andrew Haley
aph at redhat.com
Wed Aug 22 13:43:25 UTC 2018
On 08/22/2018 11:04 AM, Andrew Dinn wrote:
> Thank you for the revised webrev and new test results. I am now working
> through them.
I wonder about the validity of
L1X + x*(L2X + x*(L3X + x*(L4X + x*(L5X + x*L6X)))) is calculated as:
L1X + x*(L2X + x*L3X) + x^3*(L4X + x*(L5X + x*L6X)),
where L1X + x*(L2X + x*L3X) and
L4X + x*(L5X + x*L6X) are calculated simultaneously in vector (fmlavs)
(On the range [0,0.1716])
This transformation looks like a variant of Estrin's scheme, but it's
not quite the same. I can see no convincing reason why it should be
invalid, but its rounding and underflow behaviour will be different
from Horner's scheme. Having said that, the use of fmla should mean
that the error is smaller than that of the original code, which didn't
use fused multiply-add at all.
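
To make the comparison concrete, here is a minimal scalar sketch of the
two evaluation orders. It is not the intrinsic's code: the class name
PolySplit, the coefficient array C (placeholder values, not the real
L1X..L6X constants) and the final combining step are assumptions made
for illustration, and Math.fma (Java 9+) stands in for the vector fmla.
Both forms are algebraically the same polynomial; any difference in the
results comes from rounding alone.

```java
public class PolySplit {
    // Placeholder coefficients, chosen only for illustration.
    // These are NOT the L1X..L6X constants from the webrev.
    static final double[] C = {1.0, 0.5, 1.0 / 3, 0.25, 0.2, 1.0 / 6};

    // Horner's scheme: c0 + x*(c1 + x*(c2 + x*(c3 + x*(c4 + x*c5)))),
    // one dependent chain of five fused multiply-adds.
    static double horner(double[] c, double x) {
        double r = c[5];
        for (int i = 4; i >= 0; i--) {
            r = Math.fma(x, r, c[i]);
        }
        return r;
    }

    // Split form: two independent three-term Horner chains, which a
    // vector unit could evaluate side by side, joined by x^3.
    static double split(double[] c, double x) {
        double lo = Math.fma(x, Math.fma(x, c[2], c[1]), c[0]);
        double hi = Math.fma(x, Math.fma(x, c[5], c[4]), c[3]);
        double x3 = x * x * x;
        // Assumed combining step: lo + x^3 * hi with one more fma.
        return Math.fma(x3, hi, lo);
    }

    public static void main(String[] args) {
        // Scan the range quoted in the comment and report the largest
        // absolute difference between the two evaluation orders.
        double maxDiff = 0.0;
        for (int i = 0; i <= 1000; i++) {
            double x = 0.1716 * i / 1000;
            maxDiff = Math.max(maxDiff, Math.abs(horner(C, x) - split(C, x)));
        }
        System.out.println("max |horner - split| = " + maxDiff);
    }
}
```

A scan like this only shows how far the two orders drift apart for one
set of coefficients. It says nothing about which is closer to the true
value, and it does not exercise the underflow cases, which would need a
higher-precision reference and the real constants.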
--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671