[aarch64-port-dev ] RFR: 8189107 - AARCH64: create intrinsic for pow
Dmitrij Pochepko
dmitrij.pochepko at bell-sw.com
Thu Aug 23 12:31:29 UTC 2018
On 22/08/18 16:43, Andrew Haley wrote:
> On 08/22/2018 11:04 AM, Andrew Dinn wrote:
>> Thank you for the revised webrev and new test results. I am now working
>> through them.
> I wonder about the validity of
>
> L1X+ x *(L2X+ x *(L3X+ x * (L4X+ x *(L5X+ x *L6X)))) is calculated as:
>
> L1X+ x *(L2X+ x *L3X)+ x^3 * (L4X+ x *(L5X+ x *L6X)),
>
> where L1X+ x *(L2X+ x *L3X)
> L4X+ x *(L5X+ x *L6X) are calculated simultaneously in vector (fmlavs)
>
> (On the range [0,0.1716])
>
>
> This transformation looks like a variant of Estrin's scheme, but it's
> not quite the same. I can see no convincing reason why it should be
> invalid, but its rounding and underflow behaviour will be different
> from Horner's scheme. Having said that, the use of fmla should mean
> that the error is less than the original code, which didn't use fused
> multiply-add at all.
>
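For reference, here is a minimal Java sketch of the two evaluation orders quoted above. The coefficients are placeholders, not the real L1X..L6X constants from the patch, and Math.fma only stands in for the AArch64 fmla instruction. The class and method names are made up for illustration. The two halves are computed one after the other here, whereas the intrinsic is described as computing them simultaneously in a vector register.

```java
public class PolySplitSketch {
    // Placeholder coefficients (assumption): NOT the L1X..L6X values used by the intrinsic.
    static final double[] L = {0.6, 0.4, 0.3, 0.2, 0.2, 0.1};

    // Plain Horner form: L1 + x*(L2 + x*(L3 + x*(L4 + x*(L5 + x*L6)))).
    static double horner(double x) {
        return L[0] + x * (L[1] + x * (L[2] + x * (L[3] + x * (L[4] + x * L[5]))));
    }

    // Split form: (L1 + x*(L2 + x*L3)) + x^3 * (L4 + x*(L5 + x*L6)).
    // The two three-term halves are independent of each other.
    static double split(double x) {
        double lo = Math.fma(x, Math.fma(x, L[2], L[1]), L[0]);
        double hi = Math.fma(x, Math.fma(x, L[5], L[4]), L[3]);
        double x3 = x * x * x;
        return Math.fma(x3, hi, lo);
    }

    public static void main(String[] args) {
        // Scan the range mentioned in the quoted comment and report the largest gap.
        double maxDiff = 0.0;
        for (double x = 0.0; x <= 0.1716; x += 1e-5) {
            maxDiff = Math.max(maxDiff, Math.abs(horner(x) - split(x)));
        }
        System.out.println("max |horner - split| = " + maxDiff);
    }
}
```

Both forms are algebraically the same polynomial. Any difference printed by main comes only from rounding order, which is the concern raised above.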
Well, I suppose the most questionable range is where X is near 0 (that
is, when the input argument is near 1.0).
I created a separate brute-force test (run with -Xcomp) that compares
Math.pow with StrictMath.pow for all representable double values
within a given range, and it found no differences.
I used the input argument range 0.9999...1.0001 (so that X values in this
polynomial are in [0, 0.000049998]). This input range contains
1.351079888×10¹² double values, and the results were correct for all of
them.
(test source code:
http://cr.openjdk.java.net/~dpochepk/8189107/PowBruteForce.java)
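A much smaller sketch of the same idea follows. It is not the linked PowBruteForce.java: the class name, the helper names, the exponent 2.5 and the 1,000,000-value cap are all assumptions chosen to keep the example short. It walks consecutive doubles with Math.nextUp and compares the two pow results bit for bit.

```java
public class PowBruteForceSketch {
    // Number of nextUp steps from lo to hi, for positive finite doubles with lo <= hi.
    // Adjacent positive doubles have adjacent raw bit patterns.
    static long doublesBetween(double lo, double hi) {
        return Double.doubleToLongBits(hi) - Double.doubleToLongBits(lo);
    }

    // Compares Math.pow and StrictMath.pow on 'steps' consecutive doubles starting at lo.
    // The exponent y is a free parameter here (assumption: the original test may differ).
    static long countMismatches(double lo, long steps, double y) {
        long mismatches = 0;
        double x = lo;
        for (long i = 0; i < steps; i++) {
            long a = Double.doubleToLongBits(Math.pow(x, y));
            long b = Double.doubleToLongBits(StrictMath.pow(x, y));
            if (a != b) {
                mismatches++;
            }
            x = Math.nextUp(x);
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("doubles in [0.9999, 1.0001]: "
                + doublesBetween(0.9999, 1.0001));
        System.out.println("mismatches in first 1,000,000 values: "
                + countMismatches(0.9999, 1_000_000L, 2.5));
    }
}
```

Running it with -Xcomp, as in the test described above, forces compilation so that the compiled Math.pow path is the one being exercised.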
Thanks,
Dmitrij