[aarch64-port-dev ] RFR: 8189107 - AARCH64: create intrinsic for pow

Thu Aug 23 12:31:29 UTC 2018


On 22/08/18 16:43, Andrew Haley wrote:
> On 08/22/2018 11:04 AM, Andrew Dinn wrote:
>> Thank you for the revised webrev and new test results. I am now working
>> through them.
> I wonder about the validity of
>
>       L1X+ x *(L2X+ x *(L3X+  x   *  (L4X+ x *(L5X+ x *L6X)))) is calculated as:
>
>       L1X+ x *(L2X+ x *L3X)+  x^3 *  (L4X+ x *(L5X+ x *L6X)),
>
> where L1X+ x *(L2X+ x *L3X)
>        L4X+ x *(L5X+ x *L6X) are calculated simultaneously in vector (fmlavs)
>
>        (On the range [0,0.1716])
>
>
> This transformation looks like a variant of Estrin's scheme, but it's
> not quite the same.  I can see no convincing reason why it should be
> invalid, but its rounding and underflow behaviour will be different
> from Horner's scheme.  Having said that, the use of fmla should mean
> that the error is less than the original code, which didn't use fused
> multiply-add at all.
>
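
For readers following the thread, the two evaluation orders quoted above can be sketched in scalar Java. This is only an illustration under stated assumptions: the coefficients are hypothetical placeholders (not the real L1X..L6X constants), the class and method names are made up, and Math.fma stands in for the vector fmla instructions that the intrinsic actually uses.

```java
public class PolySplitSketch {
    // Hypothetical placeholder coefficients standing in for L1X..L6X.
    // They are NOT the constants used by the real intrinsic.
    static final double[] L = {0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625};

    // Horner's scheme: L1 + x*(L2 + x*(L3 + x*(L4 + x*(L5 + x*L6))))
    static double horner(double x) {
        double r = L[5];
        for (int i = 4; i >= 0; i--) {
            r = Math.fma(r, x, L[i]); // r = r*x + L[i], rounded once
        }
        return r;
    }

    // Split form: (L1 + x*(L2 + x*L3)) + x^3 * (L4 + x*(L5 + x*L6)).
    // The two halves are independent, so hardware can compute them in
    // parallel vector lanes; here they are simply computed one after another.
    static double split(double x) {
        double lo = Math.fma(Math.fma(L[2], x, L[1]), x, L[0]);
        double hi = Math.fma(Math.fma(L[5], x, L[4]), x, L[3]);
        double x3 = x * x * x;
        return Math.fma(x3, hi, lo);
    }

    // Distance in representable doubles; valid for finite values of the same sign.
    static long ulpDistance(double a, double b) {
        return Math.abs(Double.doubleToLongBits(a) - Double.doubleToLongBits(b));
    }

    public static void main(String[] args) {
        long worst = 0;
        // Sample the range [0, 0.1716] mentioned in the quoted comment.
        for (int i = 0; i <= 1000; i++) {
            double x = 0.1716 * i / 1000.0;
            worst = Math.max(worst, ulpDistance(horner(x), split(x)));
        }
        System.out.println("largest difference in ulps: " + worst);
    }
}
```

Both forms are algebraically identical, so any difference printed here comes only from rounding order, which is the concern raised above. A sampled comparison like this is not a substitute for exhaustive testing.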
Well, I suppose the most questionable range is where X is near 0 (that is,
when the input argument is near 1.0).
I created a separate brute-force test (run with -Xcomp), which compares
Math.pow with StrictMath.pow for all representable double values
within a given range, and it found no differences.
I used the input argument range 0.9999...1.0001 (so that X values in this
polynomial are in [0, 0.000049998]). This input range contains
1.351079888×10¹² double values, and the results were correct for all of
them.

(test source code: 
http://cr.openjdk.java.net/~dpochepk/8189107/PowBruteForce.java)
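
For illustration only, here is a minimal sketch of how such an exhaustive comparison can be structured. It is not the linked PowBruteForce.java: the class name, the method names, the exponent value and the one-million-value limit are all assumptions made for this example.

```java
public class PowRangeSketch {
    // Number of representable doubles in [lo, hi], for positive finite lo <= hi.
    // Positive doubles are ordered the same way as their raw bit patterns.
    static long countDoubles(double lo, double hi) {
        return Double.doubleToLongBits(hi) - Double.doubleToLongBits(lo) + 1;
    }

    // Walks `limit` consecutive doubles starting at `start` and counts the
    // inputs where Math.pow and StrictMath.pow are not bit-for-bit identical.
    // The exponent y is a hypothetical choice, not taken from the real test.
    static long countMismatches(double start, double y, long limit) {
        long mismatches = 0;
        double x = start;
        for (long i = 0; i < limit; i++) {
            long a = Double.doubleToLongBits(Math.pow(x, y));
            long b = Double.doubleToLongBits(StrictMath.pow(x, y));
            if (a != b) {
                mismatches++;
            }
            x = Math.nextUp(x); // step to the next representable double
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("doubles in range: " + countDoubles(0.9999, 1.0001));
        // Only a small prefix of the range, to keep the run short.
        System.out.println("mismatches: " + countMismatches(0.9999, 2.5, 1_000_000L));
    }
}
```

Stepping with Math.nextUp visits every representable value exactly once, which is what makes the comparison exhaustive rather than sampled. Whether any mismatches appear depends on the platform and on whether the intrinsic is in use.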

Thanks,
Dmitrij