Re: Faster Math ?

9 Nov 2017

Hi,

Here are very basic benchmark results from JaFaMa 2 (FastMathPerf), run
on my laptop (i7-6820HQ fixed at 2 GHz, JDK 8):
--- testing asin(double) ---
Loop on     Math.asin(double) took 6.675 s
Loop on FastMath.asin(double) took 0.162 s

--- testing acos(double) ---
Loop on     Math.acos(double) took 6.332 s
Loop on FastMath.acos(double) took 0.16 s

--- testing atan(double) ---
Loop on     Math.atan(double) took 0.766 s
Loop on FastMath.atan(double) took 0.167 s

--- testing sqrt(double) ---
Loop on     Math.sqrt(double), args in [0.0,10.0], took 0.095 s
Loop on FastMath.sqrt(double), args in [0.0,10.0], took 0.097 s
Loop on     Math.sqrt(double), args in [0.0,1.0E12], took 0.109 s
Loop on FastMath.sqrt(double), args in [0.0,1.0E12], took 0.093 s
Loop on     Math.sqrt(double), args in all magnitudes (>=0), took 0.091 s
Loop on FastMath.sqrt(double), args in all magnitudes (>=0), took 0.092 s

--- testing cbrt(double) ---
Loop on     Math.cbrt(double), args in [-10.0,10.0], took 1.152 s
Loop on FastMath.cbrt(double), args in [-10.0,10.0], took 0.195 s
Loop on     Math.cbrt(double), args in [-1.0E12,1.0E12], took 1.153 s
Loop on FastMath.cbrt(double), args in [-1.0E12,1.0E12], took 0.193 s
Loop on     Math.cbrt(double), args in all magnitudes, took 1.154 s
Loop on FastMath.cbrt(double), args in all magnitudes, took 0.272 s

--- testing cbrt(double) = pow(double, 1/3) ---
Loop on     Math.pow(double, 1/3), args in [-10.0,10.0], took 0.739 s
Loop on FastMath.cbrt(double), args in [-10.0,10.0], took 0.166 s
Loop on     Math.pow(double, 1/3), args in [-0.7,0.7], took 0.746 s
Loop on FastMath.cbrt(double), args in [-0.7,0.7], took 0.166 s
Loop on     Math.pow(double, 1/3), args in [-0.1,0.1], took 0.742 s
Loop on FastMath.cbrt(double), args in [-0.1,0.1], took 0.165 s
Loop on     Math.pow(double, 1/3), args in all magnitudes, took 0.753 s
Loop on FastMath.cbrt(double), args in all magnitudes, took 0.244 s
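
For readers who want to reproduce this kind of measurement, here is a minimal
sketch of a timing loop in the same spirit. It is not the FastMathPerf source:
the class name, method names, argument count and round count are all
hypothetical, and only java.lang.Math is timed, so JaFaMa is not required.
A harness such as JMH would likely give more trustworthy numbers than a
hand-rolled loop like this one.

```java
import java.util.Locale;
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

// Hypothetical harness for illustration only; not the FastMathPerf source.
final class MathLoopTimingSketch {

    // Fills an array with uniformly distributed arguments between min and max.
    static double[] randomArgs(int count, double min, double max, long seed) {
        Random random = new Random(seed);
        double[] args = new double[count];
        for (int i = 0; i < count; i++) {
            args[i] = min + (max - min) * random.nextDouble();
        }
        return args;
    }

    // Applies f to every argument, `rounds` times, and prints the elapsed time.
    // The accumulated sum is returned so the JIT cannot discard the calls.
    static double timeLoop(String label, DoubleUnaryOperator f, double[] args, int rounds) {
        double sink = 0.0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (double arg : args) {
                sink += f.applyAsDouble(arg);
            }
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.println(String.format(Locale.ROOT, "Loop on %s took %.3f s", label, seconds));
        return sink;
    }

    public static void main(String[] ignored) {
        double[] args = randomArgs(1_000_000, -1.0, 1.0, 42L);
        // The first pass mostly warms up the JIT; read the second pass.
        for (int pass = 0; pass < 2; pass++) {
            timeLoop("Math.asin(double)", Math::asin, args, 20);
            timeLoop("Math.acos(double)", Math::acos, args, 20);
            timeLoop("Math.atan(double)", Math::atan, args, 20);
        }
    }
}
```

Swapping in another implementation only requires passing a different method
reference to timeLoop.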

Conclusion:
- acos / asin / atan are quite slow in Math (asin: 6.7 s vs 0.16 s with
FastMath), which suggests they are not optimized by HotSpot intrinsics.

- cbrt() is slower than sqrt(): 1.1 s vs 0.1 s => about 10x slower
- cbrt() is slower than pow(1/3): 1.1 s vs 0.7 s => about 50% slower
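
One caveat when comparing cbrt() with pow(1/3): the two are not
interchangeable for all inputs. The sketch below (class and method names are
hypothetical) shows two points that follow from the Java language and the
Math.pow specification: the literal 1/3 is integer division and evaluates to
0, so the exponent has to be written 1.0 / 3.0, and pow returns NaN for a
negative base with a non-integer exponent, whereas cbrt accepts negative
arguments.

```java
// Hypothetical helper names, for illustration only.
final class CbrtVersusPowSketch {

    // Naive replacement: only meaningful for x >= 0.
    static double cbrtViaPow(double x) {
        return Math.pow(x, 1.0 / 3.0);
    }

    // Sign-aware variant: take the root of |x| and restore the sign.
    static double signedCbrtViaPow(double x) {
        return Math.copySign(Math.pow(Math.abs(x), 1.0 / 3.0), x);
    }

    public static void main(String[] ignored) {
        // Integer division: 1 / 3 is 0, so this is pow(8.0, 0).
        System.out.println("pow(8.0, 1 / 3)        = " + Math.pow(8.0, 1 / 3));
        // Negative base with a non-integer exponent gives NaN.
        System.out.println("cbrtViaPow(-27.0)      = " + cbrtViaPow(-27.0));
        System.out.println("signedCbrtViaPow(-27.0) = " + signedCbrtViaPow(-27.0));
        System.out.println("Math.cbrt(-27.0)       = " + Math.cbrt(-27.0));
    }
}
```

The results of pow(x, 1.0 / 3.0) and cbrt(x) can also differ in the last bits
for positive x, since 1.0 / 3.0 is not exactly one third.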

Are there any plans to improve these specific math operations?

Laurent

2017-11-09 14:33 GMT+01:00 Laurent Bourgès <bourges.laurent@gmail.com>:
...
I checked the latest JDK master: neither cbrt nor acos is an intrinsic.
However, cbrt(x) = pow(x, 1/3) for x >= 0, so it may be optimized...
Could someone tell me how cbrt() is actually implemented?
In the native fdlibm sources, there is no e_cbrt.c!
Thanks for your help,
Laurent
On 9 Nov 2017 10:52 AM, "Jonas Konrad" <me@yawk.at> wrote:
...
Hey,
Most functions in the Math class are intrinsics and will use native
instructions where available; see
http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/tip/src/share/vm/classfile/vmSymbols.hpp#l664
You can also test this yourself using JITWatch. There is no native call
overhead.
The standard library does not currently include less accurate but faster
Math functions; maybe someone else can say whether that is something to be
considered.
- Jonas Konrad
On 11/09/2017 10:00 AM, Laurent Bourgès wrote:
...
Hi,
The Marlin renderer (JEP 265) uses a few Math functions: sqrt, cbrt, acos...
Could you check whether the current JDK uses C2 intrinsics or fdlibm (native /
JNI overhead?) and tell me whether such functions are already highly optimized
in JDK 9 or 10?
Some people have implemented their own fast math, such as the Apache Commons
Math or JaFaMa libraries, which are 10x faster for acos / cbrt.
I wonder if I should implement my own cbrt function (for cubics) in pure Java,
as I need speed rather than the highest accuracy.
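
To make the idea of a reduced-accuracy cbrt in pure Java concrete, here is
one possible sketch. It is not how Marlin, JaFaMa or the JDK compute cbrt:
the class name, method name and the choice of technique (a bit-level first
guess followed by two Newton steps) are assumptions made for illustration.
It targets normal, finite doubles only; subnormal inputs would get a poor
first guess. By a rough estimate the relative error should be around 1e-5
after two steps, and whether it is actually faster than Math.cbrt on a given
JVM would have to be measured.

```java
// Hypothetical reduced-accuracy cube root, for illustration only.
final class FastCbrtSketch {

    // Exponent-bias correction for the bit trick: (2/3) * 1023 * 2^52.
    private static final long BIAS_TERM = ((1023L << 52) / 3) * 2;

    static double approxCbrt(double x) {
        if (x == 0.0 || Double.isNaN(x) || Double.isInfinite(x)) {
            return x; // cbrt of 0, NaN and +/-infinity is the argument itself
        }
        double a = Math.abs(x);
        // Dividing the raw bits by 3 roughly divides the exponent by 3,
        // which gives a first guess within a few percent for normal doubles.
        double y = Double.longBitsToDouble(Double.doubleToRawLongBits(a) / 3 + BIAS_TERM);
        // Two Newton steps on y^3 = a; each step roughly squares the relative error.
        y = (2.0 * y + a / (y * y)) / 3.0;
        y = (2.0 * y + a / (y * y)) / 3.0;
        return Math.copySign(y, x);
    }

    public static void main(String[] ignored) {
        double[] samples = {-1000.0, -0.5, 0.001, 2.0, 27.0, 1.0e12};
        for (double x : samples) {
            double approx = approxCbrt(x);
            double exact = Math.cbrt(x);
            System.out.println(x + ": approx=" + approx + " exact=" + exact
                    + " relErr=" + Math.abs((approx - exact) / exact));
        }
    }
}
```

Adding a third Newton step would tighten the error at the cost of one more
division per call.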
Would it be possible to have a public JDK FastMath API (much faster but
less accurate)?
Do you know if recent CPUs (Intel?) have dedicated instructions for such
math operations?
If so, why not use them?
Maybe that's part of the new vectorization API (Panama)?
Cheers,
Laurent Bourges
-- 
Laurent Bourgès