Hi,

Here are very basic benchmark results from JaFaMa 2 (FastMathPerf), made on my laptop (i7-6820HQ set @ 2 GHz + JDK 8):

--- testing asin(double) ---
Loop on Math.asin(double) took 6.675 s
Loop on FastMath.asin(double) took 0.162 s

--- testing acos(double) ---
Loop on Math.acos(double) took 6.332 s
Loop on FastMath.acos(double) took 0.16 s

--- testing atan(double) ---
Loop on Math.atan(double) took 0.766 s
Loop on FastMath.atan(double) took 0.167 s

--- testing sqrt(double) ---
Loop on Math.sqrt(double), args in [0.0,10.0], took 0.095 s
Loop on FastMath.sqrt(double), args in [0.0,10.0], took 0.097 s
Loop on Math.sqrt(double), args in [0.0,1.0E12], took 0.109 s
Loop on FastMath.sqrt(double), args in [0.0,1.0E12], took 0.093 s
Loop on Math.sqrt(double), args in all magnitudes (>=0), took 0.091 s
Loop on FastMath.sqrt(double), args in all magnitudes (>=0), took 0.092 s

--- testing cbrt(double) ---
Loop on Math.cbrt(double), args in [-10.0,10.0], took 1.152 s
Loop on FastMath.cbrt(double), args in [-10.0,10.0], took 0.195 s
Loop on Math.cbrt(double), args in [-1.0E12,1.0E12], took 1.153 s
Loop on FastMath.cbrt(double), args in [-1.0E12,1.0E12], took 0.193 s
Loop on Math.cbrt(double), args in all magnitudes, took 1.154 s
Loop on FastMath.cbrt(double), args in all magnitudes, took 0.272 s

--- testing cbrt(double) = pow(double, 1/3) ---
Loop on Math.pow(double, 1/3), args in [-10.0,10.0], took 0.739 s
Loop on FastMath.cbrt(double), args in [-10.0,10.0], took 0.166 s
Loop on Math.pow(double, 1/3), args in [-0.7,0.7], took 0.746 s
Loop on FastMath.cbrt(double), args in [-0.7,0.7], took 0.166 s
Loop on Math.pow(double, 1/3), args in [-0.1,0.1], took 0.742 s
Loop on FastMath.cbrt(double), args in [-0.1,0.1], took 0.165 s
Loop on Math.pow(double, 1/3), args in all magnitudes, took 0.753 s
Loop on FastMath.cbrt(double), args in all magnitudes, took 0.244 s

Conclusion:
- acos / asin / atan functions are quite slow: it confirms these are not optimized by HotSpot intrinsics.
- cbrt() is slower than sqrt(): 1.1 s vs 0.1 s => 10x slower
- cbrt() is slower than pow(1/3): 1.1 s vs 0.7 s => 50% slower

Any plan to enhance these specific math operations?

Laurent

2017-11-09 14:33 GMT+01:00 Laurent Bourgès <bourges.laurent@gmail.com>:
I checked in the latest jdk master and both cbrt / acos are NOT intrinsics.
However, cbrt(x) = pow(x, 1/3) (for x >= 0), so it may be optimized...
Could someone tell me how cbrt() is concretely implemented ?
In native fdlibm, there is no e_cbrt.c!
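A small sketch of the cbrt(x) = pow(x, 1/3) identity mentioned above, to show where it holds and where it does not. The class and helper names (CbrtVsPow, cbrtViaPow, signedCbrtViaPow) are made up for illustration. Math.pow returns NaN for a negative base with a non-integer exponent, so the identity only applies to x >= 0 unless the sign is handled separately:

```java
final class CbrtVsPow {

    // Hypothetical helper: cube root through pow, only meaningful for x >= 0.
    // The exponent must be written 1.0 / 3.0; with int literals, 1 / 3 is 0.
    static double cbrtViaPow(double x) {
        return Math.pow(x, 1.0 / 3.0);
    }

    // Hypothetical helper: same idea, but the sign is stripped before pow
    // and restored afterwards so that negative inputs also work.
    static double signedCbrtViaPow(double x) {
        return Math.copySign(Math.pow(Math.abs(x), 1.0 / 3.0), x);
    }

    public static void main(String[] args) {
        double[] samples = {8.0, -8.0, 0.001, -0.7};
        for (double x : samples) {
            System.out.printf("x=%g  Math.cbrt=%g  pow=%g  signed pow=%g%n",
                    x, Math.cbrt(x), cbrtViaPow(x), signedCbrtViaPow(x));
        }
    }
}
```

This is one reason a benchmark over a range such as [-10.0,10.0] is not a like-for-like comparison: for the negative half of the range, pow is not computing a cube root at all.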
Thanks for your help, Laurent
On 9 Nov 2017 at 10:52 AM, "Jonas Konrad" <me@yawk.at> wrote:
Hey,
Most functions in the Math class are intrinsic ( http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/tip/src/share/vm/classfile/vmSymbols.hpp#l664 ) and will use native instructions where available. You can also test this yourself using jitwatch. There is no native call overhead.
The standard library does not currently include less accurate but faster Math functions; maybe someone else can answer whether that is something to be considered.
- Jonas Konrad
On 11/09/2017 10:00 AM, Laurent Bourgès wrote:
Hi,
The Marlin renderer (JEP 265) uses a few Math functions: sqrt, cbrt, acos...
Could you check whether the current JDK uses C2 intrinsics or fdlibm (native / JNI overhead?) and tell me if such functions are already highly optimized in JDK 9 or 10?
Some people have implemented their own fast math libraries, such as Apache Commons Math or JaFaMa, which are 10x faster for acos / cbrt.
I wonder if I should implement my own cbrt function (cubics) in pure java as I do not need the highest accuracy but SPEED.
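To make the "own cbrt in pure Java" idea concrete, here is a minimal sketch of one common approach: a rough initial guess obtained by dividing the IEEE 754 bit pattern by 3, followed by two Newton steps. This is not the JaFaMa or Marlin implementation. The class name ApproxCbrt is hypothetical, and the magic constant is the one I recall from fdlibm's s_cbrt.c, so treat it as an assumption. The expected accuracy is only a relative error of roughly 1e-6, and special cases are delegated to Math.cbrt:

```java
final class ApproxCbrt {

    // Assumed constant (recalled from fdlibm's B1), shifted into the high word.
    // It re-biases the exponent after the integer division by 3.
    private static final long MAGIC = 715094163L << 32;

    // Approximate cube root: trades accuracy for speed.
    static double cbrt(double x) {
        double ax = Math.abs(x);
        // NaN, infinities, zero and subnormals: fall back to the exact version.
        if (Double.isNaN(x) || Double.isInfinite(x) || ax < Double.MIN_NORMAL) {
            return Math.cbrt(x);
        }
        // Rough guess (about 5 correct bits): dividing the raw bits by 3
        // approximately divides the exponent by 3.
        double t = Double.longBitsToDouble(Double.doubleToRawLongBits(ax) / 3 + MAGIC);
        // Two Newton steps for t^3 = ax; each roughly doubles the correct bits.
        t = (2.0 * t + ax / (t * t)) / 3.0;
        t = (2.0 * t + ax / (t * t)) / 3.0;
        // Restore the sign, since cbrt is an odd function.
        return Math.copySign(t, x);
    }

    public static void main(String[] args) {
        double[] samples = {8.0, -27.0, 0.001, 1.0e12};
        for (double x : samples) {
            double approx = cbrt(x);
            double exact = Math.cbrt(x);
            System.out.printf("x=%g  approx=%.10g  exact=%.10g  relErr=%.3g%n",
                    x, approx, exact, Math.abs((approx - exact) / exact));
        }
    }
}
```

A third Newton step would bring the error down further at the cost of one more division, so the number of steps is the knob for the accuracy/speed trade-off. Whether this actually beats Math.cbrt on a given JVM would need to be measured.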
Would it be feasible to have a public JDK FastMath API (much faster but less accurate)?
Do you know if recent CPUs (Intel?) have dedicated instructions for such math operations? Why not use them instead? Maybe that's part of the new Vectorization API (Panama)?
Cheers, Laurent Bourges
-- -- Laurent Bourgès