Math trig intrinsics and compiler options

Joseph D. Darcy Joe.Darcy at Sun.COM
Thu Jul 16 08:42:42 PDT 2009


Christian Thalinger wrote:
> Joseph D. Darcy wrote:
>   
>> Christian Thalinger wrote:
>>     
>>> gustav trede wrote:
>>>   
>>>       
>>>> Hello,
>>>>
>>>> Azeem Jiva told me an easy way to improve trig performance.
>>>> Changing the intrinsics to use an existing but faster path gives me a
>>>> boost of roughly 40% for Math.cos and Math.sin on Solaris x64.
>>>>
>>>>
>>>> library_call.cpp
>>>>
>>>> bool LibraryCallKit::inline_math_native(vmIntrinsics::ID id) {
>>>>   switch (id) {
>>>>
>>>>   case vmIntrinsics::_dcos:
>>>>     return Matcher::has_match_rule(Op_CosD)
>>>>       ? runtime_math(OptoRuntime::Math_D_D_Type(),
>>>>           CAST_FROM_FN_PTR(address, SharedRuntime::dcos), "COS")
>>>>       : false;
>>>>   case vmIntrinsics::_dsin:
>>>>     return Matcher::has_match_rule(Op_SinD)
>>>>       ? runtime_math(OptoRuntime::Math_D_D_Type(),
>>>>           CAST_FROM_FN_PTR(address, SharedRuntime::dsin), "SIN")
>>>>       : false;
>>>>   case vmIntrinsics::_dtan:
>>>>     return Matcher::has_match_rule(Op_TanD)
>>>>       ? runtime_math(OptoRuntime::Math_D_D_Type(),
>>>>           CAST_FROM_FN_PTR(address, SharedRuntime::dtan), "TAN")
>>>>       : false;
>>>>   case vmIntrinsics::_dlog:
>>>>     return Matcher::has_match_rule(Op_LogD)
>>>>       ? runtime_math(OptoRuntime::Math_D_D_Type(),
>>>>           CAST_FROM_FN_PTR(address, SharedRuntime::dlog), "LOG")
>>>>       : false;
>>>>   case vmIntrinsics::_dlog10:
>>>>     return Matcher::has_match_rule(Op_Log10D)
>>>>       ? runtime_math(OptoRuntime::Math_D_D_Type(),
>>>>           CAST_FROM_FN_PTR(address, SharedRuntime::dlog10), "LOG10")
>>>>       : false;
>>>>
>>>>
>>>> Is there any potential problem with such a patch?
>>>>     
>>>>         
>>> I'm not sure I understand this "patch".  Why should the code above be
>>> faster than the current code in HotSpot, which tries to inline the
>>> trigonometric functions?  Maybe the speedup you're seeing is because of
>>> the missing fast/slow path check and you are always using the slow path
>>> because of rounding?  Just a guess...
>>>   
>>>       
>> Hello.
>>
>> Can you explain the nature of the selection difference?
>>     
>
> It seems it all boils down to:
>
>     // Check: If PI/4 < abs(arg) then go slow
>   

Yes, the fsin/fcos instructions can be used directly if abs(arg) < pi/4.
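
To make the check concrete, here is a minimal sketch of the selection
logic only. It is not the HotSpot code: the names TrigPath,
select_trig_path and kPiOver4 are hypothetical, and the sketch just
mirrors the quoted comment ("If PI/4 < abs(arg) then go slow") without
issuing fsin/fcos itself.

```cpp
#include <cmath>

// Nearest double to pi/4 (hypothetical constant name).
static const double kPiOver4 = 0.7853981633974483;

// Which implementation a call would be routed to (hypothetical type).
enum class TrigPath { Fast, Slow };

// Mirrors the quoted check: if PI/4 < abs(arg), take the slow path
// (a software routine); otherwise the hardware instruction could be
// used directly on the argument.
inline TrigPath select_trig_path(double arg) {
  if (kPiOver4 < std::fabs(arg)) {
    return TrigPath::Slow;
  }
  return TrigPath::Fast;
}
```

With this sketch, select_trig_path(0.5) selects the fast path and
select_trig_path(1.0) selects the slow one.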

> As also explained in CR 4345903:
>
> The solution is to have the Math.{sin, cos} do their own argument
> reduction to [-pi/4, pi/4] and then call fsin/fcos; this will
> guarantee the specified accuracy and monotonicity properties.
>   

More than double precision is needed to hold the reduced argument, and 
as I recall, more than the double extended precision available on the 
x87 as well, so that approach is not directly applicable.
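
As a small illustration of why precision matters here, the sketch
below does the reduction using only double arithmetic and a
double-precision pi/2. The names kHalfPi and naive_reduce are
hypothetical, and this shows only one aspect of the problem (a
double-precision constant is not enough); it does not reproduce the
x87 extended-precision analysis.

```cpp
#include <cmath>

// Nearest double to pi/2; the true pi/2 is about 6.1e-17 larger.
static const double kHalfPi = 1.5707963267948966;

// Hypothetical naive reduction: subtract the nearest multiple of the
// double-precision pi/2, using double arithmetic throughout.
inline double naive_reduce(double x) {
  double k = std::nearbyint(x / kHalfPi);  // nearest multiple count
  return x - k * kHalfPi;                  // reduced argument
}
```

For x equal to the double nearest pi (3.141592653589793),
naive_reduce(x) is exactly 0.0, so a sine built on it would return 0.
The true reduced argument is about -1.22e-16, and an accurate sin(x)
is about 1.22e-16, so the naive result has no correct digits at all.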

-Joe

>   
>> Better semantics are provided if the intrinsified versions of sin, cos, 
>> etc. are always used to implement the java.lang.Math flavor of those 
>> methods.
>>     
>
> -- Christian
>   




More information about the hotspot-dev mailing list