Math trig intrinsics and compiler options
Joseph D. Darcy
Joe.Darcy at Sun.COM
Thu Jul 16 08:42:42 PDT 2009
Christian Thalinger wrote:
> Joseph D. Darcy wrote:
>
>> Christian Thalinger wrote:
>>
>>> gustav trede wrote:
>>>
>>>
>>>> Hello,
>>>>
>>>> Azeem Jiva told me an easy way to improve trig performance.
>>>> Changing the intrinsics to use an existing but faster path gives me a
>>>> boost of roughly 40% for Math.cos and Math.sin on Solaris x64.
>>>>
>>>>
>>>> library_call.cpp
>>>>
>>>> bool LibraryCallKit::inline_math_native(vmIntrinsics::ID id) {
>>>>   switch (id) {
>>>>
>>>>   case vmIntrinsics::_dcos:
>>>>     return Matcher::has_match_rule(Op_CosD)
>>>>       ? runtime_math(OptoRuntime::Math_D_D_Type(),
>>>>                      CAST_FROM_FN_PTR(address, SharedRuntime::dcos), "COS")
>>>>       : false;
>>>>   case vmIntrinsics::_dsin:
>>>>     return Matcher::has_match_rule(Op_SinD)
>>>>       ? runtime_math(OptoRuntime::Math_D_D_Type(),
>>>>                      CAST_FROM_FN_PTR(address, SharedRuntime::dsin), "SIN")
>>>>       : false;
>>>>   case vmIntrinsics::_dtan:
>>>>     return Matcher::has_match_rule(Op_TanD)
>>>>       ? runtime_math(OptoRuntime::Math_D_D_Type(),
>>>>                      CAST_FROM_FN_PTR(address, SharedRuntime::dtan), "TAN")
>>>>       : false;
>>>>   case vmIntrinsics::_dlog:
>>>>     return Matcher::has_match_rule(Op_LogD)
>>>>       ? runtime_math(OptoRuntime::Math_D_D_Type(),
>>>>                      CAST_FROM_FN_PTR(address, SharedRuntime::dlog), "LOG")
>>>>       : false;
>>>>   case vmIntrinsics::_dlog10:
>>>>     return Matcher::has_match_rule(Op_Log10D)
>>>>       ? runtime_math(OptoRuntime::Math_D_D_Type(),
>>>>                      CAST_FROM_FN_PTR(address, SharedRuntime::dlog10), "LOG10")
>>>>       : false;
>>>>
>>>>
>>>> Is there any potential problem with such a patch?
>>>>
>>>>
>>> I'm not sure I understand this "patch". Why should the code above be
>>> faster than the current code in HotSpot, which tries to inline the
>>> trigonometric functions? Maybe the speedup you're seeing is because of
>>> the missing fast/slow path check and you are always using the slow path
>>> because of rounding? Just a guess...
>>>
>>>
>> Hello.
>>
>> Can you explain the nature of the selection difference?
>>
>
> It seems it all boils down to:
>
> // Check: If PI/4 < abs(arg) then go slow
>
Yes, the fsin/fcos instructions can be used directly if abs(arg) < pi/4.
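
A minimal sketch of that fast/slow selection, for illustration only. It
is not the HotSpot code: the names TrigPath, choose_trig_path,
fast_path_sin, slow_path_sin and dispatch_sin are hypothetical, and
std::sin stands in for both the inlined fsin instruction and the
runtime call so that the sketch stays portable.

```cpp
#include <cassert>
#include <cmath>

// pi/4 rounded to double; the threshold named in the quoted comment.
constexpr double kPiOver4 = 0.7853981633974483;

enum class TrigPath { Fast, Slow };

// Hypothetical helper: picks the path for a given argument.
// NaN fails the comparison, so it takes the slow path in this sketch.
TrigPath choose_trig_path(double x) {
    return std::fabs(x) <= kPiOver4 ? TrigPath::Fast : TrigPath::Slow;
}

// Stand-in for the inlined fsin instruction (assumption: std::sin here).
double fast_path_sin(double x) {
    assert(std::fabs(x) <= kPiOver4);  // precondition of the fast path
    return std::sin(x);
}

// Stand-in for the runtime call taken for larger arguments.
double slow_path_sin(double x) {
    return std::sin(x);
}

// Routes the argument to one of the two paths.
double dispatch_sin(double x) {
    return choose_trig_path(x) == TrigPath::Fast ? fast_path_sin(x)
                                                 : slow_path_sin(x);
}
```

With this shape, a benchmark whose arguments mostly fall outside
[-pi/4, pi/4] pays for the range check and then takes the slow path
anyway, which would be consistent with the guess quoted above.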
> As also explained in CR 4345903:
>
> The solution is to have the Math.{sin, cos} do their own argument
> reduction to [-pi/4, pi/4] and then call fsin/fcos; this will
> guarantee the specified accuracy and monotonicity properties.
>
More than double precision is needed to hold the reduced argument, and
as I recall, more than the double extended precision available on the
x87 as well, so that approach is not directly applicable.
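
A small sketch of why precision is the crux here, under stated
assumptions. It reduces modulo pi rather than to [-pi/4, pi/4], and the
function names are hypothetical. The one-part version subtracts
multiples of pi rounded to double; the two-part version also subtracts
the bits that rounding discarded. This only helps for small multiples
and is far short of what a full reduction over the whole double range
needs.

```cpp
#include <cmath>

// pi rounded to the nearest double, and the part of pi that rounding discards.
constexpr double kPiHi = 3.141592653589793;
constexpr double kPiLo = 1.2246467991473532e-16;

// Uses sin(x) = (-1)^k * sin(x - k*pi), with pi taken as kPiHi only.
double sin_reduced_one_part(double x) {
    double k = std::nearbyint(x / kPiHi);   // nearest multiple of pi
    double r = x - k * kPiHi;               // reduced argument, missing kPiLo
    double s = std::sin(r);
    return std::fmod(k, 2.0) == 0.0 ? s : -s;
}

// Same identity, with pi taken as kPiHi + kPiLo.
double sin_reduced_two_part(double x) {
    double k = std::nearbyint(x / kPiHi);
    double r = (x - k * kPiHi) - k * kPiLo; // subtract the discarded bits too
    double s = std::sin(r);
    return std::fmod(k, 2.0) == 0.0 ? s : -s;
}
```

For x = kPiHi the one-part version returns zero, while the sine of that
double is about 1.22e-16, so every significant digit is lost. The
two-part version recovers it.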
-Joe
>
>> Better semantics are provided if the intrinsified versions of sin, cos,
>> etc. are always used to implement the java.lang.Math flavor of those
>> methods.
>>
>
> -- Christian
>