RFR: 8265491: Math Signum optimization for x86 [v5]

Jie Fu jiefu at openjdk.java.net
Wed Apr 21 23:56:22 UTC 2021


On Wed, 21 Apr 2021 18:44:14 GMT, Marcus G K Williams <github.com+168222+mgkwill at openjdk.org> wrote:

>> On x86, Math.signum() currently uses two floating-point compares and a copy-sign operation that moves data between general-purpose registers (GPRs) and XMM registers.
>> 
>> We can reduce this to one floating-point compare and a sign computation done entirely in XMM. We observe a ~25% performance improvement with this optimization.
>> 
>> Base:
>> 
>> Benchmark                       Mode  Cnt  Score   Error  Units
>> Signum._1_signumFloatTest       avgt    5  4.660 ± 0.040  ns/op
>> Signum._2_overheadFloat         avgt    5  3.314 ± 0.023  ns/op
>> Signum._3_signumDoubleTest      avgt    5  4.809 ± 0.043  ns/op
>> Signum._4_overheadDouble        avgt    5  3.313 ± 0.015  ns/op
>> 
>>  
>> Optimized:
>> signum intrinsic patch
>> 
>> Benchmark                       Mode  Cnt  Score   Error  Units
>> Signum._1_signumFloatTest       avgt    5  3.769 ± 0.015  ns/op
>> Signum._2_overheadFloat         avgt    5  3.312 ± 0.025  ns/op
>> Signum._3_signumDoubleTest      avgt    5  3.765 ± 0.005  ns/op
>> Signum._4_overheadDouble        avgt    5  3.309 ± 0.010  ns/op
>> 
>> 
>> Signed-off-by: Marcus G K Williams <marcus.williams at intel.com>
>
> Marcus G K Williams has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Reorder jcc equal,parity
>    
>    Signed-off-by: Marcus G K Williams <marcus.williams at intel.com>
>  - Use xorp to negate 1
>    
>    Signed-off-by: Marcus G K Williams <marcus.williams at intel.com>
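
For reference, the Java-level behavior that the quoted description is optimizing can be sketched roughly as below. This is only an illustration of the "two compares plus copy sign" shape, not the HotSpot intrinsic itself; the class name SignumReference is hypothetical. Zeros and NaN are returned unchanged, and every other value becomes 1.0 carrying the sign of the argument.

```java
public class SignumReference {
    // Zero (either sign) and NaN are returned as-is; this accounts for the
    // two floating-point checks. Everything else is 1.0 with the sign of d.
    static double signum(double d) {
        return (d == 0.0 || Double.isNaN(d)) ? d : Math.copySign(1.0, d);
    }

    // Same logic for float.
    static float signum(float f) {
        return (f == 0.0f || Float.isNaN(f)) ? f : Math.copySign(1.0f, f);
    }

    public static void main(String[] args) {
        double[] inputs = {-2.5, -0.0, 0.0, 3.0, Double.NaN, Double.NEGATIVE_INFINITY};
        for (double x : inputs) {
            // Print the sketch next to the library result for comparison.
            System.out.println(x + " -> " + signum(x) + " (Math.signum: " + Math.signum(x) + ")");
        }
    }
}
```

Any intrinsic has to preserve the special cases (signed zero and NaN), which is why at least one compare remains in the optimized version.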

src/hotspot/cpu/x86/vm_version_x86.cpp line 1703:

> 1701:   }
> 1702: #endif // !PRODUCT
> 1703:   if (FLAG_IS_DEFAULT(UseSignumIntrinsic) && (UseSSE >= 2)) {

`UseSSE >= 2` looks too strict to me, since signumF_reg only requires UseSSE >= 1.

How about something like this:

diff --git a/src/hotspot/cpu/x86/x86.ad b/src/hotspot/cpu/x86/x86.ad
index bbf5256c112..0a738704c4d 100644
--- a/src/hotspot/cpu/x86/x86.ad
+++ b/src/hotspot/cpu/x86/x86.ad
@@ -1598,6 +1598,16 @@ const bool Matcher::match_rule_supported(int opcode) {
         return false;
       }
       break;
+    case Op_SignumF:
+      if (UseSSE < 1) {
+        return false;
+      }
+      break;
+    case Op_SignumD:
+      if (UseSSE < 2) {
+        return false;
+      }
+      break;
 #endif // !LP64
   }
   return true;  // Match rules are supported by default.
diff --git a/src/hotspot/share/opto/library_call.cpp b/src/hotspot/share/opto/library_call.cpp
index 7cb7955c612..217ee39d709 100644
--- a/src/hotspot/share/opto/library_call.cpp
+++ b/src/hotspot/share/opto/library_call.cpp
@@ -1737,8 +1737,8 @@ bool LibraryCallKit::inline_math_native(vmIntrinsics::ID id) {
   case vmIntrinsics::_dpow:      return inline_math_pow();
   case vmIntrinsics::_dcopySign: return inline_double_math(id);
   case vmIntrinsics::_fcopySign: return inline_math(id);
-  case vmIntrinsics::_dsignum: return inline_double_math(id);
-  case vmIntrinsics::_fsignum: return inline_math(id);
+  case vmIntrinsics::_dsignum: return Matcher::match_rule_supported(Op_SignumD) ? inline_double_math(id) : false;
+  case vmIntrinsics::_fsignum: return Matcher::match_rule_supported(Op_SignumF) ? inline_math(id) : false;
 
    // These intrinsics are not yet correctly implemented
   case vmIntrinsics::_datan2:


This way, UseSignumIntrinsic can be enabled by default on x86, and each intrinsic is gated by the SSE level it actually needs.
Thanks.
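
To make the intent of the suggested diff easier to follow, here is a small Java model of the gating logic. It is only a sketch: SignumGate, Op, and the method names are hypothetical stand-ins for Matcher::match_rule_supported and the inline_math_native cases, and the SSE level is passed in as a plain int rather than read from a VM flag.

```java
public class SignumGate {
    // Hypothetical stand-ins for the Op_SignumF / Op_SignumD opcodes.
    enum Op { SIGNUM_F, SIGNUM_D }

    // Models the per-opcode checks from the suggested x86.ad change:
    // the float rule needs SSE >= 1, the double rule needs SSE >= 2.
    static boolean matchRuleSupported(Op op, int useSSE) {
        switch (op) {
            case SIGNUM_F: return useSSE >= 1;
            case SIGNUM_D: return useSSE >= 2;
            default:       return true; // supported by default
        }
    }

    // Models the library_call.cpp change: the intrinsic is used only when
    // the matcher reports support; otherwise the caller falls back to the
    // regular (non-intrinsic) path.
    static boolean tryInlineSignum(Op op, int useSSE) {
        return matchRuleSupported(op, useSSE);
    }

    public static void main(String[] args) {
        for (int sse = 0; sse <= 2; sse++) {
            System.out.println("UseSSE=" + sse
                    + " float=" + tryInlineSignum(Op.SIGNUM_F, sse)
                    + " double=" + tryInlineSignum(Op.SIGNUM_D, sse));
        }
    }
}
```

With the check made per opcode, a single `UseSSE >= 2` condition in vm_version_x86.cpp is no longer needed to keep the float intrinsic off unsupported configurations.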

-------------

PR: https://git.openjdk.java.net/jdk/pull/3581


More information about the hotspot-compiler-dev mailing list