[16] RFR(S): 8251525: AARCH64: Faster Math.signum(fp)

Thu Aug 13 11:51:59 UTC 2020

Hi Dmitry,

Some comments on shared code changes:

src/hotspot/share/opto/library_call.cpp:

+  case vmIntrinsics::_dsignum:
+    return UseSignumIntrinsic && (Matcher::match_rule_supported(Op_SignumD) ? inline_double_math(id) : false);

There's no need to repeat the UseSignumIntrinsic and 
Matcher::match_rule_supported(Op_SignumD) checks here.
C2Compiler::is_intrinsic_supported() already covers that.

src/hotspot/share/opto/signum.hpp:

   class SignumNode : public Node {
   public:
     SignumNode(Node* in) : Node(0, in) {}
     virtual int Opcode() const;
     virtual const Type *bottom_type() const { return NULL; }
     virtual uint ideal_reg() const { return Op_RegD; }
   };

Any particular reason to keep SignumNode? I don't see any and would just 
drop it.

Also, a dedicated header file for just a couple of nodes with 
trivial implementations looks like overkill. As an alternative 
location, intrinsicnode.cpp should be a better option.

Best regards,
Vladimir Ivanov

On 13.08.2020 14:04, Dmitry Chuyko wrote:
> Hello,
> 
> Please review a faster version of Math.signum() for AArch64.
> 
> Two new intrinsics (double and float) are introduced in general code, 
> with appropriate new nodes. A new JTreg test is added to cover the 
> intrinsic case (enabled only for aarch64).
> 
> The AArch64 implementation uses FACGT (compare absolute fp values) and BSL 
> (fp bit selection) to avoid branches and moves to non-fp registers and 
> back.
> 
> Performance results show ~30% better time in the benchmark with a black 
> hole [1] on Cortex. E.g. on random numbers 4.8 ns/op --> 3.5 ns/op, 
> overhead is 2.9 ns/op.
> 
> rfe: https://bugs.openjdk.java.net/browse/JDK-8251525
> webrev: http://cr.openjdk.java.net/~dchuyko/8251525/webrev.00/
> testing: jck, jtreg including new dedicated test
> 
> -Dmitry
> 
> [1] https://cr.openjdk.java.net/~dchuyko/8249198/DoubleSignum.java
>
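
For readers following the thread, the FACGT/BSL idea mentioned in the quoted message can be modelled in plain Java. This is only a sketch under assumptions, not the code from the webrev: it assumes the absolute-value compare yields an all-ones mask when |x| > 0 and zero for +0.0, -0.0 and NaN, that the mask's top bit is then cleared, and that the bit select takes bits from 1.0 where the mask is set and from x elsewhere. The class and method names (SignumSketch, signumBitSelect) are hypothetical, and the Java ternary merely stands in for the compare instruction.

```java
public final class SignumSketch {
    // Bit pattern of 1.0; its sign bit is clear.
    private static final long ONE_BITS = Double.doubleToRawLongBits(1.0);

    private SignumSketch() {}

    // Hypothetical model of a compare-and-bit-select signum.
    static double signumBitSelect(double x) {
        long xBits = Double.doubleToRawLongBits(x);
        // Models an absolute-value compare against zero: all ones when |x| > 0,
        // all zeros for +0.0, -0.0 and NaN (any comparison with NaN is false).
        long mask = (Math.abs(x) > 0.0) ? -1L : 0L;
        // Clear the top bit of the mask so the sign bit always comes from x.
        mask >>>= 1;
        // Bitwise select: exponent and mantissa from 1.0 where the mask is set,
        // all remaining bits (the sign, or everything for zero/NaN) from x.
        long resultBits = (ONE_BITS & mask) | (xBits & ~mask);
        return Double.longBitsToDouble(resultBits);
    }
}
```

For zero and NaN inputs the mask is zero, so the argument is returned unchanged, which matches the documented behaviour of Math.signum. For every other input only the sign bit of x survives and the remaining bits come from 1.0.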