[10] RFR: 8186915 - AARCH64: Intrinsify squareToLen and mulAdd

Andrew Haley aph at redhat.com
Thu Aug 31 17:27:39 UTC 2017


On 31/08/17 14:39, Dmitrij Pochepko wrote:
> please review a patch for "8186915 - AARCH64: Intrinsify squareToLen and 
> mulAdd" which adds respective intrinsics.
> 
> webrev: http://cr.openjdk.java.net/~dpochepk/8186915/webrev.01/
> CR: https://bugs.openjdk.java.net/browse/JDK-8186915
> 
> With these intrinsics implemented I see 8% improvement in specjvm2008 
> crypto.rsa: 2333.13 ops/m vs 2520.11 ops/m.

I don't see anything like that.  I see an improvement of 1.6%, which is
what I'd expect, given this profile:

samples  cum. samples  %        cum. %  symbol name
31866443 31866443      59.7443  59.7443 montgomerySquare
6125600  37992043      11.4845  71.2287 montgomeryMultiply
4036511  42028554       7.5678  78.7965 java.math.MutableBigInteger java.math.MutableBigInteger.divideMagnitude(java.math.MutableBigInteger, java.math.MutableBigInteger, boolean)~1
2056787  44085341       3.8561  82.6527 java.math.BigInteger java.math.BigInteger.oddModPow(java.math.BigInteger, java.math.BigInteger)~2
1145996  45231337       2.1486  84.8012 Ljava/math/MutableBigInteger;divideMagnitude(Ljava/math/MutableBigInteger;Ljava/math/MutableBigInteger;Z)Ljava/math/MutableBigInteger;%32
1140132  46371469       2.1376  86.9388 int[] java.math.BigInteger.montReduce(int[], int[], int, int)~2
558960   46930429       1.0480  87.9867 java.security.Provider$Service java.security.Provider.getService(java.lang.String, java.lang.String)

After your patch, I get:

samples  cum. samples  %        cum. %  symbol name
32574982 32574982      60.3583  60.3583 montgomerySquare
6196936  38771918      11.4823  71.8407 montgomeryMultiply
5103970  43875888       9.4572  81.2978 java.math.MutableBigInteger java.math.MutableBigInteger.divideMagnitude(java.math.MutableBigInteger, java.math.MutableBigInteger, boolean)
1991144  45867032       3.6894  84.9872 java.math.BigInteger java.math.BigInteger.oddModPow(java.math.BigInteger, java.math.BigInteger)~1
792336   46659368       1.4681  86.4554 mulAdd
586130   47245498       1.0860  87.5414 java.security.Provider$Service java.security.Provider.getService(java.lang.String, java.lang.String)

So the patch does speed up BigInteger.montReduce, which is dominated
by mulAdd. That makes sense, but montReduce was only about 2% of the
samples to begin with, so it's not a very large part of the total.
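
As a rough sanity check on the expected gain, here is a minimal sketch
of the usual Amdahl-style bound. It assumes that sample share is a fair
proxy for run time; the class and method names are made up for
illustration. Only the 2.1376% figure comes from the first profile.

```java
final class SpeedupBound {
    // Upper bound on the whole-program speedup if a fraction p of the
    // run time could be eliminated entirely (Amdahl's law with an
    // infinitely fast replacement).
    static double maxSpeedup(double p) {
        return 1.0 / (1.0 - p);
    }

    public static void main(String[] args) {
        // Share of samples in BigInteger.montReduce before the patch,
        // taken from the first profile. Treating it as a time share is
        // an assumption.
        double montReduceShare = 2.1376 / 100.0;
        double bound = (maxSpeedup(montReduceShare) - 1.0) * 100.0;
        System.out.printf("upper bound on improvement: %.2f%%%n", bound);
    }
}
```

Under that assumption even a free montReduce would buy only a little
over 2%, so a measured 1.6% is in the expected range and 8% is not.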

Your mul_add routine is less efficient than it should be.  It uses
32-bit multiply operations when it could use 64-bit ones, as
multiply_to_len already does.  Your square_to_len routine has the same
problem.
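
To illustrate the point in Java rather than assembly, here is a
sketch, not the code from the webrev and not the JDK's own mulAdd.
The class name, method names and the little-endian limb order are
assumptions made for the example. mulAdd32 does one 32x32->64 multiply
per int; mulAdd64 packs two ints into a 64-bit word and does one
64x64->128 multiply per pair, which is roughly what a mul/umulh pair
would give you on AArch64.

```java
import java.util.Arrays;

final class MulAddSketch {
    private static final long MASK32 = 0xFFFFFFFFL;

    // Baseline: out[i] += in[i] * k, one 32-bit limb at a time.
    // Limbs are little-endian here (an assumption of this sketch);
    // out must be at least as long as in. Returns the final carry.
    static int mulAdd32(int[] out, int[] in, int k) {
        long kl = k & MASK32;
        long carry = 0;
        for (int i = 0; i < in.length; i++) {
            long p = (in[i] & MASK32) * kl + (out[i] & MASK32) + carry;
            out[i] = (int) p;
            carry = p >>> 32;
        }
        return (int) carry;
    }

    // High 64 bits of the unsigned 128-bit product, built on the signed
    // Math.multiplyHigh (Java 9+) plus the standard sign correction.
    static long unsignedMulHigh(long a, long b) {
        return Math.multiplyHigh(a, b) + ((a >> 63) & b) + ((b >> 63) & a);
    }

    // Same result as mulAdd32, but two limbs per multiply.
    static int mulAdd64(int[] out, int[] in, int k) {
        long kl = k & MASK32;
        long carry = 0;
        int i = 0;
        for (; i + 1 < in.length; i += 2) {
            long w = (in[i] & MASK32) | ((long) in[i + 1] << 32);
            long o = (out[i] & MASK32) | ((long) out[i + 1] << 32);
            long lo = w * kl;                  // low 64 bits of w * k
            long hi = unsignedMulHigh(w, kl);  // high 64 bits of w * k
            lo += o;
            if (Long.compareUnsigned(lo, o) < 0) hi++;      // carry out of the add
            lo += carry;
            if (Long.compareUnsigned(lo, carry) < 0) hi++;  // carry out of the add
            out[i] = (int) lo;
            out[i + 1] = (int) (lo >>> 32);
            carry = hi;                        // always fits in 32 bits
        }
        if (i < in.length) {                   // odd length: one 32-bit step
            long p = (in[i] & MASK32) * kl + (out[i] & MASK32) + carry;
            out[i] = (int) p;
            carry = p >>> 32;
        }
        return (int) carry;
    }

    public static void main(String[] args) {
        int[] in = {0xFFFFFFFF, 0x12345678, 0x9ABCDEF0};
        int[] a = {0xFFFFFFFF, 0xFFFFFFFF, 0x00000001};
        int[] b = a.clone();
        int c1 = mulAdd32(a, in, 0xFFFFFFFF);
        int c2 = mulAdd64(b, in, 0xFFFFFFFF);
        System.out.println("match: " + (c1 == c2 && Arrays.equals(a, b)));
    }
}
```

The 64-bit loop runs half as many iterations and half as many
multiplies. Whether that shows up as a real win in the intrinsic
depends on the core's multiplier, so it needs measuring.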

There is an x86 example of how square_to_len should be done.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


More information about the hotspot-compiler-dev mailing list