[10] RFR: 8186915 - AARCH64: Intrinsify squareToLen and mulAdd
Andrew Haley
aph at redhat.com
Thu Aug 31 17:27:39 UTC 2017
On 31/08/17 14:39, Dmitrij Pochepko wrote:
> please review a patch for "8186915 - AARCH64: Intrinsify squareToLen and
> mulAdd" which adds respective intrinsics.
>
> webrev: http://cr.openjdk.java.net/~dpochepk/8186915/webrev.01/
> CR: https://bugs.openjdk.java.net/browse/JDK-8186915
>
> With these intrinsics implemented I see 8% improvement in specjvm2008
> crypto.rsa: 2333.13 ops/m vs 2520.11 ops/m.
I don't see anything like that. I see an improvement of 1.6%, which is
what I'd expect, given this profile:
samples   cum. samples  %        cum. %   symbol name
31866443  31866443      59.7443  59.7443  montgomerySquare
6125600   37992043      11.4845  71.2287  montgomeryMultiply
4036511   42028554      7.5678   78.7965  java.math.MutableBigInteger java.math.MutableBigInteger.divideMagnitude(java.math.MutableBigInteger, java.math.MutableBigInteger, boolean)~1
2056787   44085341      3.8561   82.6527  java.math.BigInteger java.math.BigInteger.oddModPow(java.math.BigInteger, java.math.BigInteger)~2
1145996   45231337      2.1486   84.8012  Ljava/math/MutableBigInteger;divideMagnitude(Ljava/math/MutableBigInteger;Ljava/math/MutableBigInteger;Z)Ljava/math/MutableBigInteger;%32
1140132   46371469      2.1376   86.9388  int[] java.math.BigInteger.montReduce(int[], int[], int, int)~2
558960    46930429      1.0480   87.9867  java.security.Provider$Service java.security.Provider.getService(java.lang.String, java.lang.String)
After your patch, I get:
samples   cum. samples  %        cum. %   symbol name
32574982  32574982      60.3583  60.3583  montgomerySquare
6196936   38771918      11.4823  71.8407  montgomeryMultiply
5103970   43875888      9.4572   81.2978  java.math.MutableBigInteger java.math.MutableBigInteger.divideMagnitude(java.math.MutableBigInteger, java.math.MutableBigInteger, boolean)
1991144   45867032      3.6894   84.9872  java.math.BigInteger java.math.BigInteger.oddModPow(java.math.BigInteger, java.math.BigInteger)~1
792336    46659368      1.4681   86.4554  mulAdd
586130    47245498      1.0860   87.5414  java.security.Provider$Service java.security.Provider.getService(java.lang.String, java.lang.String)
So we're seeing a boost to the performance of BigInteger.montReduce,
which is dominated by mulAdd, which makes sense, but it's not a very
large part of the total.
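
As a rough cross-check of the point above, Amdahl's law bounds the
overall gain from speeding up one method. The sketch below assumes
that montReduce's sample share in the first profile (2.1376%)
approximates its share of run time. The class and method names are
hypothetical, and the speedup factors tried are illustrative, not
measured.

```java
final class AmdahlSketch {
    // Overall speedup when a fraction f of the run time is made s times faster.
    public static double overallSpeedup(double f, double s) {
        return 1.0 / ((1.0 - f) + f / s);
    }

    public static void main(String[] args) {
        // Assumption: sample share of montReduce before the patch ~ time share.
        double f = 0.021376;
        // Illustrative speedup factors for the intrinsified part (not measured).
        double[] factors = {2.0, 4.0, Double.POSITIVE_INFINITY};
        for (double s : factors) {
            double gainPercent = (overallSpeedup(f, s) - 1.0) * 100.0;
            System.out.printf("montReduce %sx faster: overall gain %.2f%%%n", s, gainPercent);
        }
    }
}
```

Under this assumption even an infinitely fast montReduce would give
only about a 2.2% overall gain, so a measured figure somewhat below
that is plausible.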
Your mul_add routine is less efficient than it should be. It uses
32-bit multiply operations when it could use 64-bit ones, as
multiply_to_len already does. Your square_to_len routine has the
same problem.
There is an x86 example of how square_to_len should be done.
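
To illustrate the 32-bit versus 64-bit point for mul_add, here is a
minimal Java sketch, not the AArch64 stub itself. It assumes mulAdd
computes out += in * k over big-endian int arrays and returns the
final carry. mulAdd32 does one 32x32-bit multiply per word; mulAdd64
consumes two words per step with a single 64x64-bit multiply, which
is roughly what a mul/umulh pair would provide on AArch64. The class
and method names are hypothetical, and Math.multiplyHigh needs
Java 9 or later.

```java
final class MulAddSketch {
    private static final long LONG_MASK = 0xffffffffL;

    // One 32-bit word per multiply: out += in * k, returns the carry word.
    public static int mulAdd32(int[] out, int[] in, int offset, int len, int k) {
        long kLong = k & LONG_MASK;
        long carry = 0;
        int o = out.length - offset - 1;
        for (int j = len - 1; j >= 0; j--, o--) {
            long product = (in[j] & LONG_MASK) * kLong + (out[o] & LONG_MASK) + carry;
            out[o] = (int) product;
            carry = product >>> 32;
        }
        return (int) carry;
    }

    // Two 32-bit words per multiply, using a 64x64 -> 128-bit product.
    public static int mulAdd64(int[] out, int[] in, int offset, int len, int k) {
        long kLong = k & LONG_MASK;
        long carry = 0; // stays below 2^32 throughout
        int o = out.length - offset - 1;
        int j = len - 1;
        for (; j >= 1; j -= 2, o -= 2) {
            // Pack two input words and two accumulator words into 64-bit values.
            long v = ((long) in[j - 1] << 32) | (in[j] & LONG_MASK);
            long acc = ((long) out[o - 1] << 32) | (out[o] & LONG_MASK);
            long lo = v * kLong;
            // multiplyHigh is signed; add kLong back when v has its top bit set
            // to obtain the unsigned high half (kLong itself is non-negative).
            long hi = Math.multiplyHigh(v, kLong) + ((v >> 63) & kLong);
            long sum = lo + acc;
            if (Long.compareUnsigned(sum, lo) < 0) hi++;   // carry out of lo + acc
            long total = sum + carry;
            if (Long.compareUnsigned(total, sum) < 0) hi++; // carry out of + carry
            out[o] = (int) total;
            out[o - 1] = (int) (total >>> 32);
            carry = hi;
        }
        if (j == 0) { // odd length: one leftover word, handled the 32-bit way
            long product = (in[0] & LONG_MASK) * kLong + (out[o] & LONG_MASK) + carry;
            out[o] = (int) product;
            carry = product >>> 32;
        }
        return (int) carry;
    }
}
```

Both methods should give identical results for the same inputs; the
second simply halves the number of multiplies in the main loop.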
--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671