[aarch64-port-dev ] RFR: 8134869: AARCH64: GHASH intrinsic is not optimal

Edward Nevill edward.nevill at gmail.com
Wed Sep 2 08:52:51 UTC 2015


On Tue, 2015-09-01 at 15:45 +0100, Andrew Haley wrote:

> 
> http://cr.openjdk.java.net/~aph/8134869-ghash-1/
> 
> One other remark: the AES/GCM implementation has a lot of overhead.
> Some profile data (on x86) looks like this:

Looks good to me too. I have applied the patch and done a quick sanity check. I see a further ~14% improvement over Alexander's implementation, bringing the overall improvement to slightly over 100%.

One minor point: the following seems to have suffered some whitespace mangling.

   // Table vector lookup
-#define INSN(NAME, op)                                                                                       \
-  void NAME(FloatRegister Vd, SIMD_Arrangement T, FloatRegister Vn, unsigned registers, FloatRegister Vm) {  \
-    starti;                                                                                                  \
-    assert(T == T8B || T == T16B, "invalid arrangement");                                                    \
-    assert(0 < registers && registers <= 4, "invalid number of registers");                                  \
-    f(0, 31), f((int)T & 1, 30), f(0b001110000, 29, 21), rf(Vm, 16), f(0, 15);                               \
-    f(registers - 1, 14, 13), f(op, 12),f(0b00, 11, 10), rf(Vn, 5), rf(Vd, 0);                               \
+#define INSN(NAME, op)                                                 \
+  void NAME(FloatRegister Vd, SIMD_Arrangement T, FloatRegister Vn, unsigned registers, FloatRegister Vm) { \
+    starti;                                                            \
+    assert(T == T8B || T == T16B, "invalid arrangement");              \
+    assert(0 < registers && registers <= 4, "invalid number of registers"); \
+    f(0, 31), f((int)T & 1, 30), f(0b001110000, 29, 21), rf(Vm, 16), f(0, 15); \
+    f(registers - 1, 14, 13), f(op, 12),f(0b00, 11, 10), rf(Vn, 5), rf(Vd, 0); \
   }


It looks as though the mangling results from spaces being replaced with TABs at some point. I cannot see any difference in the above other than whitespace.
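For anyone who wants to confirm a whitespace-only change mechanically rather than by eye, here is a minimal sketch of the comparison. The helper names strip_whitespace and same_ignoring_whitespace are hypothetical and are not part of HotSpot or the webrev. It is a coarse check, since it also ignores whitespace inside string literals.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not HotSpot code): drop every whitespace character,
// including spaces, TABs and newlines.
static std::string strip_whitespace(const std::string& s) {
  std::string out;
  for (unsigned char c : s) {
    if (!std::isspace(c)) out.push_back(static_cast<char>(c));
  }
  return out;
}

// True when the two lines are identical once all whitespace is ignored.
static bool same_ignoring_whitespace(const std::string& a,
                                     const std::string& b) {
  return strip_whitespace(a) == strip_whitespace(b);
}
```

Running each removed line against its added counterpart through same_ignoring_whitespace should return true for every pair if the hunk really is whitespace only.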


All the best,
Ed.



