[aarch64-port-dev ] RFR: 8134869: AARCH64: GHASH intrinsic is not optimal
Edward Nevill
edward.nevill at gmail.com
Wed Sep 2 08:52:51 UTC 2015
On Tue, 2015-09-01 at 15:45 +0100, Andrew Haley wrote:
>
> http://cr.openjdk.java.net/~aph/8134869-ghash-1/
>
> One other remark: the AES/GCM implementation has a lot of overhead.
> Some profile data (on x86) looks like this:
Looks good to me also. I have applied the patch and done a quick sanity check. I see ~14% additional improvement over Alexander's implementation, bringing the overall improvement to slightly over 100%.
One minor point. The following seems to have suffered some whitespace mangling.
// Table vector lookup
-#define INSN(NAME, op) \
- void NAME(FloatRegister Vd, SIMD_Arrangement T, FloatRegister Vn, unsigned registers, FloatRegister Vm) { \
- starti; \
- assert(T == T8B || T == T16B, "invalid arrangement"); \
- assert(0 < registers && registers <= 4, "invalid number of registers"); \
- f(0, 31), f((int)T & 1, 30), f(0b001110000, 29, 21), rf(Vm, 16), f(0, 15); \
- f(registers - 1, 14, 13), f(op, 12),f(0b00, 11, 10), rf(Vn, 5), rf(Vd, 0); \
+#define INSN(NAME, op) \
+ void NAME(FloatRegister Vd, SIMD_Arrangement T, FloatRegister Vn, unsigned registers, FloatRegister Vm) { \
+ starti; \
+ assert(T == T8B || T == T16B, "invalid arrangement"); \
+ assert(0 < registers && registers <= 4, "invalid number of registers"); \
+ f(0, 31), f((int)T & 1, 30), f(0b001110000, 29, 21), rf(Vm, 16), f(0, 15); \
+ f(registers - 1, 14, 13), f(op, 12),f(0b00, 11, 10), rf(Vn, 5), rf(Vd, 0); \
}
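For anyone checking the encoding by hand, the field layout the macro emits can be sketched as a standalone function. This is only an illustration, not HotSpot code: `encode_table_lookup` and its parameters are hypothetical names, plain register numbers stand in for `FloatRegister`, and the `q` flag assumes that the low bit of the arrangement (`(int)T & 1` in the macro) is what separates T8B from T16B.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical standalone helper (not part of HotSpot). It packs the same
// fields, at the same bit positions, as the f(...) / rf(...) calls in INSN.
uint32_t encode_table_lookup(unsigned vd, bool q, unsigned vn,
                             unsigned registers, unsigned vm, unsigned op) {
  assert(0 < registers && registers <= 4);   // same check as the macro
  assert(vd < 32 && vn < 32 && vm < 32);     // 5-bit register fields
  assert(op < 2);                            // single-bit op field

  uint32_t insn = 0;                         // bits 31 and 15 stay 0
  insn |= (q ? 1u : 0u) << 30;               // f((int)T & 1, 30), assumed Q bit
  insn |= 0x070u << 21;                      // f(0b001110000, 29, 21)
  insn |= vm << 16;                          // rf(Vm, 16)
  insn |= (registers - 1) << 13;             // f(registers - 1, 14, 13)
  insn |= op << 12;                          // f(op, 12)
  insn |= vn << 5;                           // rf(Vn, 5); bits 11..10 stay 0
  insn |= vd;                                // rf(Vd, 0)
  return insn;
}
```

With op = 0, one table register, Vd = 0, Vn = 1, Vm = 2 and the 16B arrangement, the sketch yields 0x4E020020.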
It looks like the mangling is the result of spaces being replaced with TABs at some point. I cannot see any difference between the removed and added lines other than whitespace.
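A whitespace-only change like this can be confirmed mechanically rather than by eye. The following is a minimal sketch, assuming each removed line is compared against its added counterpart; `strip_whitespace` and `same_ignoring_whitespace` are hypothetical helper names, not part of any existing tool.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: return a copy of s with every whitespace
// character (spaces, TABs, newlines, ...) removed.
std::string strip_whitespace(const std::string& s) {
  std::string out;
  for (unsigned char c : s) {
    if (!std::isspace(c)) {
      out.push_back(static_cast<char>(c));
    }
  }
  return out;
}

// True when the two lines differ, at most, in their whitespace.
bool same_ignoring_whitespace(const std::string& a, const std::string& b) {
  return strip_whitespace(a) == strip_whitespace(b);
}
```

Note that this also ignores whitespace inside string literals, so it is a quick check rather than a proof of equivalence. In practice `diff -w` or `git diff -w` does the same job on whole files.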
All the best,
Ed.