[10] RFR: 8186915 - AARCH64: Intrinsify squareToLen and mulAdd
Dmitrij Pochepko
dmitrij.pochepko at bell-sw.com
Wed Sep 20 14:13:11 UTC 2017
Hi,
Andrew, do you believe this is ok to push?
Thanks,
Dmitrij
....
On 06.09.2017 20:39, Dmitrij wrote:
>
> I've compared it by calling the square and multiply methods and got the
> following results (ThunderX):
>
>
> Benchmark                                 (size, ints)  Mode  Cnt      Score     Error  Units
> BigIntegerBench.implMutliplyToLenReflect             1  avgt    5    186.930 ±  14.933  ns/op  (26% slower)
> BigIntegerBench.implMutliplyToLenReflect             2  avgt    5    194.095 ±  11.857  ns/op  (12% slower)
> BigIntegerBench.implMutliplyToLenReflect             3  avgt    5    233.912 ±   4.229  ns/op  (24% slower)
> BigIntegerBench.implMutliplyToLenReflect             5  avgt    5    308.349 ±  20.383  ns/op  (22% slower)
> BigIntegerBench.implMutliplyToLenReflect            10  avgt    5    475.839 ±   6.232  ns/op  (same)
> BigIntegerBench.implMutliplyToLenReflect            50  avgt    5   6514.691 ±  76.934  ns/op  (same)
> BigIntegerBench.implMutliplyToLenReflect            90  avgt    5  20347.040 ± 224.290  ns/op  (3% slower)
> BigIntegerBench.implMutliplyToLenReflect           127  avgt    5  41929.302 ± 181.053  ns/op  (9% slower)
>
> BigIntegerBench.implSquareToLenReflect               1  avgt    5    147.751 ±  12.760  ns/op
> BigIntegerBench.implSquareToLenReflect               2  avgt    5    173.804 ±   4.850  ns/op
> BigIntegerBench.implSquareToLenReflect               3  avgt    5    187.822 ±  34.027  ns/op
> BigIntegerBench.implSquareToLenReflect               5  avgt    5    251.995 ±  19.711  ns/op
> BigIntegerBench.implSquareToLenReflect              10  avgt    5    474.489 ±   1.040  ns/op
> BigIntegerBench.implSquareToLenReflect              50  avgt    5   6493.768 ±  33.809  ns/op
> BigIntegerBench.implSquareToLenReflect              90  avgt    5  19766.524 ±  88.398  ns/op
> BigIntegerBench.implSquareToLenReflect             127  avgt    5  38448.202 ± 180.095  ns/op
>
>
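> As we can see, squareToLen is at least as fast as multiplyToLen at every
> size measured: multiplyToLen is 12-26% slower for sizes 1-5, about the
> same at sizes 10 and 50, and 3-9% slower at sizes 90 and 127.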
>
> (I've updated benchmark code at
> http://cr.openjdk.java.net/~dpochepk/8186915/BigIntegerBench.java)
>
> Thanks,
> Dmitrij
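
The "(N% slower)" annotations in the quoted table look like the ratio of the multiplyToLen score to the squareToLen score at the same size. That formula is an assumption, not something stated in the message, and the class and method names below are hypothetical. A minimal sketch that recomputes the ratios from the posted scores, which agree with the annotations to within about one percentage point:

```java
public class SlowdownCheck {

    // Assumed formula: percent by which the multiply score exceeds the
    // square score for the same input size (both in ns/op).
    static double slowdownPercent(double multiplyNs, double squareNs) {
        return (multiplyNs / squareNs - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Scores copied from the quoted benchmark results (ns/op).
        int[] sizes = {1, 2, 3, 5, 10, 50, 90, 127};
        double[] multiply = {186.930, 194.095, 233.912, 308.349,
                             475.839, 6514.691, 20347.040, 41929.302};
        double[] square = {147.751, 173.804, 187.822, 251.995,
                           474.489, 6493.768, 19766.524, 38448.202};

        for (int i = 0; i < sizes.length; i++) {
            System.out.printf("size %3d: multiplyToLen is %.1f%% slower%n",
                    sizes[i], slowdownPercent(multiply[i], square[i]));
        }
    }
}
```

Note that several error bars overlap at small sizes (for example, size 3 has ± 34.027 ns/op on the square side), so the small-size percentages are rough.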
More information about the hotspot-compiler-dev
mailing list