[10] RFR(S): 8187684 - Intrinsify Math.multiplyHigh(long, long)
Dmitrij Pochepko
dmitrij.pochepko at bell-sw.com
Mon Sep 25 15:44:06 UTC 2017
On 25.09.2017 18:09, Andrew Haley wrote:
> On 20/09/17 14:08, Dmitrij Pochepko wrote:
>> I've created a small JMH benchmark:
>> http://cr.openjdk.java.net/~dpochepk/8187684/MultiplyHighBench.java to
>> test the improved performance and measured it on aarch64 (t88, R-Pi) and
>> x86_64 (i7-4770K). The benchmark shows about a 2.5x improvement on aarch64
>> and about 2x on x86_64
> By the way, this benchmark:
>
> for (int i = 0; i < 100; i++) {
>     op1 = Math.multiplyHigh(op1, op2++);
> }
> return Math.multiplyHigh(op1, op2);
>
> measures the latency of the multiplyHigh, not the throughput, because
> each iteration depends on the previous one. I don't know if that was
> your intent, but I would imagine we're more interested in throughput.
> Fast processors can issue a mulh every few clock cycles, but their
> latency may be considerably longer.
>
You're right. I've changed the benchmark to:
long op = System.currentTimeMillis();
long accum = 0;
for (int i = 0; i < 10000; i++) {
    accum += Math.multiplyHigh(op + i, op + i);
}
return accum;
and it shows even more improvement: about 3.5x on aarch64.
Thank you for noticing.
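
For readers comparing the two loop shapes discussed above, here is a minimal plain-Java sketch that puts them side by side. It is not the JMH benchmark linked in the thread. The class name MultiplyHighLoops and the methods dependentChain, independentSum and referenceHigh are hypothetical names chosen for illustration. referenceHigh is an assumed BigInteger-based cross-check of what Math.multiplyHigh returns (the high 64 bits of the signed 128-bit product); it is not part of the patch under review.

```java
import java.math.BigInteger;

public class MultiplyHighLoops {

    // Hypothetical reference: high 64 bits of the signed 128-bit product.
    // BigInteger.shiftRight is an arithmetic (sign-extending) shift.
    static long referenceHigh(long a, long b) {
        return BigInteger.valueOf(a)
                .multiply(BigInteger.valueOf(b))
                .shiftRight(64)
                .longValue();
    }

    // Latency-bound shape: each multiplyHigh consumes the previous result,
    // so the multiplies cannot overlap in the pipeline.
    static long dependentChain(long op1, long op2, int n) {
        for (int i = 0; i < n; i++) {
            op1 = Math.multiplyHigh(op1, op2++);
        }
        return Math.multiplyHigh(op1, op2);
    }

    // Throughput-bound shape: the multiplies are independent of each other;
    // only the cheap accumulation carries a dependency across iterations.
    static long independentSum(long op, int n) {
        long accum = 0;
        for (int i = 0; i < n; i++) {
            accum += Math.multiplyHigh(op + i, op + i);
        }
        return accum;
    }

    public static void main(String[] args) {
        long op = System.currentTimeMillis();
        // Print the results so the loops have an observable effect.
        System.out.println(dependentChain(op, op, 100));
        System.out.println(independentSum(op, 10000));
    }
}
```

Timing these methods by hand would not give trustworthy numbers; a harness such as JMH is still needed for that. The sketch only shows where the data dependency sits in each loop.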
More information about the hotspot-compiler-dev
mailing list