RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh
Vamsi Parasa
duke at openjdk.java.net
Fri Oct 15 19:34:00 UTC 2021
On Fri, 15 Oct 2021 16:14:25 GMT, Andrew Haley <aph at openjdk.org> wrote:
>> Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change shows a 1.87X improvement on a microbenchmark.
>
> src/hotspot/share/opto/mulnode.cpp line 468:
>
>> 466: }
>> 467:
>> 468: //=============================================================================
>
> MulHiLNode::Value() and UMulHiLNode::Value() seem to be identical. Perhaps some refactoring would be in order, maybe make a common shared routine.
Sure, I will refactor them to use a shared routine.
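For context on why the signed and unsigned nodes are so closely related, here is a minimal plain-Java sketch of the value an unsigned high multiply has to produce. It is not the HotSpot code under review. The class name `UnsignedHighSketch` and the helpers `umulh` and `referenceHigh` are hypothetical names used only for illustration. The sketch relies on the standard identity that the unsigned high word equals the signed high word plus a correction for each negative operand, and it checks that against a `BigInteger` reference.

```java
import java.math.BigInteger;

// Hypothetical illustration class; not part of the JDK or of this PR.
public class UnsignedHighSketch {

    // Upper 64 bits of the 128-bit unsigned product of x and y,
    // derived from the signed high word (Math.multiplyHigh, JDK 9+).
    static long umulh(long x, long y) {
        long result = Math.multiplyHigh(x, y);
        result += (y & (x >> 63)); // adds y when x is negative as a signed value
        result += (x & (y >> 63)); // adds x when y is negative as a signed value
        return result;
    }

    // Slow reference: treat both operands as unsigned, multiply exactly,
    // and keep bits 64..127 of the product.
    static long referenceHigh(long x, long y) {
        BigInteger ux = new BigInteger(Long.toUnsignedString(x));
        BigInteger uy = new BigInteger(Long.toUnsignedString(y));
        return ux.multiply(uy).shiftRight(64).longValue();
    }

    public static void main(String[] args) {
        long[] samples = {0L, 1L, 13L, 747L, -1L, Long.MIN_VALUE, Long.MAX_VALUE};
        for (long x : samples) {
            for (long y : samples) {
                if (umulh(x, y) != referenceHigh(x, y)) {
                    throw new AssertionError("mismatch for " + x + ", " + y);
                }
            }
        }
        System.out.println("all samples agree");
    }
}
```

Small operands such as the benchmark's 747 and 13 give a high word of zero, so the two variants only differ when at least one operand has its top bit set.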
> test/micro/org/openjdk/bench/java/lang/MathBench.java line 547:
>
>> 545: return Math.unsignedMultiplyHigh(long747, long13);
>> 546: }
>> 547:
>
> As far as I can tell, the JMH overhead dominates when trying to measure the latency of events in the nanosecond range. `unsignedMultiplyHigh` should have a latency of maybe 1.5-2ns. Is that what you saw?
Yes, the JMH overhead was dominating the measurement of latency. The latency observed for `unsignedMultiplyHigh` was 2.3ns with the intrinsic enabled.
-------------
PR: https://git.openjdk.java.net/jdk/pull/5933
More information about the core-libs-dev mailing list