RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh

Fri Oct 15 19:34:00 UTC 2021

On Fri, 15 Oct 2021 16:14:25 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. This change shows a 1.87X improvement on a micro benchmark.
>
> src/hotspot/share/opto/mulnode.cpp line 468:
> 
>> 466: }
>> 467: 
>> 468: //=============================================================================
> 
> MulHiLNode::Value() and UMulHiLNode::Value() seem to be identical. Perhaps some refactoring would be in order, maybe make a common shared routine.

Sure, will do the refactoring to use a shared routine.

> test/micro/org/openjdk/bench/java/lang/MathBench.java line 547:
> 
>> 545:         return  Math.unsignedMultiplyHigh(long747, long13);
>> 546:     }
>> 547: 
> 
> As far as I can tell, the JMH overhead dominates when trying to measure the latency of events in the nanosecond range. `unsignedMultiplyHigh` should have a latency of maybe 1.5-2ns. Is that what you saw?

Yes, the JMH overhead was dominating the measurement of latency. The latency observed for `unsignedMultiplyHigh` was 2.3ns with the intrinsic enabled.
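
For readers following the thread, here is a minimal sketch of what the operation computes: the upper 64 bits of the 128-bit product of two longs treated as unsigned. The class name `UMulHighSketch` and both method names are hypothetical and chosen for illustration. The sketch derives the unsigned result from the signed `Math.multiplyHigh` and checks it against a slow `BigInteger` reference. It is not the HotSpot intrinsic from this PR, which would use the x86 `mul` instruction directly.

```java
import java.math.BigInteger;

public class UMulHighSketch {
    // Portable fallback: correct the signed high word for operands
    // whose top bit is set (they are "negative" as signed longs).
    static long unsignedMultiplyHigh(long x, long y) {
        long hi = Math.multiplyHigh(x, y); // signed high 64 bits
        hi += (y & (x >> 63));             // add y if x has its top bit set
        hi += (x & (y >> 63));             // add x if y has its top bit set
        return hi;
    }

    // Slow reference: full 128-bit unsigned product via BigInteger.
    static long referenceHigh(long x, long y) {
        BigInteger ux = new BigInteger(Long.toUnsignedString(x));
        BigInteger uy = new BigInteger(Long.toUnsignedString(y));
        return ux.multiply(uy).shiftRight(64).longValue();
    }

    public static void main(String[] args) {
        long[] samples = {0L, 1L, 747L, 13L, -1L, Long.MIN_VALUE, Long.MAX_VALUE};
        for (long a : samples) {
            for (long b : samples) {
                if (unsignedMultiplyHigh(a, b) != referenceHigh(a, b)) {
                    throw new AssertionError("mismatch for " + a + ", " + b);
                }
            }
        }
        System.out.println("all samples match");
    }
}
```

The two correction terms are the reason a dedicated unsigned multiply is attractive: on hardware with a native unsigned 64x64 to 128-bit multiply, the whole method can collapse to a single instruction.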

-------------

PR: https://git.openjdk.java.net/jdk/pull/5933