Performance of FloatVector::pow vs. equivalent FloatVector::mul (oracle-jdk-19.0.2, Intel i7 8700B)

Paul Sandoz paul.sandoz at oracle.com
Tue Feb 28 20:07:54 UTC 2023


Hi Alex,

The performance difference you observe is because the pow operation is falling back to scalar code (Math.pow on each lane element) and not using vector instructions.
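
As a rough illustration of what "Math.pow on each lane element" amounts to, here is a minimal plain-Java sketch. It is only a conceptual model, not the JDK's actual fallback code; the class and method names (ScalarPowModel, powPerLane, squarePerLane) are hypothetical, and a float[] stands in for the lanes of a FloatVector.

```java
public class ScalarPowModel {

    // Hypothetical model of the scalar fallback: one Math.pow call per lane.
    // Each lane is widened to double, passed to Math.pow, and narrowed back.
    static float[] powPerLane(float[] lanes, float exponent) {
        float[] out = new float[lanes.length];
        for (int i = 0; i < lanes.length; i++) {
            out[i] = (float) Math.pow(lanes[i], exponent);
        }
        return out;
    }

    // Squaring by multiplication: one float multiply per lane, which is the
    // kind of operation that maps naturally onto a vector multiply.
    static float[] squarePerLane(float[] lanes) {
        float[] out = new float[lanes.length];
        for (int i = 0; i < lanes.length; i++) {
            out[i] = lanes[i] * lanes[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // Eight values, standing in for the lanes of a 256-bit float vector.
        float[] lanes = {0.5f, 1.0f, 2.0f, 3.0f, -4.0f, 10.0f, 0.25f, 7.0f};
        float[] viaPow = powPerLane(lanes, 2f);
        float[] viaMul = squarePerLane(lanes);
        for (int i = 0; i < lanes.length; i++) {
            System.out.println(lanes[i] + ": pow=" + viaPow[i] + " mul=" + viaMul[i]);
        }
    }
}
```

Both methods compute the same squares, but the first pays for a general-purpose pow call on every element, which is where the gap relative to fv.mul(fv) comes from in this model.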

On x86 Linux or Windows you should observe better performance of the pow operation because it should leverage code from Intel’s Short Vector Math Library [1], but that code is OS-specific and has not yet been ported to macOS.

Paul.

[1] 
https://github.com/openjdk/jdk/tree/master/src/jdk.incubator.vector/linux/native/libjsvml
https://github.com/openjdk/jdk/tree/master/src/jdk.incubator.vector/windows/native/libjsvml

> On Feb 26, 2023, at 8:01 PM, Alex K <aklibisz at gmail.com> wrote:
> 
> Hello,
> 
> I have a question, possibly a bug, to ask/report regarding performance with the jdk.incubator.vector.FloatVector class.
> 
> Specifically, given a FloatVector fv, I've found that calling fv.mul(fv) is ~40x faster than calling fv.pow(2)
> 
> Here is an example JMH benchmark: https://github.com/alexklibisz/site-projects/blob/782dcd53d3ee09c93f65b660c8ed4fd030a8889a/jdk-incubator-vector-optimizations/src/main/scala/com/alexklibisz/BenchPowVsMul.scala
> 
> The results look like this, on my 2018 Mac Mini, w/ Intel i7 8700B, running oracle-jdk-19.0.2:
> 
> <image.png>
> 
> As far as I can tell, the two operations produce equivalent results, yet one is significantly faster than the other.
> 
> I'm eager to learn if this is expected, a regression, or something else.
> 
> Thanks,
> Alex Klibisz



More information about the panama-dev mailing list