[PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request
B. Blaser
bsrbnd at gmail.com
Mon Feb 18 16:09:45 UTC 2019
On Mon, 18 Feb 2019 at 16:37, Andrew Haley <aph at redhat.com> wrote:
>
> On 2/18/19 1:26 PM, B. Blaser wrote:
> >
> > Intrinsic instruction sequences are definitely fast and other
> > optimizations can benefit from their mathematical properties.
>
> Yes, they can be.
>
> > Of course, statistical optimizations could be even faster but making
> > assumptions about predictability to exclude intrinsics is rather
> > dangerous.
>
> I'm not convinced that it is at all dangerous. The pattern I
> illustrated is not uncommon, and might well be considerably more
> common than the pattern in the benchmark presented by Jatin. But we
> should not choose our benchmarks so that they make our code look
> good. Instead, we should use benchmarks to help us decide what to do.
>
> > The JVM should be able to decide dynamically whether to use intrinsics
> > or not depending on the reliability of its statistics?!
>
> Perhaps so, yes. So before we decide to commit changes that may well make the
> JVM worse on many (most?) workloads, we should find a way to do that.
Yes and no. Simply try your example with unfavourable data:

    import java.util.Random;
    import java.util.concurrent.TimeUnit;
    import org.openjdk.jmh.annotations.*;

    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @State(Scope.Thread)
    public class FpMinMaxIntrinsics {
        private static final int COUNT = 1000;

        private float[] floats = new float[COUNT];
        private Random r = new Random();

        @Setup
        public void init() {
            // Alternate random values in [0, 1) with -0.0f.
            for (int i = 0; i < COUNT; i++) {
                if (i % 2 == 0)
                    floats[i] = r.nextFloat();
                else
                    floats[i] = -0.0f;
            }
        }

        @Benchmark
        public float fMinReduce() {
            float result = Float.MAX_VALUE;

            for (int i = 0; i < COUNT; i++)
                result = Math.min(result, floats[i]);

            return result;
        }
    }

With the intrinsic:

Benchmark                      Mode  Cnt      Score  Error  Units
FpMinMaxIntrinsics.fMinReduce  avgt        2386.708         ns/op

Without:

Benchmark                      Mode  Cnt      Score  Error  Units
FpMinMaxIntrinsics.fMinReduce  avgt       14042.155         ns/op

The execution time of the intrinsic stays stable whatever the input
data is, so you'll never see such a performance drop.
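
For reference, here is a minimal sketch of why this data is awkward for a
branchy implementation. It is not the JDK source: the class MinSketch and
the methods minSketch and reduce are hypothetical names, written only to
follow the documented rules of Math.min(float, float), where NaN wins and
-0.0f is considered smaller than 0.0f. With the array from the benchmark
above, the running result becomes -0.0f after the second element, so the
signed-zero test is reached on every iteration and its outcome depends on
the element. How much that costs without the intrinsic depends on what the
JIT does with these branches, which this sketch does not show.

```java
// Hypothetical sketch, not the JDK implementation.
class MinSketch {
    private static final int NEG_ZERO_BITS = Float.floatToRawIntBits(-0.0f);

    // Branchy min following the NaN and signed-zero rules of Math.min(float, float).
    static float minSketch(float a, float b) {
        if (a != a)
            return a;                   // a is NaN, so the result is NaN
        if (a == 0.0f && b == 0.0f
                && Float.floatToRawIntBits(b) == NEG_ZERO_BITS)
            return b;                   // both are zeros and b is -0.0f
        return (a <= b) ? a : b;        // yields b when b is NaN
    }

    // Same reduction shape as fMinReduce in the benchmark.
    static float reduce(float[] values) {
        float result = Float.MAX_VALUE;
        for (int i = 0; i < values.length; i++)
            result = minSketch(result, values[i]);
        return result;
    }

    public static void main(String[] args) {
        float[] data = { 0.5f, -0.0f, 0.25f, -0.0f };
        float sketch = reduce(data);

        float library = Float.MAX_VALUE;
        for (float f : data)
            library = Math.min(library, f);

        // Compare raw bits so that 0.0f and -0.0f are told apart.
        System.out.println(Float.floatToRawIntBits(sketch)
                == Float.floatToRawIntBits(library));
    }
}
```

Comparing raw bits rather than using == matters here, because 0.0f == -0.0f
is true in Java even though the two values are distinct results for
Math.min.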
Bernard