[PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request

Tue Feb 19 10:03:49 UTC 2019

On 18/02/2019 19:17, B. Blaser wrote:
> The problem we have is that the API implementation has a real weakness
> when both inputs are zero (see [1]) and branch optimizations don't
> help much, we still end up with too many 'ucomisd'. But with
> favourable data (not too many zeros) and a good branch optimization,
> the API might be faster though.
> 
> This is exactly what I meant early in the first thread, see [2]; it's
> quite impossible to find an instruction sequence that will get all
> paths faster regardless of the data...
> But zeros are rather frequent and the intrinsic is still 2x faster
> with poor predictions.

Andrew is suggesting a different approach altogether. Instead of either
always using the intrinsic or never using it, he is suggesting using it
when the branch prediction data suggests it will be effective.
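
To make the trade-off concrete, here is a minimal sketch in Java. The first
method mirrors the general shape of a branchy max (NaN check, signed-zero
check, then the comparison); it is an illustration, not a copy of the library
source. The second method is purely hypothetical: the name preferIntrinsic,
its parameters and the idea of a single bias threshold are assumptions made
for this example and do not describe what C2 does or what the patch
implements.

```java
class MinMaxSketch {
    // Branchy max in the general shape of the API code path.
    // Each 'if' is a branch the JIT must either predict or replace.
    static double branchyMax(double a, double b) {
        if (a != a) {
            return a; // a is NaN, and NaN propagates
        }
        // Both inputs compare equal to zero: the sign bit decides.
        // This is the path that is costly when zeros are frequent.
        if (a == 0.0d && b == 0.0d
                && Double.doubleToRawLongBits(a) == Double.doubleToRawLongBits(-0.0d)) {
            return b; // a is -0.0, so b (either +0.0 or -0.0) is the max
        }
        return (a >= b) ? a : b; // also returns b when b is NaN
    }

    // Hypothetical selection rule (an assumption, not the patch):
    // takenProbability is the profiled probability that the final
    // comparison branch is taken; threshold is an arbitrary tuning knob.
    // A probability near 0.5 means the branch is hard to predict, so the
    // branch-free intrinsic is preferred; a strongly biased branch is
    // left as a branch.
    static boolean preferIntrinsic(double takenProbability, double threshold) {
        double bias = Math.abs(takenProbability - 0.5);
        return bias < threshold;
    }
}
```

With a threshold of 0.4, for example, a branch taken half the time would
select the intrinsic, while a branch taken 99% of the time would keep the
branchy code.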

> So, if I had to choose (but I'm not a Reviewer), I'd probably be to
> incorporate this intrinsic unless we find a realistic example showing
> an important regression.

I believe the above consideration renders that a false dilemma.

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd