[aarch64-port-dev ] [16] RFR(S): 8251525: AARCH64: Faster Math.signum(fp)

Andrew Haley aph at redhat.com
Sun Aug 30 08:34:57 UTC 2020


On 28/08/2020 17:40, Hohensee, Paul wrote:

> One's perspective on the benchmark results depends on the expected
> frequency of the input types. If we don't expect frequent NaNs (I
> don’t, because they mean your algorithm is numerically unstable and
> you're wasting your time running it), or zeros (somewhat arguable,
> but note that most codes go to some lengths to eliminate zeros,
> e.g., using sparse arrays), then this patch seems to me to be a win.

Possibly. But it's a significant change that improves some cases while
making others worse. When it does make a case better, the gain is only
a small factor and it's not consistent across hardware
implementations.

Please consider the numbers. Abs/Copysign improves all cases except 0,
and it doesn't make any of them any worse. Copysign on its own is
nearly as good. That's true at least for the reduce case, which I
argue is more representative than the blackhole case, where the
blackhole operation itself swamps the calculation we're trying to
measure.
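For readers following along, the signum-via-copysign shape being
benchmarked can be sketched roughly as follows (a minimal sketch; the
class and method names are mine, not from the patch, though the
zero/NaN handling mirrors what Math.signum's contract requires):

```java
// Sketch: express signum through copySign. For nonzero, non-NaN
// inputs, signum(x) is just copySign(1.0, x), which can stay entirely
// in the FP unit on AArch64 rather than bouncing through the integer
// registers.
public class SignumSketch {
    static double signumViaCopySign(double d) {
        // 0.0, -0.0 and NaN must pass through unchanged (per the
        // Math.signum contract); everything else takes the sign of d
        // attached to 1.0.
        return (d == 0.0 || Double.isNaN(d)) ? d : Math.copySign(1.0, d);
    }

    public static void main(String[] args) {
        System.out.println(signumViaCopySign(42.5));       // 1.0
        System.out.println(signumViaCopySign(-0.001));     // -1.0
        System.out.println(signumViaCopySign(-0.0));       // -0.0
        System.out.println(signumViaCopySign(Double.NaN)); // NaN
    }
}
```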

Ignoring NaN, I've added averages for the four cases to
http://cr.openjdk.java.net/~aph/signum-facgt-copysign.ods.

But we still don't know what effect, if any, all of this has on real
code. My guess is that copysign should always help, because it avoids
a move between the FPU and the integer unit and is otherwise
identical. But the blackhole benchmark suggests it can make latency
worse, and I have no explanation for that.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
