Math.round optimization, and round to int
Andrew Haley
aph at redhat.com
Mon Jun 6 09:48:49 UTC 2016
Hi,
On 05/06/16 19:17, Jeff Hain wrote:
>
> While playing around with Math.round(double) code,
> I found out that
>
> if (longBits < 0) {
> r = -r;
> }
>
> can be replaced with:
>
> long bitsSignum = (((longBits >> 63) << 1) + 1); // 2*0+1 = 1, or 2*-1+1 = -1
> r *= bitsSignum;
>
> which seems a bit faster, as one could expect due to less branching.

Not necessarily. I get this with the original:

    cmp x10, #0x0
    mov x13, xzr
    sub x13, x13, x12
    csel x10, x13, x12, lt

and with yours:

    asr x12, x10, #63
    lsl x12, x12, #1
    add x12, x12, #0x1
    mul x10, x10, x12

(This is AArch64, but x86 is similar.)

The problem is that most of the instructions in the former are
single-cycle, but the MUL in yours has a five-cycle latency. There is
also a conditional negate instruction, which the C2 compiler isn't
smart enough to always generate at the moment, but we will fix that.
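
For reference, here is a minimal Java sketch of the two source-level forms
being compared, with a check that they return the same value. The class
and method names (SignumNegate, negateWithBranch, negateWithMultiply) are
made up for illustration and are not taken from the JDK sources. It only
shows that the two forms are equivalent; it says nothing about which one
C2 compiles to faster code.

```java
final class SignumNegate {
    // Original form: negate r when the sign bit of longBits is set.
    static long negateWithBranch(long longBits, long r) {
        if (longBits < 0) {
            r = -r;
        }
        return r;
    }

    // Proposed form: derive +1 or -1 from the sign bit, then multiply.
    static long negateWithMultiply(long longBits, long r) {
        // Arithmetic shift gives 0 or -1, so this is 2*0+1 = 1 or 2*-1+1 = -1.
        long bitsSignum = ((longBits >> 63) << 1) + 1;
        return r * bitsSignum;
    }

    public static void main(String[] args) {
        long[] bitsSamples = {
            Double.doubleToRawLongBits(2.5),
            Double.doubleToRawLongBits(-2.5),
            0L, -1L, Long.MAX_VALUE, Long.MIN_VALUE
        };
        long[] rSamples = { 0L, 1L, 3L, -7L, Long.MAX_VALUE, Long.MIN_VALUE };

        for (long bits : bitsSamples) {
            for (long r : rSamples) {
                long a = negateWithBranch(bits, r);
                long b = negateWithMultiply(bits, r);
                if (a != b) {
                    throw new AssertionError("mismatch for bits=" + bits + ", r=" + r);
                }
            }
        }
        System.out.println("both forms agree on all samples");
    }
}
```

Note that the two agree even for r == Long.MIN_VALUE, since both -r and
r * -1 wrap around to Long.MIN_VALUE.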
Andrew.
More information about the core-libs-dev
mailing list