Math.round optimization, and round to int

Mon Jun 6 09:48:49 UTC 2016

Hi,

On 05/06/16 19:17, Jeff Hain wrote:

> 
> While playing around with Math.round(double) code,
> I found out that
> 
> if (longBits < 0) {
>     r = -r;
> }
> 
> can be replaced with:
> 
> long bitsSignum = (((longBits >> 63) << 1) + 1); // 2*0+1 = 1, or 2*-1+1 = -1
> r *= bitsSignum;
> 
> which seems a bit faster, as one could expect due to less branching.

Not necessarily.  I get this with the original:

   cmp       x10, #0x0
   mov       x13, xzr
   sub       x13, x13, x12
   csel      x10, x13, x12, lt

and with yours:

   asr       x12, x10, #63
   lsl       x12, x12, #1
   add       x12, x12, #0x1
   mul       x10, x10, x12

(This is AArch64, but x86 is similar.)

The problem is that most of the instructions in the former are single
cycle, but the MUL in yours has a five-cycle latency.  There is also a
conditional negate instruction, which the C2 compiler isn't yet smart
enough to always generate, but we will fix that.
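
For anyone who wants to try the two forms side by side, here is a
minimal sketch that checks they compute the same result.  The class and
method names (SignNegate, withBranch, withMultiply) are made up for
illustration and are not from the JDK sources; only the two code
fragments themselves come from the quoted message.  It says nothing
about speed, which depends on what C2 emits on a given CPU.

```java
// Hypothetical harness: names are illustrative, not JDK code.
class SignNegate {
    // Branching form, as quoted from the Math.round(double) code.
    static long withBranch(long r, long longBits) {
        if (longBits < 0) {
            r = -r;
        }
        return r;
    }

    // Branch-free form proposed in the quoted message.
    static long withMultiply(long r, long longBits) {
        // Arithmetic shift gives 0 or -1; 2*0+1 = 1, 2*-1+1 = -1.
        long bitsSignum = ((longBits >> 63) << 1) + 1;
        return r * bitsSignum;
    }

    public static void main(String[] args) {
        long[] samples = {0L, 1L, -1L, 42L, Long.MAX_VALUE, Long.MIN_VALUE};
        for (long r : samples) {
            for (long bits : samples) {
                if (withBranch(r, bits) != withMultiply(r, bits)) {
                    throw new AssertionError("mismatch for r=" + r + ", bits=" + bits);
                }
            }
        }
        System.out.println("both forms agree on all samples");
    }
}
```

Both forms wrap the same way for r == Long.MIN_VALUE, so they agree
there too.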

Andrew.