RFR 8203279 : Faster calculation of power of two

Ivan Gerasimov ivan.gerasimov at oracle.com
Thu May 17 20:44:01 UTC 2018


Thank you Claes for your review and the detailed analysis!


On 5/17/18 4:07 AM, Claes Redestad wrote:
> Shouldn't this be called "Faster rounding up to nearest power of two"?
>
Yes, it's more accurate.

> Patch looks OK to me, but I'd like to see numbers with the 
> numberOfLeadingZeros intrinsic
> disabled so that we ensure we're not incurring an unreasonable penalty 
> on platforms who don't
> have an intrinsic for this.
>
> Running your benchmark with the intrinsic disabled[1] on my machine I 
> see a 25-30% penalty
> with testNew relative to testOld, which is perhaps a bit toomuch for 
> comfort..
>
> So I took a look at profiles for numberOfLeadingZeros with the 
> intrinsic disabled and realized
> it might be possible to improve:
>
> diff -r de35abdfe5da src/java.base/share/classes/java/lang/Integer.java
> --- a/src/java.base/share/classes/java/lang/Integer.java Mon May 14 
> 16:21:08 2018 +0200
> +++ b/src/java.base/share/classes/java/lang/Integer.java Thu May 17 
> 12:44:53 2018 +0200
> @@ -1621,12 +1621,12 @@
>          // HD, Figure 5-6
>          if (i <= 0)
>              return i == 0 ? 32 : 0;
> -        int n = 1;
> -        if (i >>> 16 == 0) { n += 16; i <<= 16; }
> -        if (i >>> 24 == 0) { n +=  8; i <<=  8; }
> -        if (i >>> 28 == 0) { n +=  4; i <<=  4; }
> -        if (i >>> 30 == 0) { n +=  2; i <<=  2; }
> -        n -= i >>> 31;
> +        int n = 0;
> +        if (          i < 1 << 16) { n += 16; i <<= 16; }
> +        if (i >= 0 && i < 1 << 24) { n +=  8; i <<=  8; }
> +        if (i >= 0 && i < 1 << 28) { n +=  4; i <<=  4; }
> +        if (i >= 0 && i < 1 << 30) { n +=  2; i <<=  2; }
> +        if (i >= 0) n++;
>          return n;
>      }
>
> Adding a case that uses this version to your benchmark[2] and the new 
> version is only about
> 10% slower than the baseline, with the added benefit that other uses 
> of numberOfLeadingZeros
> might see a speed-upif there's no intrinsic (runs with intrinsic 
> disabled[1]):
>
Interesting.
It can probably be done with fewer comparisons, if the direction of all 
the shifts is inverted:

The following variant showed slightly better performance on my machine:

     static final int numberOfLeadingZeros(int i) {
         if (i <= 0)
             return i == 0 ? 32 : 0;
         int n = 31;
         if (i >= 1 << 16) { n -= 16; i >>>= 16; }
         if (i >= 1 <<  8) { n -=  8; i >>>=  8; }
         if (i >= 1 <<  4) { n -=  4; i >>>=  4; }
         if (i >= 1 <<  2) { n -=  2; i >>>=  2; }
         return n - (i >>> 1);
     }

I agree that improving Java implementation of numberOfLeadingZeros() can 
be done as a separate RFE.

With kind regards,
Ivan


> Benchmark          (arg)  Mode  Cnt   Score   Error Units
> TableFor.testNew       1  avgt    6  28.343 ± 0.534  ns/op
> TableFor.testNew      42  avgt    6  26.458 ± 0.064  ns/op
> TableFor.testNew2      1  avgt    6  23.969 ± 0.201  ns/op
> TableFor.testNew2     42  avgt    6  23.934 ± 0.107  ns/op
> TableFor.testOld       1  avgt    6  21.615 ± 0.803  ns/op
> TableFor.testOld      42  avgt    6  21.418 ± 0.106  ns/op
>
> So I think with the above patch to Integer.numberOfLeadingZeros we can 
> get the benefit for
> our supported platforms while staying roughly performance neutral on 
> platforms without
> an intrinsic. Not strictly necessary to fold it into this RFE, of course.
>
> Thanks!
>
> /Claes
>
> [1] -XX:+UnlockDiagnosticVMOptions 
> -XX:DisableIntrinsic=_numberOfLeadingZeros_i
> [2] http://cr.openjdk.java.net/~redestad/scratch/TableFor.java
>
> On 2018-05-17 05:48, David Holmes wrote:
>>> Do you think it's good to go?
>>
>> I think I'd rather defer to a more performance oriented reviewer - 
>> paging Claes! ;-)
>>
>> David
>> ----- 
>

-- 
With kind regards,
Ivan Gerasimov



More information about the core-libs-dev mailing list