RFR 8199843 : Optimize Integer/Long.highestOneBit()

Tue Mar 20 17:20:46 UTC 2018

Hi Peter!

On 3/20/18 6:05 AM, Peter Levart wrote:
> Hi Ivan,
>
> What about branch-less variant?
>
> public static int highestOneBit(int i) {
>     return i & (MIN_VALUE >>> numberOfLeadingZeros(i));
> }
>
Nice variant!

I tried to run it, but the numbers are non-distinguishable for non-zero 
arguments.
And my variant performs slightly better with zero argument.
So, I think it's reasonable to keep the variant with the ternary operator.

Here are the results:
Benchmark                           (arg)   Mode  Cnt    Score Error   Units
MyBenchmark.int_testMethod_00_base      0  thrpt   35  352.509 ± 20.796  
ops/us
MyBenchmark.int_testMethod_00_base     42  thrpt   35 362.955 ±  7.278  
ops/us
MyBenchmark.int_testMethod_00_base    -42  thrpt   35 366.084 ±  6.770  
ops/us
MyBenchmark.int_testMethod_01_org       0  thrpt   35 249.340 ±  1.981  
ops/us
MyBenchmark.int_testMethod_01_org      42  thrpt   35 249.007 ±  2.005  
ops/us
MyBenchmark.int_testMethod_01_org     -42  thrpt   35 241.866 ±  4.759  
ops/us
MyBenchmark.int_testMethod_02_new       0  thrpt   35 328.389 ±  5.883  
ops/us
MyBenchmark.int_testMethod_02_new      42  thrpt   35 306.381 ±  5.505  
ops/us
MyBenchmark.int_testMethod_02_new     -42  thrpt   35 300.328 ±  5.644  
ops/us
MyBenchmark.int_testMethod_03_and       0  thrpt   35 301.559 ±  5.453  
ops/us
MyBenchmark.int_testMethod_03_and      42  thrpt   35  299.694 ± 5.000  
ops/us
MyBenchmark.int_testMethod_03_and     -42  thrpt   35  301.505 ± 5.664  
ops/us

The benchmark updated in place:
http://cr.openjdk.java.net/~igerasim/8199843/00/MyBenchmark.java

With kind regards
Ivan

> Would it be any better for call sites that vary 0 and non-0 argument?
>
> Regards, Peter
>
> On 03/20/2018 09:58 AM, Ivan Gerasimov wrote:
>> Hello!
>>
>> The hightestOneBit function doesn't have an intrinsic and is 
>> currently implemented with a dozen of instructions.
>> Alternatively, it could be implemented as MIN_VALUE >>> 
>> numberOfLeadingZeros(i), which works for all integers but zero.
>> The former function gets intrisified by hotspot, which results in 
>> +27% of throughput (see the jmh results below).
>>
>> Would you please help review this simple fix?
>>
>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8199843
>> WEBREV: http://cr.openjdk.java.net/~igerasim/8199843/00/webrev/
>> Benchmark: 
>> http://cr.openjdk.java.net/~igerasim/8199843/00/MyBenchmark.java
>>
>> Benchmark results:
>>
>> Benchmark                        (arg)   Mode  Cnt Score Error Units
>> MyBenchmark.int_testMethod_new       0  thrpt   35 323430664.593 ±  
>> 7492044.171  ops/s
>> MyBenchmark.int_testMethod_new      42  thrpt   35 298526237.078 ±  
>> 5978291.689  ops/s
>> MyBenchmark.int_testMethod_new     -42  thrpt   35 302903562.073 ±  
>> 7984723.721  ops/s
>> MyBenchmark.int_testMethod_org       0  thrpt   35 236245042.891 ±  
>> 3635990.596  ops/s
>> MyBenchmark.int_testMethod_org      42  thrpt   35 237903410.753 ±  
>> 3437684.390  ops/s
>> MyBenchmark.int_testMethod_org     -42  thrpt   35 238472580.618 ±  
>> 2654886.010  ops/s
>> MyBenchmark.long_testMethod_new      0  thrpt   35 282646114.501 ± 
>> 48028366.305  ops/s
>> MyBenchmark.long_testMethod_new     42  thrpt   35 282382228.405 ±  
>> 5781529.307  ops/s
>> MyBenchmark.long_testMethod_new    -42  thrpt   35 276724858.286 ±  
>> 6529561.227  ops/s
>> MyBenchmark.long_testMethod_org      0  thrpt   35 198500211.972 ± 
>> 15096862.367  ops/s
>> MyBenchmark.long_testMethod_org     42  thrpt   35 215854630.194 ±  
>> 3112930.563  ops/s
>> MyBenchmark.long_testMethod_org    -42  thrpt   35 217992805.521 ±  
>> 2622877.082  ops/s
>>
>> Thanks in advance!
>>
>

-- 
With kind regards,
Ivan Gerasimov