Long.bitCount micro-optimization

Doug Lea dl at cs.oswego.edu
Mon May 8 20:29:04 UTC 2017


On 05/08/2017 02:14 PM, Isaac Levy wrote:
> Original message below:
>
> The JDK impl of bitCount can be improved -- though most users will get
> the hotspot intrinsic. The best source I could find for the suggestion
> below is page 195: http://support.amd.com/techdocs/25112.pdf
>

The int version differs from current implementation in that it uses
one multiply instead of two shift+adds. (Similarly for long.)

I wonder if there is any processor that does not already have a
bit-count instruction for which this is actually faster?
Some evidence either way would be helpful.

-Doug



> Cheers,
> Isaac Levy
>
>
> Proposed Long.bitCount and Integer.bitCount:
>
>
> public static int bitCount(long i)
> {
>     i -= (i >>> 1) & 0x5555555555555555L;
>     i = (i & 0x3333333333333333L) + ((i >>> 2) & 0x3333333333333333L);
>     i = (i + (i >>> 4)) & 0x0f0f0f0f0f0f0f0fL;
>     return (i * 0x0101010101010101L) >>> 56;
> }
>
>
> public static int bitCount(int i) {
>     i -= (i >>> 1) & 0x55555555;
>     i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);
>     i = (i + (i >>> 4)) & 0x0f0f0f0f;
>     return (i * 0x01010101) >>> 24;
> }
>



More information about the core-libs-dev mailing list