Vectorizing POPCNT (Long.bitCount)

Paul Sandoz paul.sandoz at oracle.com
Tue Sep 6 18:47:41 UTC 2022


Hi Stefan,

We added, on supporting hardware, vectorized bit count to the soon to be released API in JDK 19.

  https://download.java.net/java/early_access/jdk19/docs/api/jdk.incubator.vector/jdk/incubator/vector/VectorOperators.html#BIT_COUNT

You should be able to experiment with the EA build if you don’t want to wait:

  https://jdk.java.net/19/

See here for an overview in the history section:

  https://openjdk.org/jeps/426

Paul.

> On Sep 2, 2022, at 5:59 PM, Stefan Reich <stefan.reich.maker.of.eye at googlemail.com> wrote:
> 
> Hey tthere,
> 
> I'd like to parallelize this function:
> 
> static int countDifferingBits(long[] array1, long[] array2) {
>   int n = array1.length;
>   int diff = 0;
>   for (int i = 0; i < n; i++)
>     diff += Long.bitCount(array1[i]^array2[i]);
>   return diff;
> }
> 
> The function calculates the difference between two shapes (pixel by pixel) and is the central piece of an image recognition I am making.
> 
> Example for a bunch of shapes (these are all the shapes found in the extended latin alphabet, in fact): https://botcompany.de/images/1103149
> Example for a shape comparison: https://botcompany.de/images/1103151
> 
> The routine above is already mega-fast as is (thanks to HotSpot!), I can compare two 256 byte arrays in 29 CPU cycles. That's less than one cycle per 64 bit!
> 
> But I'd still like to try to vectorize it.
> 
> However, I think the bitCount function will prevent this because it doesn't seem to exist in a vectorized form. Or does it? (That would be my main question here.)
> 
> Many greetings,
> Stefan
> 
> -- 
> == Gaz.AI ==



More information about the panama-dev mailing list