Vectorizing POPCNT (Long.bitCount)
Paul Sandoz
paul.sandoz at oracle.com
Tue Sep 6 18:47:41 UTC 2022
Hi Stefan,
We added, on supporting hardware, vectorized bit count to the soon to be released API in JDK 19.
https://download.java.net/java/early_access/jdk19/docs/api/jdk.incubator.vector/jdk/incubator/vector/VectorOperators.html#BIT_COUNT
You should be able to experiment with the EA build if you don’t want to wait:
https://jdk.java.net/19/
See here for an overview in the history section:
https://openjdk.org/jeps/426
Paul.
> On Sep 2, 2022, at 5:59 PM, Stefan Reich <stefan.reich.maker.of.eye at googlemail.com> wrote:
>
> Hey tthere,
>
> I'd like to parallelize this function:
>
> static int countDifferingBits(long[] array1, long[] array2) {
> int n = array1.length;
> int diff = 0;
> for (int i = 0; i < n; i++)
> diff += Long.bitCount(array1[i]^array2[i]);
> return diff;
> }
>
> The function calculates the difference between two shapes (pixel by pixel) and is the central piece of an image recognition I am making.
>
> Example for a bunch of shapes (these are all the shapes found in the extended latin alphabet, in fact): https://botcompany.de/images/1103149
> Example for a shape comparison: https://botcompany.de/images/1103151
>
> The routine above is already mega-fast as is (thanks to HotSpot!), I can compare two 256 byte arrays in 29 CPU cycles. That's less than one cycle per 64 bit!
>
> But I'd still like to try to vectorize it.
>
> However, I think the bitCount function will prevent this because it doesn't seem to exist in a vectorized form. Or does it? (That would be my main question here.)
>
> Many greetings,
> Stefan
>
> --
> == Gaz.AI ==
More information about the panama-dev
mailing list