RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v4]

Sandhya Viswanathan sviswanathan at openjdk.java.net
Wed Jan 5 17:47:16 UTC 2022


On Wed, 5 Jan 2022 04:04:03 GMT, Vamsi Parasa <duke at openjdk.java.net> wrote:

>> Vectorization support for Integer.bitCount() already exists, but the same support is currently lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used the x86 vpopcntq instruction to enable vectorized Long.bitCount(). This patch shows a 2.57x performance improvement on a JMH microbenchmark due to x86 vectorization.
>
> Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update popcount long test to use IR framework
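
For readers following along, the loop shape under discussion can be sketched roughly as below. This is a minimal illustration, not the code from the patch or its test: the class name PopCountLongSketch and the method bitCounts are hypothetical, and LEN=1024 is taken from the test comment quoted further down. Whether C2 actually emits PopCountVL nodes for this loop depends on the CPU (vpopcntq needs AVX-512 VPOPCNTDQ) and on the JVM flags in use.

```java
public class PopCountLongSketch {
    // Array length borrowed from the test comment (LEN=1024); otherwise arbitrary.
    static final int LEN = 1024;

    // Hypothetical helper: a simple counted loop with unit stride that applies
    // Long.bitCount() to each element. This is the kind of loop the
    // auto-vectorizer is expected to handle; it is not the patch's own code.
    static int[] bitCounts(long[] input) {
        int[] output = new int[input.length];
        for (int i = 0; i < input.length; i++) {
            output[i] = Long.bitCount(input[i]);
        }
        return output;
    }

    public static void main(String[] args) {
        long[] input = new long[LEN];
        for (int i = 0; i < LEN; i++) {
            input[i] = i;
        }
        int[] counts = bitCounts(input);
        // 255 is 0b11111111, so this prints 8.
        System.out.println(counts[255]);
    }
}
```

The scalar result is the same whether or not the loop is vectorized, so a check like the one in main only confirms correctness, not that vectorization happened.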

src/hotspot/cpu/x86/x86.ad line 1416:

> 1414:         return false;
> 1415:       }
> 1416:       break;

This case could be combined with case Op_PopCountVI to remove the duplication, since the check is the same for both.

test/hotspot/jtreg/compiler/vectorization/TestPopCountVectorLong.java line 57:

> 55:     @Test // needs to be run in (fast) debug mode
> 56:     @Warmup(10000)
> 57:     @IR(counts = {"PopCountVL", "9"}) //9 PopCountVL nodes are generated for a long[] of LEN=1024

Could this be a failOn check instead of a counts check? The number of PopCountVL nodes depends on loop unrolling, which keeps changing with loop optimizations.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6857

