RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v4]
Sandhya Viswanathan
sviswanathan at openjdk.java.net
Wed Jan 5 17:47:16 UTC 2022
On Wed, 5 Jan 2022 04:04:03 GMT, Vamsi Parasa <duke at openjdk.java.net> wrote:
>> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization.
>
> Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
>
> Update popcount long test to use IR framework
src/hotspot/cpu/x86/x86.ad line 1416:
> 1414: return false;
> 1415: }
> 1416: break;
This case could be combined with case Op_PopCountVI and the duplication removed, since the check is the same for both.
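A minimal sketch of the combined form, for illustration only. The enum, the function name, and the `popcount_vector_supported` parameter are hypothetical stand-ins rather than the actual HotSpot declarations; the parameter represents whatever shared condition both cases test today.

```cpp
#include <cassert>

// Hypothetical stand-ins for the C2 opcodes; not the real HotSpot enum.
enum Opcode { Op_AddVI, Op_PopCountVI, Op_PopCountVL };

// Sketch of a match_rule_supported-style check. Both popcount opcodes
// share one case body, so the feature test is written only once.
// popcount_vector_supported is an assumed placeholder for the real
// flag/CPU-feature condition used in x86.ad.
bool match_rule_supported_sketch(Opcode opcode, bool popcount_vector_supported) {
  switch (opcode) {
    case Op_PopCountVI:
    case Op_PopCountVL:
      if (!popcount_vector_supported) {
        return false;
      }
      break;
    default:
      break;
  }
  return true;
}
```

With stacked case labels, any later change to the condition applies to both the int and long variants at once.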
test/hotspot/jtreg/compiler/vectorization/TestPopCountVectorLong.java line 57:
> 55: @Test // needs to be run in (fast) debug mode
> 56: @Warmup(10000)
> 57: @IR(counts = {"PopCountVL", "9"}) //9 PopCountVL nodes are generated for a long[] of LEN=1024
Could this be a failOn check instead of a counts check? The number of PopCountVL nodes depends on loop unrolling, which keeps changing as the loop optimizations change.
-------------
PR: https://git.openjdk.java.net/jdk/pull/6857