RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v7]
Vladimir Kozlov
kvn at openjdk.java.net
Thu Jan 6 20:59:22 UTC 2022
On Thu, 6 Jan 2022 19:57:01 GMT, Vamsi Parasa <duke at openjdk.java.net> wrote:
>> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization.
>
> Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision:
>
> - Merge branch 'master' of https://git.openjdk.java.net/jdk into vlong
> - Update IR framework test to check for non-zero count of PopCountVL
> - Update year in copyright notice. Add avx512dq check for vectorization test
> - Update popcount long test to use IR framework
> - Use generic vector node names
> - Add JMH micro benchmark to measure performance
> - 8278868:Add x86 vectorization support for Long.bitCount()
Build error:
workspace/open/src/hotspot/share/opto/superword.cpp:2556:38: error: converting the enum constant to a boolean [-Werror,-Wint-in-bool-context]
opc == Op_PopCountI || Op_PopCountL) {
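The warning points at the right-hand operand: `Op_PopCountL` is used there as a bare enum constant rather than being compared with `opc`, so the condition is true for every opcode. Below is a minimal sketch of the difference. The enum and its values are hypothetical stand-ins, not HotSpot's real opcode table, and the "fixed" form is only the presumably intended comparison, not the actual patch.

```cpp
// Hypothetical stand-ins for C2 opcodes; the real values come from
// HotSpot's opcode enum and will differ.
enum Opcode { Op_Node = 0, Op_AddI = 1, Op_PopCountI = 2, Op_PopCountL = 3 };

// How the compiler reads `opc == Op_PopCountI || Op_PopCountL`:
// the second operand is just a non-zero constant, so the result is always true.
// (Written with an explicit `!= 0` so this sketch itself compiles without the warning.)
constexpr bool matches_as_written(int opc) {
  return (opc == Op_PopCountI) || (Op_PopCountL != 0);
}

// Presumably intended condition: compare opc against both opcodes.
constexpr bool matches_as_intended(int opc) {
  return opc == Op_PopCountI || opc == Op_PopCountL;
}

// An unrelated opcode slips through the condition as written...
static_assert(matches_as_written(Op_AddI), "always true for any opcode");
// ...but not through the intended one.
static_assert(!matches_as_intended(Op_AddI), "unrelated opcode rejected");
static_assert(matches_as_intended(Op_PopCountL), "PopCountL accepted");
```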
There are also many test failures reporting missed vector creation for `VectorCastI2X` and `VectorCastL2X` (compiler/codegen/TestLongDoubleVect.java, compiler/codegen/TestIntFloatVect.java):
# Internal Error (/workspace/open/src/hotspot/share/opto/vectornode.cpp:573), pid=11893, tid=11909
# fatal error: Missed vector creation for 'VectorCastI2X'
#
# Problematic frame:
# V [libjvm.so+0x19d2bd3] VectorNode::make(int, Node*, Node*, TypeVect const*, bool, bool) [clone .part.0]+0x63
Current CompileTask:
C2: 674 58 % b compiler.codegen.TestIntFloatVect::test_conv_i2f @ 2 (22 bytes)
Stack: [0x00007f0fa59fa000,0x00007f0fa5afb000], sp=0x00007f0fa5af4e70, free space=1003k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x19d2bd3] VectorNode::make(int, Node*, Node*, TypeVect const*, bool, bool) [clone .part.0]+0x63
V [libjvm.so+0x189e082] SuperWord::output()+0xb82
V [libjvm.so+0x18a40e0] SuperWord::transform_loop(IdealLoopTree*, bool)+0x400
V [libjvm.so+0x13a2284] PhaseIdealLoop::build_and_optimize(LoopOptsMode)+0xff4
V [libjvm.so+0xa9a0fa] PhaseIdealLoop::optimize(PhaseIterGVN&, LoopOptsMode)+0x28a
V [libjvm.so+0xa963df] Compile::Optimize()+0x102f
V [libjvm.so+0xa9863e] Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x159e
-------------
PR: https://git.openjdk.java.net/jdk/pull/6857