RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v7]

Vladimir Kozlov kvn at openjdk.java.net
Thu Jan 6 20:59:22 UTC 2022


On Thu, 6 Jan 2022 19:57:01 GMT, Vamsi Parasa <duke at openjdk.java.net> wrote:

>> Vectorization support of Integer.bitCount() already exists but currently the same support is lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used vpopcntq x86 instruction to enable vectorized Long.bitCount(). This patch shows 2.57x improvement in performance on a JMH micro benchmark due to x86 vectorization.
>
> Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision:
> 
>  - Merge branch 'master' of https://git.openjdk.java.net/jdk into vlong
>  - Update IR framework test to check for non-zero count of PopCountVL
>  - Update year in copyright notice. Add avx512dq check for vectorization test
>  - Update popcount long test to use IR framework
>  - Use generic vector node names
>  - Add JMH micro benchmark to measure performance
>  - 8278868:Add x86 vectorization support for Long.bitCount()

Build error:

workspace/open/src/hotspot/share/opto/superword.cpp:2556:38: error: converting the enum constant to a boolean [-Werror,-Wint-in-bool-context]
                  opc == Op_PopCountI || Op_PopCountL) {
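
The warning fires because the right-hand operand of `||` is a bare enum constant, so the condition is true for every opcode. Below is a minimal standalone sketch of the problem and of what was presumably intended. The enum values and the helper names `matches_as_written` and `matches_as_intended` are hypothetical, made up for illustration, and are not HotSpot code:

```cpp
// Hypothetical stand-ins for the C2 opcode constants; the values are
// made up for this sketch and do not match HotSpot's generated enum.
enum Opcode { Op_AddI = 1, Op_PopCountI = 2, Op_PopCountL = 3 };

// What the flagged expression reduces to:
//   opc == Op_PopCountI || Op_PopCountL
// parses as
//   (opc == Op_PopCountI) || (Op_PopCountL != 0)
// and the second operand is a non-zero constant.
bool matches_as_written(int opc) {
  const int popcount_l = Op_PopCountL;
  return (opc == Op_PopCountI) || (popcount_l != 0);
}

// Presumably intended (an assumption): compare opc against each constant.
bool matches_as_intended(int opc) {
  return opc == Op_PopCountI || opc == Op_PopCountL;
}
```

With the form as written, an unrelated opcode such as `Op_AddI` also takes the branch. The second form only accepts the two pop-count opcodes.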


There are also many test failures reporting missing `VectorCastI2X` and `VectorCastL2X` (for example compiler/codegen/TestLongDoubleVect.java and compiler/codegen/TestIntFloatVect.java):

#  Internal Error (/workspace/open/src/hotspot/share/opto/vectornode.cpp:573), pid=11893, tid=11909
#  fatal error: Missed vector creation for 'VectorCastI2X'
#
# Problematic frame:
# V  [libjvm.so+0x19d2bd3]  VectorNode::make(int, Node*, Node*, TypeVect const*, bool, bool) [clone .part.0]+0x63

Current CompileTask:
C2:    674   58 %  b        compiler.codegen.TestIntFloatVect::test_conv_i2f @ 2 (22 bytes)

Stack: [0x00007f0fa59fa000,0x00007f0fa5afb000],  sp=0x00007f0fa5af4e70,  free space=1003k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x19d2bd3]  VectorNode::make(int, Node*, Node*, TypeVect const*, bool, bool) [clone .part.0]+0x63
V  [libjvm.so+0x189e082]  SuperWord::output()+0xb82
V  [libjvm.so+0x18a40e0]  SuperWord::transform_loop(IdealLoopTree*, bool)+0x400
V  [libjvm.so+0x13a2284]  PhaseIdealLoop::build_and_optimize(LoopOptsMode)+0xff4
V  [libjvm.so+0xa9a0fa]  PhaseIdealLoop::optimize(PhaseIterGVN&, LoopOptsMode)+0x28a
V  [libjvm.so+0xa963df]  Compile::Optimize()+0x102f
V  [libjvm.so+0xa9863e]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x159e

-------------

PR: https://git.openjdk.java.net/jdk/pull/6857

