RFR: 8278868: Add x86 vectorization support for Long.bitCount() [v9]

Quan Anh Mai duke at openjdk.java.net
Mon Jan 10 06:36:24 UTC 2022


On Mon, 10 Jan 2022 06:05:54 GMT, Vamsi Parasa <duke at openjdk.java.net> wrote:

>> Vectorization support for Integer.bitCount() already exists, but the same support is currently lacking for Long.bitCount(). Similar to the C2 PopCountVI node, we created a C2 PopCountVL node and used the x86 vpopcntq instruction to enable vectorized Long.bitCount(). This patch shows a 2.57x performance improvement on a JMH microbenchmark due to x86 vectorization.
>
> Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
> 
>   additional checks for the test

src/hotspot/cpu/x86/x86.ad line 8602:

> 8600:     int vlen_enc = vector_length_encoding(this, $src);
> 8601:     __ vpopcntq($dst$$XMMRegister, $src$$XMMRegister, vlen_enc);
> 8602:     __ evpmovqd($dst$$XMMRegister, $dst$$XMMRegister, vlen_enc);

Hi,
Should this cast be introduced in the middle-end instead? Popcount is a lane-wise operation, so forcing the node to also perform a shape-changing operation (here the evpmovqd narrowing from 64-bit to 32-bit lanes) does not seem reasonable.
Thanks.
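
To illustrate the distinction, here is a minimal scalar sketch in Java. It is not C2 code, and the class and method names (PopCountShapes, popCountLanes, narrowToInt, bitCountFused) are hypothetical. It models the lane-wise popcount (64-bit lane in, 64-bit lane out) and the narrowing cast (64-bit lane to 32-bit lane) as two separate steps, next to a fused version that does both at once, as the instruct block quoted above does.

```java
public class PopCountShapes {
    // Lane-wise step: every 64-bit lane produces a 64-bit lane,
    // so the vector shape is unchanged (models vpopcntq).
    static long[] popCountLanes(long[] src) {
        long[] dst = new long[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = Long.bitCount(src[i]); // int result widened back to long
        }
        return dst;
    }

    // Shape-changing step: narrow every 64-bit lane to a 32-bit lane
    // (models the evpmovqd narrowing, kept as its own operation).
    static int[] narrowToInt(long[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = (int) src[i];
        }
        return dst;
    }

    // Fused version: popcount and narrowing done in a single operation.
    static int[] bitCountFused(long[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = Long.bitCount(src[i]);
        }
        return dst;
    }
}
```

Both paths give the same values. The question above is only about where the narrowing is represented: inside the popcount node's code emission, or as a separate cast that the middle-end can see.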

-------------

PR: https://git.openjdk.java.net/jdk/pull/6857


More information about the hotspot-compiler-dev mailing list