RFR: 8350468: x86: Improve implementation of vectorized numberOfLeadingZeros for int and long

Sandhya Viswanathan sviswanathan at openjdk.org
Wed Sep 24 22:03:37 UTC 2025


On Mon, 4 Aug 2025 02:20:31 GMT, Jasmine Karthikeyan <jkarthikeyan at openjdk.org> wrote:

> Hi all,
> This is a patch that optimizes the x86 backend implementation of `CountLeadingZerosV` for int and long. In the review of [JDK-8349637](https://bugs.openjdk.org/browse/JDK-8349637) an [optimized algorithm](https://github.com/openjdk/jdk/pull/23579#issuecomment-2661332497) was proposed by @rgiulietti, which this PR implements. For integer operands, the optimized algorithm reduces the number of vector instructions from 19 to 13. The same algorithm does not work for long operands, however, since AVX2 lacks a vectorized long->double conversion instruction. Instead, I found an optimized algorithm to reuse the code for int and compute the leading zeros for long with only 4 additional instructions. I added a benchmark and on my Zen 3 machine I get these results:
> 
>                                  Baseline                        Patch        
> Benchmark              Mode  Cnt    Score   Error  Units    Score   Error  Units  Improvement
> LeadingZeros.testInt   avgt   15   91.097 ± 3.276  ns/op   68.665 ± 1.740  ns/op  (+ 28.1%)
> LeadingZeros.testLong  avgt   15  342.545 ± 4.470  ns/op  228.668 ± 5.994  ns/op  (+ 39.9%)
> 
> I've updated the unit tests to more thoroughly test longs and they pass on my machine. Thoughts and reviews would be appreciated!

The PR looks good to me. Nice improvement. I have two minor comments.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 6289:

> 6287:   // Move the top half result to the bottom half of xtmp1, setting the top half to 0.
> 6288:   vpsrlq(xtmp1, dst, 32, vec_enc);
> 6289:   // By moving the top half result to the right by 6 bytes, if the top half was empty (i.e. 32 is returned) the result bit will

I think you mean 6 bits here and not 6 bytes.
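
For reference, the shift count of 6 can be checked with a scalar model of a single 64-bit lane. This is only a minimal sketch: `LongNlzSketch` and `nlzLongFromInts` are hypothetical names, and the add-then-select steps after the shift are my assumption about how the per-int results get combined, not a transcription of the patch. A leading-zero count of 32 is `1 << 5`, which sits at bit 37 of the lane when it comes from the top half, so a right shift by 6 bits lands it on bit 31, the most significant bit of the bottom half.

```java
class LongNlzSketch {
    // Scalar model of one 64-bit lane (hypothetical helper, not HotSpot code).
    static int nlzLongFromInts(long x) {
        long hi = Integer.numberOfLeadingZeros((int) (x >>> 32)); // 0..32
        long lo = Integer.numberOfLeadingZeros((int) x);          // 0..32

        // Per-int results as they would sit in the lane: top-half count in
        // bits 32..37, bottom-half count in bits 0..5.
        long packed = (hi << 32) | lo;

        // Top-half result moved to the bottom half, top half cleared.
        long top = packed >>> 32;

        // Shift right by 6 bits: bit 37 (set only when hi == 32) moves to bit 31.
        long mask = packed >>> 6;

        // Assumed combine step: the bottom half of the sum holds hi + lo.
        long sum = packed + top;

        // Assumed select step: use hi + lo when the top half was all zeros,
        // otherwise the top-half count alone is the answer.
        boolean topEmpty = ((mask >>> 31) & 1L) != 0;
        return (int) (topEmpty ? sum : top);
    }

    public static void main(String[] args) {
        long[] samples = { 0L, 1L, 0xFFFFFFFFL, 1L << 32, -1L };
        for (long v : samples) {
            System.out.println(Long.toHexString(v) + " -> " + nlzLongFromInts(v)
                    + " (expected " + Long.numberOfLeadingZeros(v) + ")");
        }
    }
}
```

With a shift of 6 bytes (48 bits) the flag would be shifted out of the lane entirely, so the comment should say bits.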

test/hotspot/jtreg/compiler/vectorization/TestNumberOfContinuousZeros.java line 49:

> 47: 
> 48: public class TestNumberOfContinuousZeros {
> 49:     private static final int[] SPECIAL_INT = { 0, 0x01FFFFFF, 0x03FFFFFE, 0x07FFFFFC, 0x0FFFFFF8, 0x1FFFFFF0, 0x3FFFFFE0, 0xFFFFFFFF };

Please also update the copyright year for the file to 2025.

-------------

PR Review: https://git.openjdk.org/jdk/pull/26610#pullrequestreview-3259874081
PR Review Comment: https://git.openjdk.org/jdk/pull/26610#discussion_r2373662506
PR Review Comment: https://git.openjdk.org/jdk/pull/26610#discussion_r2377150533

