RFR: 8349637: Integer.numberOfLeadingZeros outputs incorrectly in certain cases [v3]

Emanuel Peter epeter at openjdk.org
Fri Feb 14 13:29:11 UTC 2025


On Fri, 14 Feb 2025 12:32:46 GMT, Tobias Hartmann <thartmann at openjdk.org> wrote:

>> Jasmine Karthikeyan has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Improve explanation of logic
>
> And a similar issue with byte-size vectors:
> 
> 
> // Run with java -Xbatch -XX:-TieredCompilation TestByte.java
> 
> public class TestByte {
> 
>     public static void test() {
>         byte[] vals = new byte[1024];
>         byte[] results = new byte[1024];
>         for (int i = 0; i < 1024; ++i) {
>             results[i] = (byte)Integer.numberOfLeadingZeros(vals[i]);
>         }
>         for (int i = 0; i < 1024; ++i) {
>             if (results[i] != 32) throw new RuntimeException("Wrong result");
>         }
>     }
> 
>     public static void main(String[] args) {
>         for (int i = 0; i < 10_000; ++i) {
>             test();
>         }
>     }
> }

@TobiHartmann 
> I also noticed that the code does not vectorize without the (short) cast, i.e., when using an integer result array.

Right, this is currently an auto-vectorizer limitation that @jaskarth is working on here:
https://github.com/openjdk/jdk/pull/23413

Example:

public class TestBI {

    public static void test() {
        byte[] vals = new byte[1024];
        int[] results = new int[1024];
        for (int i = 0; i < 1024; ++i) {
            results[i] = Integer.numberOfLeadingZeros(vals[i]);
        }
        for (int i = 0; i < 1024; ++i) {
            if (results[i] != 32) throw new RuntimeException("Wrong result");
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; ++i) {
            test();
        }
    }
}

Run with:
`java -Xbatch -XX:-TieredCompilation -XX:UseAVX=2 -XX:CompileCommand=compileonly,TestBI::test -XX:+TraceNewVectors -XX:CompileCommand=printassembly,TestBI::test -XX:CompileCommand=TraceAutoVectorization,TestBI::test,ALL TestBI.java`

You can see that the auto-vectorizer rejects all three packs as not profitable, because there is no cast between the `byte` packs and the `int` packs:

WARNING: Removed pack: not profitable:
    0:  672  LoadB  === 708 57 673  [[ 671 ]]  @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; #byte !orig=568,438,184 !jvms: TestBI::test @ bci:25 (line 9)
    1:  686  LoadB  === 708 57 687  [[ 685 ]]  @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; #byte !orig=579,184 !jvms: TestBI::test @ bci:25 (line 9)
    2:  695  LoadB  === 708 57 696  [[ 694 ]]  @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; #byte !orig=438,184 !jvms: TestBI::test @ bci:25 (line 9)
    3:  692  LoadB  === 708 57 693  [[ 691 ]]  @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; #byte !orig=184 !jvms: TestBI::test @ bci:25 (line 9)
    4:  568  LoadB  === 708 57 569  [[ 567 ]]  @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; #byte !orig=438,184 !jvms: TestBI::test @ bci:25 (line 9)
    5:  579  LoadB  === 708 57 580  [[ 578 ]]  @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; #byte !orig=184 !jvms: TestBI::test @ bci:25 (line 9)
    6:  438  LoadB  === 708 57 439  [[ 437 ]]  @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; #byte !orig=184 !jvms: TestBI::test @ bci:25 (line 9)
    7:  184  LoadB  === 708 57 182  [[ 185 ]]  @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=6; #byte !jvms: TestBI::test @ bci:25 (line 9)

WARNING: Removed pack: not profitable:
    0:  671  CountLeadingZerosI  === _ 672  [[ 669 ]]  !orig=567,437,185 !jvms: TestBI::test @ bci:26 (line 9)
    1:  685  CountLeadingZerosI  === _ 686  [[ 668 ]]  !orig=578,185 !jvms: TestBI::test @ bci:26 (line 9)
    2:  694  CountLeadingZerosI  === _ 695  [[ 667 ]]  !orig=437,185 !jvms: TestBI::test @ bci:26 (line 9)
    3:  691  CountLeadingZerosI  === _ 692  [[ 666 ]]  !orig=185 !jvms: TestBI::test @ bci:26 (line 9)
    4:  567  CountLeadingZerosI  === _ 568  [[ 565 ]]  !orig=437,185 !jvms: TestBI::test @ bci:26 (line 9)
    5:  578  CountLeadingZerosI  === _ 579  [[ 564 ]]  !orig=185 !jvms: TestBI::test @ bci:26 (line 9)
    6:  437  CountLeadingZerosI  === _ 438  [[ 435 ]]  !orig=185 !jvms: TestBI::test @ bci:26 (line 9)
    7:  185  CountLeadingZerosI  === _ 184  [[ 204 ]]  !jvms: TestBI::test @ bci:26 (line 9)

WARNING: Removed pack: not profitable:
    0:  669  StoreI  === 708 709 670 671  [[ 668 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=7;  Memory: @int[int:1024] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=7; !orig=565,435,204,[263],[272] !jvms: TestBI::test @ bci:29 (line 9)
    1:  668  StoreI  === 708 669 680 685  [[ 667 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=7;  Memory: @int[int:1024] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=7; !orig=564,204,[263],[272] !jvms: TestBI::test @ bci:29 (line 9)
    2:  667  StoreI  === 708 668 677 694  [[ 666 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=7;  Memory: @int[int:1024] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=7; !orig=435,204,[263],[272] !jvms: TestBI::test @ bci:29 (line 9)
    3:  666  StoreI  === 708 667 674 691  [[ 565 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=7;  Memory: @int[int:1024] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=7; !orig=204,[263],[272] !jvms: TestBI::test @ bci:29 (line 9)
    4:  565  StoreI  === 708 666 566 567  [[ 564 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=7;  Memory: @int[int:1024] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=7; !orig=435,204,[263],[272] !jvms: TestBI::test @ bci:29 (line 9)
    5:  564  StoreI  === 708 565 570 578  [[ 435 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=7;  Memory: @int[int:1024] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=7; !orig=204,[263],[272] !jvms: TestBI::test @ bci:29 (line 9)
    6:  435  StoreI  === 708 564 436 437  [[ 204 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=7;  Memory: @int[int:1024] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=7; !orig=204,[263],[272] !jvms: TestBI::test @ bci:29 (line 9)
    7:  204  StoreI  === 708 435 202 185  [[ 709 207 420 ]]  @int[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=7;  Memory: @int[int:1024] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=7; !orig=[263],[272] !jvms: TestBI::test @ bci:29 (line 9)


So we would have to add in a `VectorCast` between the packs of `LoadB` and `CountLeadingZerosI`.
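
To illustrate what such a cast would have to do, here is a rough scalar model of the vectorized loop. It is only a sketch under my own assumptions, not C2 code: the class and method names (`VectorCastSketch`, `castB2I`, `countLeadingZerosI`, `vectorizedModel`) are hypothetical, and `LANES = 8` is picked to match the eight-element packs in the trace above. Each "vector" is modeled as a plain `int[]`.

```java
public class VectorCastSketch {
    // Assumption: 8 int lanes, matching the eight-element packs in the trace.
    static final int LANES = 8;

    // Models the missing byte -> int cast: every lane is sign-extended,
    // which is what the scalar code does implicitly when passing a byte
    // to Integer.numberOfLeadingZeros(int).
    static int[] castB2I(byte[] src, int offset) {
        int[] lanes = new int[LANES];
        for (int l = 0; l < LANES; l++) {
            lanes[l] = src[offset + l];
        }
        return lanes;
    }

    // Models a lane-wise CountLeadingZerosI on int lanes.
    static int[] countLeadingZerosI(int[] lanes) {
        int[] out = new int[lanes.length];
        for (int l = 0; l < lanes.length; l++) {
            out[l] = Integer.numberOfLeadingZeros(lanes[l]);
        }
        return out;
    }

    // Pack-wise main loop (load, cast, count, store) plus a scalar tail.
    static void vectorizedModel(byte[] vals, int[] results) {
        int i = 0;
        for (; i + LANES <= vals.length; i += LANES) {
            int[] widened = castB2I(vals, i);             // LoadB pack + cast
            int[] clz = countLeadingZerosI(widened);      // CountLeadingZerosI pack
            System.arraycopy(clz, 0, results, i, LANES);  // StoreI pack
        }
        for (; i < vals.length; i++) {
            results[i] = Integer.numberOfLeadingZeros(vals[i]);
        }
    }

    public static void main(String[] args) {
        byte[] vals = new byte[1024];
        vals[1] = 1;
        vals[2] = -1;
        vals[3] = 127;
        int[] results = new int[1024];
        vectorizedModel(vals, results);
        // The model must agree with the plain scalar loop for every element.
        for (int i = 0; i < vals.length; i++) {
            if (results[i] != Integer.numberOfLeadingZeros(vals[i])) {
                throw new RuntimeException("Wrong result at " + i);
            }
        }
    }
}
```

The point of the model is that the count has to happen on the widened `int` lanes: a zero byte must yield 32 and a negative byte must yield 0, exactly as in the scalar loop.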

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23579#issuecomment-2659337152

