Long256Vector::mul is not compiling correctly

Adam Pocock adam.pocock at oracle.com
Mon Aug 13 18:49:58 UTC 2018


Below is a minimal example where the output of the multiply changes. Run 
with the argument 10000 (which is sufficient to trigger compilation on 
my desktop), start and end are different. With fewer iterations (or with 
-Xint or -XX:TieredStopAtLevel=3) start and end are the same.

With print compilation turned on I get:
...
At itr 9475, drew 
2885783399502042673,904462311702380608,-5120899773137696113,3753560392746735454
     308  502       3       jdk.incubator.vector.Long256Vector::mul (12 
bytes)   made not entrant
     308  577       4       jdk.incubator.vector.LongVector::toArray (18 
bytes)
At itr 9476, drew 
2885783399502042673,904462311702380608,-5120899773137696113,3753560392746735454
At itr 9477, drew 
2885783399502042673,904462311702380608,2886375887584873103,910387193500211038
...

Thanks,

Adam

import jdk.incubator.vector.LongVector;
import jdk.incubator.vector.LongVector.LongSpecies;
import jdk.incubator.vector.Shapes;
import jdk.incubator.vector.Shapes.S256Bit;
import jdk.incubator.vector.Vector;

import java.util.Arrays;

public class LongVectorMulTest {
     private static final long FIRST_CONSTANT = 0xbf58476d1ce4e5b9L;

     public static void main(String[] args) {
         int numRepeats = Integer.parseInt(args[0]);

         long[] input = new long[]{12345,123456,1234567,12345678};

         LongSpecies<S256Bit> lSpec = (LongSpecies<S256Bit>) 
Vector.species(long.class,Shapes.S_256_BIT);
         long[] start = 
lSpec.fromArray(input,0).mul(FIRST_CONSTANT).toArray();
         long[] end = null;
         for (int i = 0; i < numRepeats; i++) {
             LongVector<S256Bit> output = 
lSpec.fromArray(input,0).mul(FIRST_CONSTANT);
             end = output.toArray();
             print(end, i);
         }
         System.out.println("Start = " + Arrays.toString(start));
         System.out.println("End = " + Arrays.toString(end));
     }

     private static void print(long[] values, int iter) {
         StringBuilder builder = new StringBuilder();
         for (int j = 0; j < values.length; j++) {
             builder.append(values[j]);
             builder.append(',');
         }
         builder.deleteCharAt(builder.length()-1);
         System.out.println("At itr " + iter + ", drew " + 
builder.toString());
     }
}

On 13/08/18 11:37, Adam Pocock wrote:
> Long256Vector::mul produces different output once C2 has compiled it. 
> I've been running down some non-determinism in my vector version of 
> SplittableRandom, and the behaviour changes when C2 compiles it. I 
> noticed this as the stream of random numbers changes when C2 compiles 
> Long256Vector::mul(long). Only the latter half of the vector is 
> affected, the first two lanes still produce the same output. I'm 
> pretty sure it's a C2 issue as running with -Xint or 
> -XX:TieredStopAtLevel=3 makes the output repeatable across runs.
>
> I've attached two runs with -XX:+PrintCompilation turned on. At line 
> 5794 (itr 4920) in the first file (equivalent point is line 5774 in 
> the second file) Long256Vector::mul is compiled, and after that point 
> the second half of the vector output changes between runs. At line 
> 7427 (itr 6561) in the second file Long256Vector::mul is compiled 
> (equivalent point is line 7463 in the first file), and after that all 
> the draws are the same. This indicates that the seeds for the RNG area 
> always updated correctly as otherwise the runs would diverge after 
> compilation.
>
> This suggests to me that the latter two lanes of C2 compiled 
> Long256Vector::mul(long) aren't correct, as comparing an interpreted 
> or C1 run against a run with C2 the latter two lanes diverge 
> permanently after C2 compilation.
>
> The JVM reports:
>     openjdk version "12-internal" 2019-03-19
>     OpenJDK Runtime Environment (build 
> 12-internal+0-adhoc.apocock.panama)
>     OpenJDK 64-Bit Server VM (build 
> 12-internal+0-adhoc.apocock.panama, mixed mode)
>
> and hg log shows the top commit for vectorIntrinsics as:
>     changeset:   51784:4408db20792c
>     branch:      vectorIntrinsics
>     tag:         tip
>     parent:      51782:1700bec0e3b5
>     user:        rkandu
>     date:        Wed Aug 01 06:43:33 2018 -0700
>     summary:     SVML Linux .s files and Windows build fix
>
> The only lines which use mul on a Long256Vector are in the vectorised 
> mix64:
>     private static final long FIRST_CONSTANT = 0xbf58476d1ce4e5b9L;
>     private static final long SECOND_CONSTANT = 0x94d049bb133111ebL;
>     private LongVector<S> mix64(LongVector<S> input) {
>         input = input.xor(input.shiftR(30)).mul(FIRST_CONSTANT);
>         input = input.xor(input.shiftR(27)).mul(SECOND_CONSTANT);
>         return input.xor(input.shiftR(31));
>     }
>
> My CPU is a Core-i7 6700k, so I expect everything to be using AVX2, 
> though I haven't verified it's emitting those instructions.
>
> PS: @Razvan, you were right, my BinarySearch was bugged, I was using 
> lessThan rather than lessThanEq to update the loop condition mask. 
> Once I fixed that it produces identical output to the sequential 
> version I was using.
>
> Thanks,
>
> Adam
>

-- 
Adam Pocock
Principal Member of Technical Staff
Machine Learning Research Group
Oracle Labs, Burlington, MA



More information about the panama-dev mailing list