class SplittableRandom

Peter Levart peter.levart at
Mon Jul 29 18:29:18 UTC 2013

On 07/16/2013 06:01 PM, Guy Steele wrote:
> On Jul 16, 2013, at 3:07 AM, Martin Buchholz <martinrb at> wrote:
>> Read Appleby and Stafford ...
>> Hmmm.... mix32 has almost the same job as mix64 - have all the bits in the seed affect all the bits in the output, so the obvious implementation is
>> mix32() { return (int) mix64(); }
>> or more conservatively
>> mix32() { m = mix64(); return (int)m ^ (int)(m >>> 32); }
>> although it would be slightly distressing to have nextInt be more expensive than nextLong.
> Yes---at first I used exactly
>     mix32() { return (int) mix64(); }
> but was worried about the time it took, which is why I searched for a cheaper mixing function.
>> There might be some clever way of doing murmurhash-style mixing when the input is 64-bit and the output is 32-bit that can do better, but I don't think the current mix32 is it.
> Again, I was astonished that so simply a mixing function seemed to do the job as far as Dieharder is concerned.  I agree that it does appear to perform poorly for sparse gamma values.  So now the question is: do such sparse gamma values arise in practice?
> As Doug has pointed out, we carefully made the two-argument constructor private so that the user cannot specify gamma values.  The gamma values are derived only from the gamma-seed 0, using the gamma value GAMMA_GAMMA, and always running the seed through mix64.
> I wrote a little test (code and output below) to generate the first 16 billion (well, 2^34+1) gamma values actually generated and gather certain statistics:
> (1) the number of 1-bits in the gamma value (output of Long.bitCount)
> (2) the "width" of the gamma value (64 - Long.numberOfLeadingBits(gamma) - Long.numberOfTrailingBits(gamma))
> Then it histograms the bit counts, and for each possible bit count tracks the minimum width for gammas having that same bit count.
> Whenever the number of of gammas examined is one more than a power of 2, it prints a report consisting of those two tables and the min over the minwidth table.
> Here's a brief summary:
> For the first 65 gamma values, no gamma value has fewer than 21 1-bits or more than 41 1-bits.  The minimum width is 57.  So I predict there will be no weird mixing problems at all when using SplittableRandom to do a balanced split of a stream of generated randoms.
> For the first 2^16+1 = 65537 gamma values, the minimum bit count is 15, the maximum bit count is 50 and the minimum width is 45.
> For the first 2^24+1 gamma values, the minimum bit count is 12, the maximum bit count is 52 and the minimum width is 35.
> For the first 2^32+1 gamma values, the minimum bit count is 9, the maximum bit count is 55 and the minimum width is 28.
> For the first 2^34+1 gamma values, the minimum bit count is 8, the maximum bit count is 56 and the minimum width is 28.
> So I think we will not hit any really bad gamma values in practice.
> --Guy


Would it be possible to skip gammas with less than 12 or more than 52 bits?

I did some playing with dieharder a week ago, testing SR.nextInt() 
(mix32) and indeed for sparse gamma values, all dieharder tests show 
failure. In particular test id=1 (diehard_operm5) is the least 
forgiving, requiring minimum gamma bit count being about 11 and maximum 
about 54 in order for such "random" gamma to produce nextInt() bit 
streams that pass the diehard_operm5 test. Here's a sample of the first 
3 dieharder tests (0,1 and 2) for differently sparsed "random" gammas:

The alternative:

     public int nextIntAlt1() {
         return (int) mix64(nextSeed());

... does not exhibit this weakness. It seems to be satisfied with gammas 
having as few as 1 or as many as 63 bits set to pass dieharder tests:

But such nextInt is slower, approximately as fast as 
ThreadLocalRandom.nextInt() by my measurements.

Random streams created by SplittableRandom.nextLong() (mix64) also don't 
exhibit weaknesses with sparse gammas, so I thought why not using 
mix64() to produce 64 bits of randomness that is then returned by two 
consecutive calls to nextInt, like that:

     private int nextInt;
     private boolean nextIntZero;

     public int nextIntAlt2() {
         if (nextInt != 0) {
             int result = nextInt;
             nextInt = 0;
             return result;
         } else if (nextIntZero) {
             nextIntZero = false;
             return 0;
         } else {
             long nextLong = mix64(nextSeed());
             nextInt = (int)(nextLong >>> 32);
             if (nextInt == 0) nextIntZero = true;
             return (int)(nextLong & 0xFFFFFFFFL);

The (little-Indian) bitstream produced by nextIntAlt2() is exactly the 
same as that produced by nextLong() and 1st 3 dieharder tests pass with 
sparse gammas too:

  the measurements produced by micro benchmark show performance 
somewhere between nextInt() and nextIntAlt1():

Benchmark                      Thr    Cnt  Sec         Mean Mean 
error          Var    Units
s.g.t.JmhTest.SR_nextInt         1      8    5 477779.670     1151.048   
865504.410 ops/msec
s.g.t.JmhTest.SR_nextIntAlt1     1      8    5 385630.764      513.779   
172439.034 ops/msec
s.g.t.JmhTest.SR_nextIntAlt2     1      8    5 432071.940      572.221   
213899.640 ops/msec
s.g.t.JmhTest.TLR_nextInt        1      8    5 382334.416       
28.907      545.871 ops/msec

Here's the JMH micro benchmark I used:

Here's the code I used to produce dieharder reports:

which uses the following utility class to interface dieharder tool:

Regards, Peter

More information about the core-libs-dev mailing list