class SplittableRandom

Tue Jul 30 14:25:45 UTC 2013

Hi Peter,

Given that the user cannot specify gamma values and the for the first  2^16+1 gamma values the minimum bit count is 15, the maximum bit count is 50 and the minimum width is 45, is that not sufficient? i.e. is it likely there will be more than 65537 splits? 

In the streams API one would be hard pressed to reach this limit, even for splitting an iterator. If i calculated things correctly with an arithmetic progression step size of 1 the limits of the max collection size will be reached (currently we use a step size of 1024) and one is likely to hit other problems way before due to issues with reduction on right-heavy trees.

Paul.

On Jul 29, 2013, at 7:29 PM, Peter Levart <peter.levart at gmail.com> wrote:

> 
> On 07/16/2013 06:01 PM, Guy Steele wrote:
>> On Jul 16, 2013, at 3:07 AM, Martin Buchholz <martinrb at google.com> wrote:
>>> Read Appleby and Stafford ...
>>> 
>>> Hmmm.... mix32 has almost the same job as mix64 - have all the bits in the seed affect all the bits in the output, so the obvious implementation is
>>> 
>>> mix32() { return (int) mix64(); }
>>> or more conservatively
>>> mix32() { m = mix64(); return (int)m ^ (int)(m >>> 32); }
>>> although it would be slightly distressing to have nextInt be more expensive than nextLong.
>> Yes---at first I used exactly
>>    mix32() { return (int) mix64(); }
>> but was worried about the time it took, which is why I searched for a cheaper mixing function.
>> 
>>> There might be some clever way of doing murmurhash-style mixing when the input is 64-bit and the output is 32-bit that can do better, but I don't think the current mix32 is it.
>> Again, I was astonished that so simply a mixing function seemed to do the job as far as Dieharder is concerned.  I agree that it does appear to perform poorly for sparse gamma values.  So now the question is: do such sparse gamma values arise in practice?
>> 
>> As Doug has pointed out, we carefully made the two-argument constructor private so that the user cannot specify gamma values.  The gamma values are derived only from the gamma-seed 0, using the gamma value GAMMA_GAMMA, and always running the seed through mix64.
>> 
>> I wrote a little test (code and output below) to generate the first 16 billion (well, 2^34+1) gamma values actually generated and gather certain statistics:
>> (1) the number of 1-bits in the gamma value (output of Long.bitCount)
>> (2) the "width" of the gamma value (64 - Long.numberOfLeadingBits(gamma) - Long.numberOfTrailingBits(gamma))
>> Then it histograms the bit counts, and for each possible bit count tracks the minimum width for gammas having that same bit count.
>> Whenever the number of of gammas examined is one more than a power of 2, it prints a report consisting of those two tables and the min over the minwidth table.
>> 
>> Here's a brief summary:
>> 
>> For the first 65 gamma values, no gamma value has fewer than 21 1-bits or more than 41 1-bits.  The minimum width is 57.  So I predict there will be no weird mixing problems at all when using SplittableRandom to do a balanced split of a stream of generated randoms.
>> 
>> For the first 2^16+1 = 65537 gamma values, the minimum bit count is 15, the maximum bit count is 50 and the minimum width is 45.
>> 
>> For the first 2^24+1 gamma values, the minimum bit count is 12, the maximum bit count is 52 and the minimum width is 35.
>> 
>> For the first 2^32+1 gamma values, the minimum bit count is 9, the maximum bit count is 55 and the minimum width is 28.
>> 
>> For the first 2^34+1 gamma values, the minimum bit count is 8, the maximum bit count is 56 and the minimum width is 28.
>> 
>> So I think we will not hit any really bad gamma values in practice.
>> 
>> --Guy
>> 
> 
> Hi,
> 
> Would it be possible to skip gammas with less than 12 or more than 52 bits?
> 
> I did some playing with dieharder a week ago, testing SR.nextInt() (mix32) and indeed for sparse gamma values, all dieharder tests show failure. In particular test id=1 (diehard_operm5) is the least forgiving, requiring minimum gamma bit count being about 11 and maximum about 54 in order for such "random" gamma to produce nextInt() bit streams that pass the diehard_operm5 test. Here's a sample of the first 3 dieharder tests (0,1 and 2) for differently sparsed "random" gammas:
> 
> https://raw.github.com/plevart/lambda-hacks/SplittableRandom/SplittableRandom/nextInt_results_0to2.txt
> 
> The alternative:
> 
>    public int nextIntAlt1() {
>        return (int) mix64(nextSeed());
>    }
> 
> ... does not exhibit this weakness. It seems to be satisfied with gammas having as few as 1 or as many as 63 bits set to pass dieharder tests:
> 
> https://raw.github.com/plevart/lambda-hacks/SplittableRandom/SplittableRandom/nextIntAlt1_results_0to2.txt
> 
> But such nextInt is slower, approximately as fast as ThreadLocalRandom.nextInt() by my measurements.
> 
> Random streams created by SplittableRandom.nextLong() (mix64) also don't exhibit weaknesses with sparse gammas, so I thought why not using mix64() to produce 64 bits of randomness that is then returned by two consecutive calls to nextInt, like that:
> 
>    private int nextInt;
>    private boolean nextIntZero;
> 
>    public int nextIntAlt2() {
>        if (nextInt != 0) {
>            int result = nextInt;
>            nextInt = 0;
>            return result;
>        } else if (nextIntZero) {
>            nextIntZero = false;
>            return 0;
>        } else {
>            long nextLong = mix64(nextSeed());
>            nextInt = (int)(nextLong >>> 32);
>            if (nextInt == 0) nextIntZero = true;
>            return (int)(nextLong & 0xFFFFFFFFL);
>        }
>    }
> 
> The (little-Indian) bitstream produced by nextIntAlt2() is exactly the same as that produced by nextLong() and 1st 3 dieharder tests pass with sparse gammas too:
> 
> https://raw.github.com/plevart/lambda-hacks/SplittableRandom/SplittableRandom/nextIntAlt2_results_0to2.txt
> 
> the measurements produced by micro benchmark show performance somewhere between nextInt() and nextIntAlt1():
> 
> Benchmark                      Thr    Cnt  Sec         Mean Mean error          Var    Units
> s.g.t.JmhTest.SR_nextInt         1      8    5 477779.670     1151.048   865504.410 ops/msec
> s.g.t.JmhTest.SR_nextIntAlt1     1      8    5 385630.764      513.779   172439.034 ops/msec
> s.g.t.JmhTest.SR_nextIntAlt2     1      8    5 432071.940      572.221   213899.640 ops/msec
> s.g.t.JmhTest.TLR_nextInt        1      8    5 382334.416       28.907      545.871 ops/msec
> 
> 
> Here's the JMH micro benchmark I used:
> 
> https://github.com/plevart/lambda-hacks/blob/SplittableRandom/SplittableRandom/src/sr/JmhTest.java
> 
> 
> Here's the code I used to produce dieharder reports:
> 
> https://github.com/plevart/lambda-hacks/blob/SplittableRandom/SplittableRandom/src/sr/Test.java
> 
> which uses the following utility class to interface dieharder tool:
> 
> https://github.com/plevart/lambda-hacks/blob/SplittableRandom/SplittableRandom/src/si/pele/dieharder/DieharderTest.java
> 
> 
> Regards, Peter
>