class SplittableRandom

Tue Jul 16 18:57:24 UTC 2013

On Jul 15, 2013, at 4:13 PM, Martin Buchholz <martinrb at google.com> wrote:
> 
> On Mon, Jul 15, 2013 at 9:59 AM, Martin Buchholz <martinrb at google.com>wrote:
>> 
>> Also, if gamma is much less than 2^64 (say 2^56; that number of gammas
>> "should be enough for anybody"), then the above will be a little more
>> efficient since the wraparound code will be rare and well-predicted.  The
>> bits that become available as a result can then be optionally donated to
>> the seed space!

I just looked into this.  It's not a good idea to generate 64-bit gamma values and
then filter them; it's too costly to do rejection sampling, and if you just throw away
bits, you risk getting duplicate gamma values, which is undesirable.

So instead I took the latest version and modified the generation of gamma values
to use a 56-bit GAMMA_GAMMA value and perform arithmetic modulo 2^56-5
(I will call this prime "Arthur").  I also threw together a 56-bit mixing function "mix56"
by simply adding a little bit of masking to mix64; I emphasize, however, that I have not
yet performed avalanche testing on mix56.  The point of mix56 is that it is (I think)
a bijective function of 56-bit values, whereas applying mix64 and then discarding
8 bits would not be.  Therefore all the generated gamma values should be different.

    private static final long GAMMA_PRIME = (1L << 56) - 5;       // Arthur, a prime number
    private static final long GAMMA_GAMMA = 0x00281E2DBA6606F3L;  // must be less than Arthur

    private GreatestRandom(long seed, long splitSeed) {
        this.seed = seed;
        long s = (splitSeed & 0x7FFFFFFFFFFFFFFFL) % GAMMA_PRIME, g;
        do { // ensure gamma >= 13, considered as an unsigned integer
            s += GAMMA_GAMMA;
	    if (s >= GAMMA_PRIME) s -= GAMMA_PRIME;
            g = mix56(s);
        } while (g >= 0L && g < 13L);
        this.gamma = g;
        this.nextSplit = s;
    }
    private static long mix56(long z) {
        z ^= (z >>> 33);
        z *= 0xff51afd7ed558ccdL;
	z &= 0x00FFFFFFFFFFFFFFL;
        z ^= (z >>> 33);
        z *= 0xc4ceb9fe1a85ec53L;
	z &= 0x00FFFFFFFFFFFFFFL;
        z ^= (z >>> 33);
        return z;
    }

So here's the punchline: this change makes my Monte Carlo calculation of pi run over 19% faster!
Hard to believe.  It's certainly avoiding most of the hard work in addGammaModGeorge,
and perhaps also making branch prediction totally win, but 19% ???  Someone please verify this
independently.

Still to be done: (a) verify avalanche behavior of mix56; (b) run more DieHarder tests.

--Guy