class SplittableRandom

Tue Jul 16 07:07:39 UTC 2013

On Mon, Jul 15, 2013 at 2:05 PM, Guy Steele <guy.steele at oracle.com> wrote:

>
> On Jul 15, 2013, at 4:13 PM, Martin Buchholz <martinrb at google.com> wrote:
>
> > I'll be your loyal adversary today, trying to break SplittableRandom.>
> mix32 looks too simple to me to do "enough mixing".  I'm surprised running
> > nextInt through DieHarder would pass - it's close to linear congruential.
>
> I was surprised, too, or perhaps I should say gratified, that this worked.
> I was searching semi-methodically for just such a simple mixing function,
> testing them by examining avalanche statistics.  The low-order 32 bits
> produced
> by mix32 are pretty bad, but the upper 32 bits appear to be quite well
> mixed.
> It's amazing what a little bit of XORing does to disrupt the uniformities
> of
> the linear-congruential method.
>
>
Read Appleby and Stafford ...

Hmmm.... mix32 has almost the same job as mix64 - have all the bits in the
seed affect all the bits in the output, so the obvious implementation is

mix32() { return (int) mix64(); }
or more conservatively
mix32() { m = mix64(); return (int)m ^ (int)(m >>> 32); }
although it would be slightly distressing to have nextInt be more expensive
than nextLong.

There might be some clever way of doing murmurhash-style mixing when the
input is 64-bit and the output is 32-bit that can do better, but I don't
think the current mix32 is it.

> You also wrote:
>
> > The algorithm in SplittableRandom is much better, although even here I'm
> not
> > sure we have enough bits of state to satisfy all non-cryptographic uses.
>
> > I continue to worry that we are producing a PRNG that will end up not
> being
> > acceptable to the Monte Carlo simulation community.
>
> And I agree.  I comfort myself with the thought that even though this
> algorithm
> does not allow any single object to generate the same 64-bit value twice
> consecutively,
> that observation does not apply to double values (which use only 53 bits
> apiece).
> For applications that use doubles rather than longs, SplittableRandom may
> be okay.
>
> On the whole, I would really prefer to use 128-bit seeds and 128-bit gamma
> values.
> If only support for 128-bit integers were built into Java, with good use
> of the condition
> codes and add-with-carry instructions at the machine level, this would be
> a very plausible
> thing to do.  I view the currently proposed 64-bit,
> implemented-entirely-in-Java version
> as adequate for a lot of purposes, and way better than java.util.Random
> for a lot of
> purposes, and therefore perhaps easy to accept into JDK8.  But we want to
> avoid
> committing to too specific a mechanism; I certainly don't think this
> algorithm is the
> last word, but rather the first step toward getting much better parallel
> RNGs.
> Maybe for JDK9 we could have 128-bit algorithms supported by tightly
> written
> native machine code at very little extra computational expense, but first
> we need
> to get experience with this class of algorithms.  Just a thought.
>

I agree.  I have long thought that 64-bit randomness isn't quite enough,
and that 128-bits is just about right, but few PRNGs inhabit that space.

---

I suggest code based on the following to check for randomness.  For
gammabits=14, I see dups of diffs in the output, e.g.
     [java] next=e1e77ea6 diff=4c84b7e8
     [java] next=8329b1b9 diff=a1423313
     [java] next=246be4cc diff=a1423313
but not when gammabits=18.  I conjecture that gammabits has to be cranked
up a bit more before it passes dieharder.

    public void testAdversary(int gammabits) {
        Random rnd = new Random();
        long seed = rnd.nextLong();
        long gamma = (rnd.nextLong() | 1L) << (64 - gammabits);
        SplittableRandom r = new SplittableRandom(seed, gamma, new Error());
        System.out.printf("testAdversary(%d)%n", gammabits);
        int prev = 0;
        for (int i = 0; i < 100; i++) {
            int next = r.nextInt();
            System.out.printf("next=%08x diff=%08x%n", next, next - prev);
            prev = next;
        }
    }

    public void testAdversary14() { testAdversary(14); }
    public void testAdversary16() { testAdversary(16); }
    public void testAdversary18() { testAdversary(18); }

---

Here's a version of addGammaModGeorge that uses the slightly clearer
Long.MIN_VALUE + 13L instead of 0x800000000000000DL and generates identical
bytecode.

    private static long addGammaModGeorge(long sx, long g) {
        long px = sx + g;
        return (px >= sx) ? px :
            ((px >= Long.MIN_VALUE + 13L) ? px : px + g) - 13L;
    }