class SplittableRandom

Thu Jul 11 12:16:48 UTC 2013

On 07/11/13 03:36, Peter Levart wrote:
> Hi Doug,
>
> I confess I'm not an expert in PRNGs, but I'll try to ask a hopefully
> no-nonsense question anyway. SplittableRandom seems to be a kind of PR-PRNG-G -
> a PseudoRandom-PseudoRandomGenerator-Generator. In a sense that it's split()
> method returns an instance that represents a PRNG which produces numbers of a
> different sequence than it's parent (it has it's own distinct gamma). So not
> only does a child SplittableRandom instance return numbers from different part
> of same sequence as parent, but from a different sequence altogether.
>
> While it is not obvious from the javadocs, repeatable invocations of split()
> performed on the same instance, return SplittableRandom instances that represent
> the same PRNG (same gamma), just initialized with different seeds. Is this
> important to know?
>

I don't think it's important to know the mechanics as a user of the
class. In fact uses aren't even *allowed* to know them -- we are
careful in the API specs not to promise anything except good
statistical RNG properties.

But your description is mostly accurate: Each instance
seeds its split-off instances. The main algorithmic challenge
is to find a version of this scheme that has good
RNG properties -- both analytically good, and empirically good
via DieHarder tests. SplittableRandom's algorithm does, and is
simpler and faster than any others we know of.

While I'm at it, here are a few follow-ups about
SplittableRandom vs ThreadLocalRandom.

There's no simple  "one is better than the other" argument.
They differ in multiple ways:

* SplittableRandom has better statistical properties.
Among them: The base algorithm in java.util.Random, shared
by ThreadLocalRandom (as well as similar versions used
in C "rand48" and elswehere) is known to have some weaknesses
(it does not pass DieHarder). In particular, lower bits of
consecutive values are less independent than higher bits.
(I've been contemplating re-exploring alternatives in TLR,
but the options are more limited because it is a subclass
of Random. Which made sense at the time I did it, but ...)

* Ignoring memory contention etc, ThreadLocalRandom is
generally faster for nextInt, but slower for long and double methods.

* SplittableRandom applies nicely in "structured parallelism"
contexts (for/join, spliterators, etc). ThreadLocalRandom
applies nicely in most unstructured-async contexts.

* The two classes take different approaches to the memory-effects
issues that haunt us core-parallel/concurrent component developers.
ThreadLocalRandom keeps the state with the thread, which, after
lots of help from Aleksey, works great. SplittableRandom keeps
the state with the computation, which (we hope/predict) also
works great. But neither is always best.

Also, a note to those testing either or both: Especially if
you are running on a multisocket-multicore, be sure to
use either -XX:+UseCondCardMark or -XX:+UseG1GC. Otherwise
the memory contention byproducts of garbage collector
bookkeeping are likely to overwhelm other memory effects.

-Doug