class SplittableRandom

Thu Jul 11 07:36:09 UTC 2013

Hi Doug,

I confess I'm not an expert in PRNGs, but I'll try to ask a hopefully 
no-nonsense question anyway. SplittableRandom seems to be a kind of 
PR-PRNG-G - a PseudoRandom-PseudoRandomGenerator-Generator. In a sense 
that it's split() method returns an instance that represents a PRNG 
which produces numbers of a different sequence than it's parent (it has 
it's own distinct gamma). So not only does a child SplittableRandom 
instance return numbers from different part of same sequence as parent, 
but from a different sequence altogether.

While it is not obvious from the javadocs, repeatable invocations of 
split() performed on the same instance, return SplittableRandom 
instances that represent the same PRNG (same gamma), just initialized 
with different seeds. Is this important to know?

Regards, Peter

On 07/10/2013 09:13 PM, Doug Lea wrote:
> [Note: I'm also posting this on the concurrency-interest list.]
>
>
> We expect that using random numbers in parallel Stream computations
> will be common. (We know it is common in parallel computing in
> general.) But we had left support for it in an unsatisfactory state.
> If you want to create a stream of random numbers to drive a parallel
> computation, you'd choose among two options, neither of them providing
> what you probably want: (1) Use a stream based on a single shared
> java.util.Random object, in which case your program will encounter
> stunning slowdowns when run with many cores; or (2) Use a stream based
> on ThreadLocalRandom, which avoids contention, but gives you no
> control over the use or properties of the per-thread singleton Random
> object. While the ThreadLocalRandom option is great for many purposes,
> you wouldn't want to use it in, say, a high-quality Monte Carlo
> simulation.
>
> Enter Guy Steele. Guy has been working on an algorithm that addresses
> exactly the substantial range of uses not otherwise supported: It is,
> in essence, the Random number generator analog of a Spliterator. Class
> SplittableRandom supports method split() that creates a sub-generator
> that when used in parallel with the original, maintains its
> statistical properties.
>
> When Brian Goetz and I heard that this was nearing completion, we
> entered drop-everything mode to explore whether it could be added now
> in time for JDK8. We conclude that it should. We've been helping with
> JDK-ifying the basic algorithm, integrating java.util.Stream support,
> etc, to enable addition as class java.util.SplittableRandom.
>
> Just to be on the cautious side though, we are for the moment treating
> this in the same way we treat jsr166<n> candidates for potential
> OpenJDK integration. The initial version is available at
> http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/SplittableRandom.java?view=log 
>
> With API docs at:
> http://gee.cs.oswego.edu/dl/jsr166/dist/docs/java/util/SplittableRandom.html 
>
>
> This post serves as a request for comment, with shorter than usual
> turnaround (a couple of days) before considering a request to
> integrate into OpenJDK 8. So, please take a look.
>
> Here are answers to some likely questions:
>
> Q: How much faster is it than java.util.Random?
>
> A: In sequential usages, usually at least twice as fast for long and
> double methods; usually only slightly faster for int methods.  In
> parallel usages, SplittableRandom is almost arbitrarily faster. The
> very first simple parallel Stream program I wrote (to generate and sum
> nextLong()'s) ran 2900 times faster than the java.util.Random
> equivalent on a 32-way machine.
>
> Q: When can/should I use it instead of java.util.Random?
>
> A: Whenever you are not sharing one across Threads. Instances of
> SplittableRandom are not thread-safe. They are designed to be split,
> not shared, across threads.  When class SplittableRandom applies (or
> you can rework your program to make it apply), it is usually a better
> choice.  Not only is it usually faster, it also has better statistical
> independence and uniformity properties.
>
> Q: When can/should I use it instead of 
> java.util.concurrent.ThreadLocalRandom?
>
> A: When you are doing structured fork/join computations, so you can
> explicitly split one rather than relying on the per-thread singleton
> instance.
>
> Q: Why is this in java.util, not java.util.concurrent?
>
> A: Because, like java.util.Spliterator, SplittableRandom is a tool for
> arranging isolated parallel computations that don't entail any
> concurrency control themselves.
>
> Q: Why isn't SplittableRandom a subclass of Random?
>
> A: Class Random requires thread-safety in its spec. It would be
> nonsensical for SplittableRandom to comply.
>
> Q: Why don't you at least come up with a new interface that
> defines methods shared with java.util.Random?
>
> A: We spent a couple of days exploring this. We think it could and
> probably should be done, but not now. Method names and specs of
> SplittableRandom are chosen to make it possible. But we encountered
> enough short-term obstacles to conclude that this is an unwise move
> for JDK8. Among the issues are that we'd need to adjust some specs and
> possibly some code in java.util.Random, and that we are at a loss
> about whether or how to generalize SplittableRandom's added Stream
> methods. In the mean time, it would be more than acceptable for
> SplittableRandom to be used primarily in new code (or new adaptions of
> old code) that wouldn't need or want to be interoperable with code
> using java.util.Random.
>
> Q: Are we going to revisit with SplittableRandom all those memory
> contention issues we saw with ThreadLocalRandom?
>
> A: Most likely not. Most of the memory contention issues surrounding
> ThreadLocalRandom arise because they are long-lived. SplittableRandoms
> will tend to be short-lived. In any case, now that we have the tools
> to cope (@Contended), we can evaluate and adjust if more detailed
> empirical analysis warrants.
>