Class SplittableRandom
Doug Lea
dl at cs.oswego.edu
Wed Jul 10 12:09:57 PDT 2013
[Note: I'll be posting this separately to openjdk core-libs and
concurrency-interest]
We expect that using random numbers in parallel Stream computations
will be common. (We know it is common in parallel computing in
general.) But we had left support for it in an unsatisfactory state.
If you want to create a stream of random numbers to drive a parallel
computation, you'd choose among two options, neither of them providing
what you probably want: (1) Use a stream based on a single shared
java.util.Random object, in which case your program will encounter
stunning slowdowns when run with many cores; or (2) Use a stream based
on ThreadLocalRandom, which avoids contention, but gives you no
control over the use or properties of the per-thread singleton Random
object. While the ThreadLocalRandom option is great for many purposes,
you wouldn't want to use it in, say, a high-quality Monte Carlo
simulation.
Enter Guy Steele. Guy has been working on an algorithm that addresses
exactly the substantial range of uses not otherwise supported: It is,
in essence, the Random number generator analog of a Spliterator. Class
SplittableRandom supports method split() that creates a sub-generator
that when used in parallel with the original, maintains its
statistical properties.
When Brian Goetz and I heard that this was nearing completion, we
entered drop-everything mode to explore whether it could be added now
in time for JDK8. We conclude that it should. We've been helping with
JDK-ifying the basic algorithm, integrating java.util.Stream support,
etc, to enable addition as class java.util.SplittableRandom.
Just to be on the cautious side though, we are for the moment treating
this in the same way we treat jsr166<n> candidates for potential
OpenJDK integration. The initial version is available at
http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/SplittableRandom.java?view=log
With API docs at:
http://gee.cs.oswego.edu/dl/jsr166/dist/docs/java/util/SplittableRandom.html
This post serves as a request for comment, with shorter than usual
turnaround (a couple of days) before considering a request to
integrate into OpenJDK 8. So, please take a look.
Here are answers to some likely questions:
Q: How much faster is it than java.util.Random?
A: In sequential usages, usually at least twice as fast for long and
double methods; usually only slightly faster for int methods. In
parallel usages, SplittableRandom is almost arbitrarily faster. The
very first simple parallel Stream program I wrote (to generate and sum
nextLong()'s) ran 2900 times faster than the java.util.Random
equivalent on a 32-way machine.
Q: When can/should I use it instead of java.util.Random?
A: Whenever you are not sharing one across Threads. Instances of
SplittableRandom are not thread-safe. They are designed to be split,
not shared, across threads. When class SplittableRandom applies (or
you can rework your program to make it apply), it is usually a better
choice. Not only is it usually faster, it also has better statistical
independence and uniformity properties.
Q: When can/should I use it instead of java.util.concurrent.ThreadLocalRandom?
A: When you are doing structured fork/join computations, so you can
explicitly split one rather than relying on the per-thread singleton
instance.
Q: Why is this in java.util, not java.util.concurrent?
A: Because, like java.util.Spliterator, SplittableRandom is a tool for
arranging isolated parallel computations that don't entail any
concurrency control themselves.
Q: Why isn't SplittableRandom a subclass of Random?
A: Class Random requires thread-safety in its spec. It would be
nonsensical for SplittableRandom to comply.
Q: Why don't you at least come up with a new interface that
defines methods shared with java.util.Random?
A: We spent a couple of days exploring this. We think it could and
probably should be done, but not now. Method names and specs of
SplittableRandom are chosen to make it possible. But we encountered
enough short-term obstacles to conclude that this is an unwise move
for JDK8. Among the issues are that we'd need to adjust some specs and
possibly some code in java.util.Random, and that we are at a loss
about whether or how to generalize SplittableRandom's added Stream
methods. In the mean time, it would be more than acceptable for
SplittableRandom to be used primarily in new code (or new adaptions of
old code) that wouldn't need or want to be interoperable with code
using java.util.Random.
Q: Are we going to revisit with SplittableRandom all those memory
contention issues we saw with ThreadLocalRandom?
A: Most likely not. Most of the memory contention issues surrounding
ThreadLocalRandom arise because they are long-lived. SplittableRandoms
will tend to be short-lived. In any case, now that we have the tools
to cope (@Contended), we can evaluate and adjust if more detailed
empirical analysis warrants.
More information about the lambda-libs-spec-experts
mailing list