class SplittableRandom

Aleksey Shipilev aleksey.shipilev at oracle.com
Wed Jul 10 20:49:28 UTC 2013


(juggling an Occam's razor)

Hi Doug,

On 10.07.2013, at 23:13, Doug Lea <dl at cs.oswego.edu> wrote:
> This post serves as a request for comment, with shorter than usual
> turnaround (a couple of days) before considering a request to
> integrate into OpenJDK 8. So, please take a look.

I've glanced over the paper this work is based on, and did a few quick sweeps on SR code. My first reaction is: while interesting, not entirely convincing.

> Here are answers to some likely questions:
> 
> Q: How much faster is it than java.util.Random?
> 
> A: In sequential usages, usually at least twice as fast for long and
> double methods; usually only slightly faster for int methods.  In
> parallel usages, SplittableRandom is almost arbitrarily faster. The
> very first simple parallel Stream program I wrote (to generate and sum
> nextLong()'s) ran 2900 times faster than the java.util.Random
> equivalent on a 32-way machine.

D'oh, of course it is ludicrously faster than j.u.Random. I am not convinced it beats TLR though, given TLR is super-optimized now in JDK 8. Are you really implying you can beat < 10 cycles? I will check that once I get back in office in about 12 hours.


> Q: When can/should I use it instead of java.util.Random?
> 
> A: Whenever you are not sharing one across Threads. Instances of
> SplittableRandom are not thread-safe. They are designed to be split,
> not shared, across threads.  When class SplittableRandom applies (or
> you can rework your program to make it apply), it is usually a better
> choice.  Not only is it usually faster, it also has better statistical
> independence and uniformity properties.

Being the same linear congruental generator with a splash of bit twisting, it does not immediately follow how's the statistical properties are better. If it is actually true, why don't we fix TLR with more robust generation scheme and gain the same properties there?

Also, I am not convinced passing DieHarder on some split sequence (was it really the non-empty sequence?) extends to arbitrary splits.

> 
> Q: When can/should I use it instead of java.util.concurrent.ThreadLocalRandom?
> 
> A: When you are doing structured fork/join computations, so you can
> explicitly split one rather than relying on the per-thread singleton
> instance.

And gain what vs. TLR? That the pseudo-random sequence in each subtask will be oblivious to task scheduling? This seems to be the distinguishing feature, but not really obvious from the docs.

-Aleksey



More information about the core-libs-dev mailing list