[concurrency-interest] class SplittableRandom

Fri Jul 12 03:28:21 PDT 2013

I am not sure if this is an important use-case or not but it is fun to think about.

One solution is to write a zipper for a tuple of {List<T>, rn}, but, when mapping, it would incur a l.get(rn % l.size()) that might result in over-represented values (but perhaps not if the range is a power of 2??).

A better solution is to write your own spliterator that defers to an underlying spliterator, tracks splits, and reports a tuple of {List<T>, SplittableRange} when traversed e.g:

  Stream<List<List<Integer>> s = ...
  withSplittableRandom(s, new SplittableRandom()).map(t -> a().get(t.b().nextInt(t.a().size()))

The withSplittableRandom would do something like:

  return StreamSupport.stream(createWrappingSplittableRandomSpliterator(s.spliterator(), sr), false);

I have not tested the code below but i think it should work if you want to play with that.

Paul.

    interface Pair<A, B> {
        A a();
        B b();

        static <A, B> Pair<A, B> of(A a, B b) {
            return new Pair<A, B>() {
                @Override
                public A a() {
                    return a;
                }

                @Override
                public B b() {
                    return b;
                }
            };
        }
    }

    public class WrappingSplittableRandomSpliterator<T> implements Spliterator<Pair<T, SplittableRandom>> {

        final Spliterator<T> s;

        final SplittableRandom sr;

        public WrappingSplittableRandomSpliterator(Spliterator<T> s, SplittableRandom sr) {
            this.s = s;
            this.sr = sr;
        }

        Consumer<T> pairConsumer(Consumer<? super Pair<T, SplittableRandom>> action) {
            return e -> action.accept(Pair.of(e, sr));
        }

        @Override
        public boolean tryAdvance(Consumer<? super Pair<T, SplittableRandom>> action) {
            return s.tryAdvance(pairConsumer(action));
        }

        @Override
        public void forEachRemaining(Consumer<? super Pair<T, SplittableRandom>> action) {
            s.forEachRemaining(pairConsumer(action));
        }

        @Override
        public WrappingSplittableRandomSpliterator<T> trySplit() {
            Spliterator<T> split = s.trySplit();
            return split == null
                   ? null
                   : new WrappingSplittableRandomSpliterator<>(split, sr.split());
        }

        @Override
        public long estimateSize() {
            return s.estimateSize();
        }

        @Override
        public long getExactSizeIfKnown() {
            return s.getExactSizeIfKnown();
        }

        @Override
        public int characteristics() {
            return s.characteristics();
        }

        @Override
        public boolean hasCharacteristics(int characteristics) {
            return s.hasCharacteristics(characteristics);
        }

        @Override
        public Comparator<? super Pair<T, SplittableRandom>> getComparator() {
            Comparator<? super T> c = s.getComparator();
            return c == null
                   ? null
                   : (p1, p2) -> c.compare(p1.a(), p2.a());
        }
    }

On Jul 11, 2013, at 4:29 PM, Peter Levart <peter.levart at gmail.com> wrote:

> On 07/11/2013 01:14 AM, Aaron Grunthal wrote:
>> Would this be of any use in a case like this?
>> 
>> List<List<Int>> a;
>> 
>> a.parallelStream().filter(...).map(e->e.get(random.nextInt(e.size()))).reduce(...) 
>> 
>> 
>> i.e. a filter, random sample, reduce chain?
>> 
>> I don't see a way to basically use two (or N) streams (the source and 
>> the random numbers) and merge them in lockstep at one stage.
>> 
> 
> In the absence of Stream.join(), we would need something like the 
> following in the Stream API:
> 
> Stream<T> {
> 
>         <R, S> Stream<R> splitMap(S seed,
>                                   UnaryOperator<S> splitter,
>                                   BiFunction<? super T, ? super S, ? 
> extends R> mapper);
> 
> 
> Your example would then read:
> 
> List<List<Int>> a = ...;
> 
> a.parallelStream()
> .filter(...)
> .splitMap(
> new SplittableRandom(),
>         sr -> sr.split(),
> (list, sr) -> list.get(sr.nextInt(list.size()))
> )
> .reduce(...)
> 
> 
> But I don't know if such API is usable in any other scenarios though.
> 
> Regards, Peter
> 
> 
>> On 10.07.2013 21:13, Doug Lea wrote:
>>> [Note: I'm also posting this on the openjdk core-libs-dev list.]
>>> 
>>> 
>>> We expect that using random numbers in parallel Stream computations
>>> will be common. (We know it is common in parallel computing in
>>> general.) But we had left support for it in an unsatisfactory state.
>>> If you want to create a stream of random numbers to drive a parallel
>>> computation, you'd choose among two options, neither of them providing
>>> what you probably want: (1) Use a stream based on a single shared
>>> java.util.Random object, in which case your program will encounter
>>> stunning slowdowns when run with many cores; or (2) Use a stream based
>>> on ThreadLocalRandom, which avoids contention, but gives you no
>>> control over the use or properties of the per-thread singleton Random
>>> object. While the ThreadLocalRandom option is great for many purposes,
>>> you wouldn't want to use it in, say, a high-quality Monte Carlo
>>> simulation.
>> 
>> 
>> _______________________________________________
>> Concurrency-interest mailing list
>> Concurrency-interest at cs.oswego.edu
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
> 
>