Zipping Streams...

Sam Pullara spullara at gmail.com
Sun Jan 12 16:15:58 PST 2014


Something like this should work pretty well, I think. I would love a suggestion for getting rid of the side effect in tryAdvance. This might even parallelize?

Sam

package spullara.util;

import java.util.Spliterator;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

/**
 * Zip two streams as efficiently as we can?
 */
public class Zipper {
    public static class Pair<L, R>  {
        final L _1;
        final R _2;

        public Pair(L l, R r) {
            _1 = l;
            _2 = r;
        }
    }

    public static <L, R> Stream<Pair<L, R>> zip(Stream<L> left, Stream<R> right) {
        Spliterator<L> lsplit = left.spliterator();
        Spliterator<R> rsplit = right.spliterator();
        Spliterator<Pair<L, R>> split = new PairSpliterator<>(lsplit, rsplit);
        return StreamSupport.stream(split, false);
    }

    private static class PairSpliterator<L, R> implements Spliterator<Pair<L, R>> {
        private final Spliterator<L> lsplit;
        private final Spliterator<R> rsplit;

        public PairSpliterator(Spliterator<L> lsplit, Spliterator<R> rsplit) {
            this.lsplit = lsplit;
            this.rsplit = rsplit;
        }

        @Override
        public boolean tryAdvance(Consumer<? super Pair<L, R>> action) {
            AtomicBoolean advance = new AtomicBoolean();
            lsplit.tryAdvance(l -> {
                boolean b = rsplit.tryAdvance(r -> {
                    action.accept(new Pair<>(l, r));
                });
                advance.set(b);
            });
            return advance.get();
        }

        @Override
        public Spliterator<Pair<L, R>> trySplit() {
            // Splitting each side independently can produce prefixes of different
            // sizes (or split one side and not the other), which would pair up the
            // wrong elements or drop some, so decline to split.
            return null;
        }

        @Override
        public long estimateSize() {
            // The zipped stream ends when the shorter side runs out.
            return Math.min(lsplit.estimateSize(), rsplit.estimateSize());
        }

        @Override
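        public int characteristics() {
            // Drop SORTED: reporting it obliges getComparator() to be implemented,
            // and pairs have no defined ordering.
            return lsplit.characteristics() & rsplit.characteristics() & ~Spliterator.SORTED;
        }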
    }
}
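
One possible way to drop the AtomicBoolean side effect in tryAdvance is to pull from plain iterators instead of nesting the two tryAdvance calls. The sketch below is only an illustration under a few assumptions: the class name IteratorZip and the combiner parameter are hypothetical (the combiner stands in for the Pair constructor), and the spliterator reports an unknown size and never splits, so it stays sequential.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.BiFunction;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper class, not part of the Zipper class above.
public class IteratorZip {

    // Zips two streams lazily; the result ends when either side is exhausted.
    public static <L, R, T> Stream<T> zip(Stream<L> left, Stream<R> right,
                                          BiFunction<? super L, ? super R, ? extends T> combiner) {
        Iterator<L> lit = left.iterator();
        Iterator<R> rit = right.iterator();

        // Unknown size (Long.MAX_VALUE) and ORDERED only: a conservative choice.
        Spliterator<T> split =
                new Spliterators.AbstractSpliterator<T>(Long.MAX_VALUE, Spliterator.ORDERED) {
            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                // Check both sides first, so no flag has to be carried
                // out of a nested lambda.
                if (lit.hasNext() && rit.hasNext()) {
                    action.accept(combiner.apply(lit.next(), rit.next()));
                    return true;
                }
                return false;
            }
        };
        return StreamSupport.stream(split, false);
    }

    public static void main(String[] args) {
        // The right-hand stream is infinite; the zip stops after the left side ends.
        List<String> pairs = zip(Stream.of("a", "b", "c"),
                                 Stream.iterate(1, i -> i + 1),
                                 (s, n) -> s + n)
                .collect(Collectors.toList());
        System.out.println(pairs); // prints [a1, b2, c3]
    }
}
```

The trade-off is that going through Stream.iterator() gives up any size or splitting information the source spliterators had, so this version is simpler but less likely to parallelize well.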

On Jan 12, 2014, at 3:58 PM, Brent Walker <brenthwalker at gmail.com> wrote:

> Suppose I wanted to implement Haskell's or ML's zip() function for streams
> (assume I have my own Pair<U, T> class).  One easily comes up with the
> following function:
> 
> public static <T, U> Stream<Pair<T, U>> zip(final Stream<T> ts, final Stream<U> us) {
>    @SuppressWarnings("unchecked")
>    final T[] tsVec = (T[]) ts.toArray();
> 
>    @SuppressWarnings("unchecked")
>    final U[] usVec = (U[]) us.toArray();
> 
>    final int siz = Math.min(tsVec.length, usVec.length);
> 
>    return IntStream.range(0, siz).mapToObj(i -> Pair.create(tsVec[i], usVec[i]));
> }
> 
> This works fine for my needs but as a general routine this function has
> issues.  First, it doesn't work at all for infinite streams, and second,
> there are those intermediate data structures it builds (tsVec, and usVec)
> in order to function which are wasteful (at least they look wasteful).
> 
> Is there a better way to implement zip that avoids the above two problems?
> 
> Thanks for any suggestions/ideas etc.
> 
> Brent
> 



More information about the lambda-dev mailing list