Zipping Streams...
Sam Pullara
spullara at gmail.com
Sun Jan 12 16:15:58 PST 2014
Something like this should work pretty well, I think. I would love a suggestion for getting rid of the side effect in tryAdvance. This might even parallelize?
Sam
package spullara.util;

import java.util.Spliterator;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

/**
 * Zip two streams as efficiently as we can?
 */
public class Zipper {

    public static class Pair<L, R> {
        final L _1;
        final R _2;

        public Pair(L l, R r) {
            _1 = l;
            _2 = r;
        }
    }

    public static <L, R> Stream<Pair<L, R>> zip(Stream<L> left, Stream<R> right) {
        Spliterator<L> lsplit = left.spliterator();
        Spliterator<R> rsplit = right.spliterator();
        Spliterator<Pair<L, R>> split = new PairSpliterator<>(lsplit, rsplit);
        return StreamSupport.stream(split, false);
    }

    private static class PairSpliterator<L, R> implements Spliterator<Pair<L, R>> {
        private final Spliterator<L> lsplit;
        private final Spliterator<R> rsplit;

        public PairSpliterator(Spliterator<L> lsplit, Spliterator<R> rsplit) {
            this.lsplit = lsplit;
            this.rsplit = rsplit;
        }

        @Override
        public boolean tryAdvance(Consumer<? super Pair<L, R>> action) {
            AtomicBoolean advance = new AtomicBoolean();
            lsplit.tryAdvance(l -> {
                boolean b = rsplit.tryAdvance(r -> {
                    action.accept(new Pair<>(l, r));
                });
                advance.set(b);
            });
            return advance.get();
        }

        @Override
        public Spliterator<Pair<L, R>> trySplit() {
            Spliterator<L> lSpliterator = lsplit.trySplit();
            Spliterator<R> rSpliterator = rsplit.trySplit();
            return lSpliterator == null ? null : rSpliterator == null ? null : new PairSpliterator<>(lSpliterator, rSpliterator);
        }

        @Override
        public long estimateSize() {
            return lsplit.estimateSize();
        }

        @Override
        public int characteristics() {
            return lsplit.characteristics() & rsplit.characteristics();
        }
    }
}
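
On the parallelize question: one thing to watch in trySplit above is that the two sides split independently, so the prefixes they hand back need not cover the same number of elements. The small check below is only a sketch to illustrate that. The class and method names (SplitMisalignment, prefixSizes, range) are made up for the example, and it assumes the JDK's array spliterators, which as far as I know split at the midpoint of the remaining range.

```java
import java.util.Arrays;
import java.util.Spliterator;

public class SplitMisalignment {

    // Builds the array {0, 1, ..., n - 1}.
    static Integer[] range(int n) {
        Integer[] values = new Integer[n];
        for (int i = 0; i < n; i++) {
            values[i] = i;
        }
        return values;
    }

    // Splits one spliterator per array and reports how many elements each
    // split-off prefix covers (0 if the source refused to split).
    static long[] prefixSizes(int leftLength, int rightLength) {
        Spliterator<Integer> left = Arrays.spliterator(range(leftLength));
        Spliterator<Integer> right = Arrays.spliterator(range(rightLength));
        Spliterator<Integer> leftPrefix = left.trySplit();
        Spliterator<Integer> rightPrefix = right.trySplit();
        return new long[] {
            leftPrefix == null ? 0 : leftPrefix.estimateSize(),
            rightPrefix == null ? 0 : rightPrefix.estimateSize()
        };
    }

    public static void main(String[] args) {
        long[] sizes = prefixSizes(4, 6);
        System.out.println("left prefix: " + sizes[0] + ", right prefix: " + sizes[1]);
    }
}
```

With sources of length 4 and 6 the prefixes should cover 2 and 3 elements. A PairSpliterator built from those prefixes would pair the first two elements of each side and drop the third right-hand element, and the remainder would then pair left elements 2 and 3 with right elements 3 and 4. The same trySplit also loses the left prefix whenever the left side splits but the right side returns null. So unless both sides are known to split at identical positions, returning null from trySplit is probably the safer choice.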
On Jan 12, 2014, at 3:58 PM, Brent Walker <brenthwalker at gmail.com> wrote:
> Suppose I wanted to implement Haskell's or ML's zip() function for streams
> (assume I have my own Pair<U, T> class). One easily comes up with the
> following function:
>
> public static <T, U> Stream<Pair<T, U>> zip(final Stream<T> ts, final
> Stream<U> us) {
> @SuppressWarnings("unchecked")
> final T[] tsVec = (T[]) ts.toArray();
>
> @SuppressWarnings("unchecked")
> final U[] usVec = (U[]) us.toArray();
>
> final int siz = Math.min(tsVec.length, usVec.length);
>
> return IntStream.range(0, siz).mapToObj(i -> Pair.create(tsVec[i],
> usVec[i]));
> }
>
> This works fine for my needs but as a general routine this function has
> issues. First, it doesn't work at all for infinite streams, and second,
> there are those intermediate data structures it builds (tsVec, and usVec)
> in order to function which are wasteful (at least they look wasteful).
>
> Is there a better way to implement zip that avoids the above two problems?
>
> Thanks for any suggestions/ideas etc.
>
> Brent
>
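
For the two problems raised in the quoted message (infinite streams and the intermediate arrays), here is a minimal sequential sketch that wraps both spliterators in iterators via Spliterators.iterator, which also avoids the side-effecting lambda in tryAdvance. It is not the only way to do this. The LazyZip class name is made up for the example, and Map.Entry stands in for Pair only to keep the snippet self-contained. It reports no SIZED, SORTED or DISTINCT characteristics, since none of those obviously carry over to the pairs.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class LazyZip {

    public static <L, R> Stream<Map.Entry<L, R>> zip(Stream<L> left, Stream<R> right) {
        Spliterator<L> lsplit = left.spliterator();
        Spliterator<R> rsplit = right.spliterator();
        Iterator<L> lit = Spliterators.iterator(lsplit);
        Iterator<R> rit = Spliterators.iterator(rsplit);

        // Elements are pulled one pair at a time, so nothing is buffered
        // beyond the single element each iterator adapter holds.
        Iterator<Map.Entry<L, R>> pairs = new Iterator<Map.Entry<L, R>>() {
            @Override
            public boolean hasNext() {
                // Stops as soon as either side runs out.
                return lit.hasNext() && rit.hasNext();
            }

            @Override
            public Map.Entry<L, R> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return new SimpleImmutableEntry<>(lit.next(), rit.next());
            }
        };

        // The zipped stream is ordered only if both inputs are ordered.
        int characteristics =
                lsplit.characteristics() & rsplit.characteristics() & Spliterator.ORDERED;
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(pairs, characteristics), false);
    }

    public static void main(String[] args) {
        // The left stream is infinite; the zip ends when the right one does.
        zip(Stream.iterate(0, i -> i + 1), Stream.of("a", "b", "c"))
                .forEach(e -> System.out.println(e.getKey() + " -> " + e.getValue()));
    }
}
```

The zip still reads one element ahead on the left before it notices that the right side is exhausted, so this hides the side effect rather than removing the look-ahead. It also does not close the two source streams when the zipped stream is closed.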