alternative implementations of Iterable and MapStream

Wed May 16 14:46:38 PDT 2012

The first naive implementation, that I scraped, was based on a doubly-linked 
list of pipe nodes that was built as a result of composing lazy operations 
(.map().filter().sorted()...). There was the main driving loop at the head of 
the this same list that iterated over source elements and pushed them down the 
pipe. The drawbacks of this approach were:

1. the state that was maintained in the buffering nodes (.sorted()) prevented 
multiple concurrent (not necessarily multi threaded) iterations (the pipe was 
"occupied" by a single iteration).
2. half-built pipe could not be used as a "prefix" for building multiple 
different continuations (each parent node could only have a single child)
3. the approach to obtain "first" result was to issue a "long-break" out of 
driving loop using a sub-type of RuntimeException.

Now my current approach is on one hand very similar to the approach with 
Iterables. The building of a chain of "Producables" is similar to building a 
chain of Iterables - at each step the child Producable is created that has a 
reference to the parent - single parent can have multiple children - drawback 
#2 eliminated.

The Producable is a factory for Producers (like Iterable is for Iterators).

The main Producable API is:

public abstract class Producable<T> implements Iterable<T>
{
   public Producer producer(Transformer<? super T> downstream);
   public Producer producer(Consumer<? super T> consumer);
}

If you have a chain of:

producable1 <- producable2 <- ... <- producableN

then a call to producableN.producer(consumer) creates a reverse chain of:

producer1 -> transformer2 -> ... -> transformerN -> consumer

and returns head producer1. So each call to producer() starts an independent 
producing session (eliminated drawback #1).

A Transformer is a Producer and a Consumer:

public interface Producer
{
   /**
    * Try to produce one piece of output. One call to this method produces at most one piece of output,
    * but need not produce any output.
    *
    * @return true if this producer has more output to produce or false if it has exhausted all output
    */
   boolean produce();
}

public interface Consumer<T>
{
   /**
    * Consumes one object.
    *
    * @param t the object to consume
    */
   void consume(T t);
}

public interface Transformer<T> extends Consumer<T>, Producer
{
   /**
    * @return true if this transformer is in a state that allows consuming input so that the
    *         next call to {@link #consume} will not throw {@link IllegalStateException}.
    */
   boolean canConsume();

   /**
    * @param t the object to consume
    * @throws IllegalStateException if this transformer is in a state that doesn't allow consuming
    */
   @Override
   void consume(T t) throws IllegalStateException;
}

This way it's easy to set-up a context where you have a head producer and a 
tail consumer accessible inside a particular custom producing loop so you can 
control how much you produce and when to stop producing (until first result or 
until second or until first that matches, etc.) - eliminated drawback #3

The Producer/Consumer/Transformer API allows piecewise producing in spite of 
Transformers that compress (.filter()), expand (.flatMap()) or buffer the entire 
stream (.sorted()).

The only remaining drawback is that this approach is different than the known 
approach using Iterable API, so it is maybe harder to understand. But this 
could be internal API that is not exposed to the client and it has this single 
advantage or ability to "push" down tuples of values without boxing (a method 
can have multiple parameters but only a single result).

Regards, Peter

On Tuesday, May 15, 2012 02:59:43 PM Brian Goetz wrote:
> We toyed with this as well and had some concerns about the approach.
> Since you claim this approach "has no remaining drawbacks", can you
> outline the list of drawbacks you started with?
> 
> On 5/11/2012 1:00 PM, Peter Levart wrote:
> > Hello lambdanians,
> > 
> > I toyed a little more with the idea of using "push" processing instead of
> > traditional "pull" that is used in current implementations of stream
> > transformation methods of Iterable and MapStream.
> > 
> > It turned out that it can be done and that this approach has a benefit of
> > avoiding box/unbox-ing of pairs of values into/out of BiValue while
> > actualy
> > having no drawbacks (I managed to elliminate all drawbacks of my 1st
> > implementation from a week ago).
> > 
> > If you are interested, have a look at:
> > 
> > https://github.com/plevart/PushPipes
> > 
> > Basic classes are:
> > 
> > Producable - which implements Iterable (only as a facade)
> > MapProducable - which implements MapStream (only as a facade)
> > 
> > both classes override all defenders from Iterable and MapStream with their
> > own "push" implementations.
> > 
> > Currently Producable/MapProducable are abstract classes, but they could
> > easily be transformed into interfaces with defender methods.
> > 
> > 
> > Regards,
> > Peter