Streams design strawman

Sun Apr 22 15:27:57 PDT 2012

My point was rather that it's a supertype of Map, in the same way that 
Iterable is a supertype of Collection.
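
As a rough sketch of that relationship, a Map can be exposed through the Histogram interface quoted further down, much as a Collection can be handed out as an Iterable. The `of` adapter and the `HistogramDemo` class are assumptions made for illustration, not part of any proposed API, and the code uses Java 8 syntax.

```java
import java.util.Map;
import java.util.TreeMap;

// Read-only view with the three accessors from the quoted Histogram interface.
interface Histogram<K, V> {
    Iterable<K> keys();
    Iterable<V> values();
    Iterable<Map.Entry<K, V>> entries();

    // Hypothetical adapter: every Map already offers these three views,
    // which is what makes Histogram "a kind of supertype" of Map.
    static <K, V> Histogram<K, V> of(Map<K, V> map) {
        return new Histogram<K, V>() {
            @Override public Iterable<K> keys() { return map.keySet(); }
            @Override public Iterable<V> values() { return map.values(); }
            @Override public Iterable<Map.Entry<K, V>> entries() { return map.entrySet(); }
        };
    }
}

class HistogramDemo {
    public static void main(String[] args) {
        Map<Integer, String> byLength = new TreeMap<>();
        byLength.put(4, "lazy");
        byLength.put(5, "eager");
        // Only iteration is exposed; none of Map's mutators are visible here.
        for (Map.Entry<Integer, String> e : Histogram.of(byLength).entries()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

A caller holding only a Histogram can iterate but cannot put or remove, which is the same narrowing that Iterable gives over Collection.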

Rémi

On 04/22/2012 07:04 PM, Brian Goetz wrote:
> The more general design principle that we were appealing to is: 
> collections are all about storing values, and the set of operations 
> you have to support on a collection is large.  But it is silly to use 
> a collection as the intermediate value between every operation -- that 
> is wasteful.  For example, we could have had filter and map return new 
> collections, and written things like this:
>
>   Collection<Name> filtered = names.filter(...);
>   Collection<String> mapped = filtered.map(n -> n.getLastName());
>   mapped.sort(...);
>
> But creating the intermediate collections is usually wasteful.  So 
> instead, filter/map return streams:
>
>   SortedSet<String> result = names.filter(...)
>                                   .map(Name::getLastName)
>                                   .into(new TreeSet<>());
>
> Which gives the same final result, but more efficiently and (IMO) more 
> cleanly.
>
> The key observation is: most bulk operations on collections can be 
> expressed in the form
>
>   source - lazy - lazy - lazy - eager
>
> where the "eager" operations are things like forEach, dumping the 
> results into a collection, or some form of reduce.
>
> Grouping might sometimes be the last element in the processing, but 
> very often we want to keep going.  Expressing it as something that 
> produces a stream makes it easier to keep going.  Grouping may benefit 
> less from laziness than filtering, but treating it as a lazy 
> (stream-producing) operation also has benefits.
>
> Our model is that the methods that produce new streams can be lazy, 
> and those that produce concrete results (scalars, collections, etc.) 
> are eager.
>
>
> On 4/22/2012 12:55 PM, Brian Goetz wrote:
>>> So basically it's not a stream but something like this:
>>>
>>> interface Histogram<K,V> {
>>>   Iterable<K> keys();
>>>   Iterable<V> values();
>>>   Iterable<Entry<K,V>> entries();
>>> }
>>>
>>> a kind of super type of a Map.
>>
>> It certainly could be, if we wanted to make it an eager
>> (end-of-stream-pipeline) operation. But it seems more flexible to make
>> it a BiStream-creating operation (even though the values need to be
>> internally buffered, which I think is your underlying point), because
>> then you can keep going with more transformations / reductions on the
>> resulting BiStream. For example, the following produces a Map<Integer,
>> String>, where the keys are word lengths and the values are strings of
>> "word,word,word".
>>
>> words.groupBy(w -> w.length())
>> .mapValues((length, group) -> String.join(",", group))
>> .into(new HashMap<Integer, String>());
>>
>> The group-by operation is rarely the end of what you want to do; usually
>> you want to count, post-process, etc.
>>
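
For comparison, here is a minimal sketch of the two ideas discussed above: the "source - lazy - lazy - eager" shape, and grouping words by length into comma-joined strings. It is written against java.util.stream as it later shipped in Java 8, not the strawman API in this thread, so there is no `into`, `BiStream`, or `mapValues`; grouping appears as a collector with a downstream `joining` step. The class and method names (`PipelineSketch`, `laziness`, `wordsByLength`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PipelineSketch {
    // Records the order in which pipeline stages actually run.
    static List<String> laziness(List<String> words) {
        List<String> log = new ArrayList<>();
        // Lazy stages: building the pipeline processes no elements.
        Stream<String> pipeline = words.stream()
            .filter(w -> { log.add("filter " + w); return w.length() > 3; })
            .map(w -> { log.add("map " + w); return w.toUpperCase(); });
        log.add("built");
        // Eager stage: the terminal operation pulls elements through filter and map.
        pipeline.collect(Collectors.toList());
        return log;
    }

    // Group by word length, then reduce each group to "word,word,word".
    static Map<Integer, String> wordsByLength(List<String> words) {
        return words.stream()
            .collect(Collectors.groupingBy(
                String::length,            // key: word length
                TreeMap::new,              // sorted map, for predictable iteration order
                Collectors.joining(",")));  // per-group post-processing
    }
}
```

Calling `PipelineSketch.laziness(...)` returns a log whose first entry is "built", which shows that the filter and map lambdas only run once the terminal operation starts. In `wordsByLength`, the downstream collector is one way of continuing past the grouping step without materializing an intermediate map of lists.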