Stream from Iterable

Brian Goetz brian.goetz at oracle.com
Tue Dec 11 08:45:16 PST 2012


Stepping back...

There are lots of ways to create a Stream.  The more information you 
have about how to describe the elements, the more functionality and 
performance the streams library can give you.  In order of least to most 
information, they are:

Iterator

Iterator + size

Spliterator

Spliterator that knows its size

Spliterator that knows its size, and further knows that all sub-splits 
know their size.
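
To make the ladder above concrete, here is a minimal sketch. It assumes the API names that eventually shipped in Java 8 (java.util.Spliterators and Spliterator), which differ from the Streams.spliterator(...) call quoted later in this thread. The class and helper names (StreamSources, fromIterator, fromIteratorWithSize, describe) are made up for illustration. It builds a Spliterator at several information levels and prints what each one reports about its size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;

// Hypothetical demo class; not part of any library.
final class StreamSources {

    // Iterator only: nothing is known about how many elements will arrive.
    static <T> Spliterator<T> fromIterator(Iterator<T> it) {
        return Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
    }

    // Iterator + size: the caller supplies the element count up front.
    static <T> Spliterator<T> fromIteratorWithSize(Iterator<T> it, long size) {
        return Spliterators.spliterator(it, size, Spliterator.ORDERED);
    }

    // Summarizes the size-related information a Spliterator advertises.
    static String describe(Spliterator<?> s) {
        return "exactSize=" + s.getExactSizeIfKnown()
            + " sized=" + s.hasCharacteristics(Spliterator.SIZED)
            + " subsized=" + s.hasCharacteristics(Spliterator.SUBSIZED);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));

        // Least information: a bare Iterator.
        System.out.println(describe(fromIterator(list.iterator())));
        // Iterator plus an externally supplied size.
        System.out.println(describe(fromIteratorWithSize(list.iterator(), list.size())));
        // A data structure's own Spliterator that knows its total size.
        System.out.println(describe(new HashSet<>(list).spliterator()));
        // A Spliterator whose sub-splits also know their size.
        System.out.println(describe(list.spliterator()));
    }
}
```

getExactSizeIfKnown() returns -1 when the Spliterator does not report SIZED, so the first line printed shows the "no size information" case and the last shows the fully sized one.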

(Some may be surprised to find that we can extract parallelism even from 
a dumb iterator in cases where Q (work per element) is nontrivial.)
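
As a rough illustration of that parenthetical, the sketch below runs a parallel pipeline over a plain Iterator. It again assumes the Java 8 names as eventually released (StreamSupport.stream and Spliterators.spliteratorUnknownSize), not the API under discussion here. The expensive() method is a hypothetical stand-in for nontrivial per-element work. Whether this actually beats sequential execution depends on the core count and on how costly each element really is.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

// Hypothetical demo class; not part of any library.
final class ParallelFromIterator {

    // Stand-in for nontrivial work per element (the "Q" mentioned above).
    static long expensive(int n) {
        long acc = 0;
        for (int i = 0; i < 100_000; i++) {
            acc += (long) n * i % 7;
        }
        return acc;
    }

    // Wraps an unsized Iterator and processes it with a parallel stream.
    static List<Long> process(Iterator<Integer> source) {
        Spliterator<Integer> split =
            Spliterators.spliteratorUnknownSize(source, Spliterator.ORDERED);
        return StreamSupport.stream(split, true) // true requests a parallel stream
            .map(ParallelFromIterator::expensive)
            .collect(Collectors.toList());       // encounter order is preserved
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println(process(input.iterator()));
    }
}
```

The source is still consumed through a single Iterator, so any speedup has to come from the per-element work, which is why it only pays off when that work is substantial.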


If Iterable had a stream() method, it would just wrap an Iterator with a 
Spliterator, with no size information.  But, most things that are 
Iterable *do* have size information.  Which means we're serving up 
deficient streams.  That's not so good.


One downside of the API practice outlined by Stephen here, of accepting 
Iterable instead of Collection, is that you are forcing things through a 
"small pipe" and therefore discarding size information when it might be 
useful.  That's fine if all you're going to do is forEach it, but if you 
want to do more, it's better if you can preserve all the information you 
want.

The default provided by Iterable would be a crappy one indeed -- it 
would discard size even though the vast majority of Iterables do know 
that information.
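
A small sketch of the "small pipe" effect, assuming the Java 8 API as it eventually shipped (StreamSupport, Spliterators, Collection.stream()); the SmallPipe class and its methods are hypothetical names for illustration only. The same list is turned into a stream twice: once through an Iterable-typed parameter that can only wrap the Iterator, and once through a Collection-typed parameter.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical demo class; not part of any library.
final class SmallPipe {

    // Accepts Iterable and wraps its Iterator: no size information survives.
    static <T> Stream<T> viaIterator(Iterable<T> source) {
        Spliterator<T> split =
            Spliterators.spliteratorUnknownSize(source.iterator(), Spliterator.ORDERED);
        return StreamSupport.stream(split, false);
    }

    // Accepts Collection: the collection's own stream() carries its size.
    static <T> Stream<T> viaCollection(Collection<T> source) {
        return source.stream();
    }

    // Reports the size the stream's source advertises, or -1 if unknown.
    static long knownSize(Stream<?> stream) {
        return stream.spliterator().getExactSizeIfKnown();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("x", "y", "z");
        System.out.println("via Iterator:   " + knownSize(viaIterator(names)));
        System.out.println("via Collection: " + knownSize(viaCollection(names)));
    }
}
```

The data is identical in both calls; only the declared parameter type, and therefore the route taken to build the stream, decides whether the size is still available downstream.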



On 12/11/2012 11:18 AM, Stephen Colebourne wrote:
> The latest code in hg has no stream() method on Iterable. That feels wrong.
>
> I have on a number of occasions defined an API to accept Iterable when
> I don't care what type of collection it is. It is more friendly to
> users that way. Those APIs will have no easy way to transition to
> streams (the code you pasted is not easy).
>
> I'm assuming that the decision to not implement Streamable on Iterable
> was deliberate, just saying that it degrades the usefulness of
> Iterable (and it's an interface that the JDK has never fully embraced).
>
> Stephen
>
>
> On 11 December 2012 16:06, Brian Goetz <brian.goetz at oracle.com> wrote:
>> I do not expect this to be a common operation; clients will obtain streams
>> by calling the stream() method on a Streamable.  This technique is a
>> lower-level API for *implementors* of Streamable.  And I think you'll agree
>> that this is far, far less verbose than what you have to do to implement
>> Iterable...
>>
>> The #1 source of Iterables is Collections.  All the Collections have
>> stream() methods; their implementations are one-liners like this one.
>>
>> Actually, this is one of the weaker ways to implement a Stream; if you have
>> a real data structure, you'll probably want to implement Spliterator.  But
>> if all you have is an Iterator, we can turn it into a stream for you.
>>
>>
>> On 12/11/2012 10:27 AM, Stephen Colebourne wrote:
>>>
>>> On 11 December 2012 14:29, Brian Goetz <brian.goetz at oracle.com> wrote:
>>>>
>>>> Try this:
>>>>
>>>> Streams.stream(Streams.spliterator(iter.iterator(), sizeIfKnown),
>>>>                   flags);
>>>>
>>>> You might want ORDERED as a flag.
>>>
>>>
>>> That seems quite verbose for what seems like a common operation...
>>>
>>> Stephen
>>>
>>
>


More information about the lambda-dev mailing list