Stream, spliterator, supplier and size

Fri Dec 14 08:06:55 PST 2012

On 12/14/2012 04:37 PM, Brian Goetz wrote:
> By the way, it used to work as you suggest.  Writing spliterators for 
> collections like ArrayList was very painful, because you still had to 
> check that the collection hadn't been modified since the spliterator 
> was captured (otherwise you might be iterating the wrong array.)  
> Which meant that you had to write a proxy spliterator which did checks 
> the first time iterator() / split() / forEach() were called, before 
> passing control onto the array spliterator.  This was messy.  Now, we 
> just pass
>
>   () -> Arrays.spliterator(array, 0, size)
>
> to stream() and we're done!
>
> So the alternative formulation is actually much worse for anyone who 
> wants to make a Stream if they are wrapping it around a mutable 
> collection.
>
> Also we have stream(Spliterator) sitting right next to it so if you 
> are ready to bind immediately, you can.

It was painful because you tried to implement semantics (2) while still
taking the spliterator when the stream is created.
If you choose semantics (1) instead, there is no implementation issue.

I'm not married to either semantics, but I think which one we use should be clear.

On 12/14/2012 04:24 PM, Brian Goetz wrote:
>
> But I don't think we've exposed them to the users!  The stream() 
> methods are not for users, they are for library writers to implement 
> stream-producing methods.  If users are using them then that means 
> we've forgotten to provide something else.  (Writing Iterators is a 
> pain in the neck, much more so.  But again, most users don't write 
> iterators.)

Library writers are users. Everything we expose will be used.

>
> Your other argument about "we should bind to the data earlier" is a 
> reasonable thing to discuss from a semantic perspective, but not 
> because the API for making streams is too hard.
>

It's not that the API is too hard to use, it's that it exposes two 
different semantics.

Rémi

>
> On 12/14/2012 9:39 AM, Remi Forax wrote:
>> Brian explained in a previous mail (I'm too lazy to find it now) why there
>> are methods in Streams that take a Supplier and flags:
>>    <T> Stream<T> stream(Supplier<Spliterator<T>> supplier, int flags)
>>
>> I have trouble understanding why we need to expose two semantics to our
>> poor users.
>> I think it's better to decide whether (1) the spliterator is created
>> when collection.stream() is called
>> or (2) the spliterator is created when a terminal operation like
>> stream.forEach is called.
>>
>> This has severe implications for the way the pipeline works under the hood
>> because the pipeline ops may rely on the size of the collection, which
>> may be different if the collection is mutated between the creation of
>> the stream and the call to the terminal operation.
>>
>> cheers,
>> Rémi
>>
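
To make the difference between semantics (1) and (2) concrete, here is a
minimal sketch. It is not the code discussed in this thread: it assumes the
java.util.stream.StreamSupport API as it later shipped in Java 8 (the
stream(Supplier, flags) method quoted above predates it), and TinyList,
earlyStream, lateStream and countsAfterMutation are hypothetical names
invented for the illustration.

```java
import java.util.Arrays;
import java.util.Spliterator;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class BindingDemo {
    /** Hypothetical minimal array-backed list, standing in for ArrayList. */
    static final class TinyList {
        private Object[] array = new Object[4];
        private int size;

        void add(Object o) {
            if (size == array.length) {
                array = Arrays.copyOf(array, size * 2);
            }
            array[size++] = o;
        }

        // Semantics (1): the spliterator (and so array and size) is captured
        // when the stream is created.
        Stream<Object> earlyStream() {
            return StreamSupport.stream(Arrays.spliterator(array, 0, size), false);
        }

        // Semantics (2): the supplier is only invoked when the terminal
        // operation starts, so array and size are read at that point.
        Stream<Object> lateStream() {
            // Characteristics are assumed to match what Arrays.spliterator reports.
            int characteristics = Spliterator.ORDERED | Spliterator.IMMUTABLE
                    | Spliterator.SIZED | Spliterator.SUBSIZED;
            return StreamSupport.stream(
                    () -> Arrays.spliterator(array, 0, size), characteristics, false);
        }
    }

    /** Creates both streams, mutates the list, then runs the terminal operations. */
    static long[] countsAfterMutation() {
        TinyList list = new TinyList();
        list.add("a");
        list.add("b");
        Stream<Object> early = list.earlyStream();
        Stream<Object> late = list.lateStream();
        list.add("c"); // mutation between stream creation and terminal operation
        return new long[] { early.count(), late.count() };
    }

    public static void main(String[] args) {
        long[] counts = countsAfterMutation();
        System.out.println("early=" + counts[0] + " late=" + counts[1]);
    }
}
```

In this sketch the early-bound stream still sees the two elements that
existed when it was created, while the late-bound one sees all three, which
is the size discrepancy described above.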