To Stream.slice(fromInclusive, toExclusive) or Stream.slice(toSkip, limit) that is the question

Joe Bowbeer joe.bowbeer at gmail.com
Fri Oct 11 10:31:19 PDT 2013


Concerning the index requirement of these operations,

I think skip is implicitly indexed (start is an index), whereas limit is
not.

So any operation fused with skip is also implicitly indexed, right?

If you have a non-indexed stream, why would you ever
skip(start).limit(count) instead of simply limit(count)?

To be clear where I stand: I think Java would benefit from slice(start,
end).
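
For concreteness, here is a minimal sketch of the two spellings under
discussion. The slice helper below is hypothetical (it is not part of
java.util.stream); it assumes start:end semantics and is written in terms
of the existing skip and limit operations, so it shows only the index
arithmetic, not any fused implementation.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class SliceSketch {

    // Hypothetical helper: start:end slicing expressed as skip + limit.
    // fromInclusive is the index of the first element kept; toExclusive is
    // the index one past the last element kept.
    public static <T> Stream<T> slice(Stream<T> stream,
                                      long fromInclusive,
                                      long toExclusive) {
        if (fromInclusive < 0 || toExclusive < fromInclusive) {
            throw new IllegalArgumentException(
                "require 0 <= fromInclusive <= toExclusive");
        }
        // limit takes a count, so convert the end index to a length.
        return stream.skip(fromInclusive).limit(toExclusive - fromInclusive);
    }

    public static void main(String[] args) {
        // start:end form, indices 3 (inclusive) to 7 (exclusive).
        List<Integer> viaSlice =
            slice(IntStream.range(0, 10).boxed(), 3, 7)
                .collect(Collectors.toList());

        // skip/limit form: skip 3 elements, then keep 4.
        List<Integer> viaSkipLimit =
            IntStream.range(0, 10).boxed()
                .skip(3)
                .limit(4)
                .collect(Collectors.toList());

        System.out.println(viaSlice);      // [3, 4, 5, 6]
        System.out.println(viaSkipLimit);  // [3, 4, 5, 6]
    }
}
```

The second argument is the only difference: slice(3, 7) names an end index,
while skip(3).limit(4) names a count.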


On Fri, Oct 11, 2013 at 8:48 AM, Joe Bowbeer <joe.bowbeer at gmail.com> wrote:

> slice *is* start:end
>
> The comparison with all the other slices can't be avoided.
>
> If we want skipAndLimit, then let's add one of those.
>
> Or not.
> On Oct 11, 2013 7:54 AM, "Brian Goetz" <brian.goetz at oracle.com> wrote:
>
>> Further argument for dropping: while we don't do implicit op fusion now
>> (an earlier version had this, but we dropped it as it was getting in the
>> way of some more valuable things), now that the implementation has
>> stabilized, it looks like the current implementation is pretty amenable to
>> pairwise op fusion.  So it seems likely there's a path to having the
>> implementation just turn skip+limit into the more efficient version.
>>
>> On 10/11/2013 10:31 AM, Doug Lea wrote:
>>
>>> On 10/11/2013 10:18 AM, Brian Goetz wrote:
>>>
>>>> Several have pointed out that it is surprising that .skip(n).limit(k) is
>>>> inefficient enough to want a fusing.  Of course, in the sequential case,
>>>> it's fine.  But because skip/limit are constrained to operate in encounter
>>>> order, in the worst case (non-SIZED+SUBSIZED, non-UNORDERED), we have to
>>>> buffer.  Doing two rounds of buffering would suck twice -- and this was
>>>> the primary motivation for a fused operation.
>>>>
>>>> So I think there are two sensible choices here:
>>>>
>>>>   - slice(toSkip, toLimit)
>>>>   - drop slice entirely
>>>>
>>>> The cost of the latter is that those who need it in parallel in the
>>>> unpleasant cases are even more likely to have to retreat to sequential.
>>>>
>>>>
>>> In other words: If running sequentially, skip(n).limit(k) costs almost
>>> nothing more than slice. And the same nearly holds for cases where
>>> people do an intrinsically sequential operation in parallel
>>> (i.e., using either skip or limit on ordered data) --
>>> you get about twice the terrible cost of either skip or limit.
>>>
>>> Seems like a fine idea to just drop it.
>>>
>>> -Doug
>>>
>>>
>>>


More information about the lambda-libs-spec-experts mailing list