To Stream.slice(fromInclusive, toExclusive) or Stream.slice(toSkip, limit) that is the question

Mon Oct 14 03:15:15 PDT 2013

Pagination in the form that you've stated is strongly discouraged in the
RDBMS world.

First, the expression

input.skip((pageNo-1)*pageSize).limit(pageSize)

will almost always be in the form

input.sorted(Comparator.comparing(Element::getKey))
    .skip((pageNo-1)*pageSize)
    .limit(pageSize)

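And second, the above form is discouraged too, since every skipped element
still has to be produced and thrown away. The following form, which
continues after the last key of the previous page, should be used instead: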

input.sorted(Comparator.comparing(Element::getKey))
    .filter(e -> e.getKey() > lastElementOnPreviousPage.getKey())
    .limit(pageSize)
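
To make the difference concrete, here is a minimal sketch of both forms side
by side. The Element class, the elements helper and the two method names are
made up for illustration, and the keys are assumed to be unique ints; with
duplicate keys the "greater than last key" filter would skip elements.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class KeysetPaging {
    // Hypothetical element type; only the (assumed unique) int key matters here.
    static final class Element {
        private final int key;
        Element(int key) { this.key = key; }
        int getKey() { return key; }
    }

    // Illustration-only helper that builds a list of elements from keys.
    static List<Element> elements(int... keys) {
        return IntStream.of(keys).mapToObj(Element::new).collect(Collectors.toList());
    }

    // Offset form: sort, then skip every element of the preceding pages.
    static List<Integer> pageByOffset(List<Element> input, int pageNo, int pageSize) {
        return input.stream()
                .sorted(Comparator.comparing(Element::getKey))
                .skip((long) (pageNo - 1) * pageSize)
                .limit(pageSize)
                .map(Element::getKey)
                .collect(Collectors.toList());
    }

    // Keyset form: sort, then continue after the last key of the previous page.
    static List<Integer> pageAfterKey(List<Element> input, int lastKey, int pageSize) {
        return input.stream()
                .sorted(Comparator.comparing(Element::getKey))
                .filter(e -> e.getKey() > lastKey)
                .limit(pageSize)
                .map(Element::getKey)
                .collect(Collectors.toList());
    }
}
```

The caller of the keyset form has to remember the last key it displayed
instead of a page number, which is the price of not re-walking earlier pages.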

I think the filter-based expression can be made more efficient by
reordering the operations in Stream's case:

input.filter(e -> e.getKey() > lastElementOnPreviousPage.getKey())
    .sortedAndLimited(Comparator.comparing(Element::getKey), pageSize)

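The filter can be parallelized, and a fused sort-and-limit operation can be
made efficient enough, because it only has to retain pageSize elements
instead of sorting everything that passes the filter.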
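
Note that sortedAndLimited is not an existing Stream method. One way such a
fused operation might look, as a rough sketch only: a static helper (the
class and method names are my own) that keeps the best `limit` elements in a
bounded heap through the three-argument Stream.collect, so an upstream
parallel filter stays parallel. Which of several equal elements survives is
not specified here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.stream.Stream;

class TopK {
    // Hypothetical stand-in for the proposed sortedAndLimited operation;
    // it is not part of java.util.stream.
    static <T> List<T> sortedAndLimited(Stream<T> input, Comparator<? super T> order, int limit) {
        if (limit <= 0) {
            return new ArrayList<>();
        }
        PriorityQueue<T> heap = input.collect(
                // Max-heap with respect to 'order': the head is the worst retained element.
                () -> new PriorityQueue<T>(limit, order.reversed()),
                (h, e) -> offer(h, e, order, limit),
                // Merge partial heaps produced by parallel execution.
                (h1, h2) -> h2.forEach(e -> offer(h1, e, order, limit)));
        List<T> page = new ArrayList<>(heap);
        page.sort(order); // only 'limit' elements are ever fully sorted
        return page;
    }

    // Keep 'e' only if the heap has room or 'e' beats the current worst element.
    private static <T> void offer(PriorityQueue<T> heap, T e, Comparator<? super T> order, int limit) {
        if (heap.size() < limit) {
            heap.add(e);
        } else if (order.compare(e, heap.peek()) < 0) {
            heap.poll();
            heap.add(e);
        }
    }
}
```

With n elements passing the filter and a page of k, this does O(n log k)
comparisons and holds k elements per partial heap, where sorted().limit(k)
sorts all n first.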

--
Victor Nazarov

On Fri, Oct 11, 2013 at 11:10 PM, David M. Lloyd <david.lloyd at redhat.com> wrote:

> On 10/11/2013 12:34 PM, Brian Goetz wrote:
>
>> If you have a non-indexed stream, why would you ever
>>> skip(start).limit(count) instead of simply limit(count)?
>>>
>>
>> Here's one use case: Paging.  You've got a stream of results, and you
>> want to display them by pages.
>>
>>    Stream results = input.skip((pageNo-1)*pageSize).limit(pageSize);
>>
>
> A very similar problem (and solution) exists in the RDBMS world; though
> (IIRC) there is no standardized mechanism, just about every major SQL RDBMS
> supports a (conceptually) very similar construct for choosing a subset of
> results, for very similar use cases (AFAICT).  Take that for what it's
> worth.
>
>
> --
> - DML
>