Stream API: Fuse sorted().limit(n) into single operation

Louis Wasserman lowasser at
Sat Mar 5 18:30:26 UTC 2016

Worth noting: Guava uses a similar implementation for Ordering.leastOf
but instead of sorting the array when it's filled, does a quickselect pass
to do it in O(k) time instead of O(k log k).

We had been planning to put together a Collector implementation for it,
since it's actually pretty amenable to Collectorification (clearly a word).

On Sat, Mar 5, 2016 at 12:32 PM Tagir F. Valeev <amaembo at> wrote:

> Hello!
> One of the popular bulk data operation is to find given number of
> least or greatest elements. Currently Stream API provides no dedicated
> operation to do this. Of course, it could be implemented by custom
> collector and some third-party libraries already provide it. However
> it would be quite natural to use existing API:
> stream.sorted().limit(k) - k least elements
> stream.sorted(Comparator.reverseOrder()).limit(k) - k greatest elements.
> In fact people already doing this. Some samples could be found on
> GitHub:
> Unfortunately current implementation of such sequence of operations is
> suboptimal: first the whole stream content is dumped into intermediate
> array, then sorted fully and after that k least elements is selected.
> On the other hand it's possible to provide a special implementation
> for this particular case which takes O(k) additional memory and in
> many cases works significantly faster.
> I wrote proof-of-concept implementation, which could be found here:
> The implementation switches to new algorithm if limit is less than
> 1000 which is quite common for such scenario (supporting bigger values
> is also possible, but would require more testing). New algorithm
> allocates an array of 2*limit elements. When its size is reached, it
> sorts the array (using Arrays.sort) and discards the second half.
> After that only those elements are accumulated which are less than the
> worst element found so far. When array is filled again, the second
> half is sorted and merged with the first half.
> Here's JMH test with results which covers several input patterns:
> You may check summary first:
> Speedup values bigger than 1 are good.
> The most significant regression in the sequential mode of the new
> implementation is the ever decreasing input (especially with the low
> limit value). Still, it's not that bad (given the fact that old
> implementation processes such input very fast). On the other hand, for
> random input new implementation could be in order of magnitude faster.
> Even for ever ascending input noteable speedup (like 40%) could be
> achieved.
> For parallel stream the new implementation is almost always faster,
> especially if you ignore the cases when parallel stream is
> unprofitable.
> What do you think about this improvement? Could it be included into
> JDK-9? Are there any issues I'm unaware of? I would be really happy to
> complete this work if this is supported by JDK team. Current
> implementation has no primitive specialization and does not optimize
> the sorting out if the input is known to be sorted, but it's not very
> hard to add these features as well if you find my idea useful.
> With best regards,
> Tagir Valeev.

More information about the core-libs-dev mailing list