Additional method on Stream


Stephen Colebourne

27 Apr 2015

1:22 p.m.

This is a request for an additional method on java.util.stream.Stream.

Having written quite a bit of code in Java 8 now, one annoyance keeps on coming up, filtering and casting the stream.

Currently, it is necessary to do this:

    return input.stream()
        .filter(Foo.class::isInstance)
        .map(Foo.class::cast)
        .someTerminalOperation();

or

    return input.stream()
        .filter(t -> t instanceof Foo)
        .map(t -> (Foo) t)
        .someTerminalOperation();

Really, what is needed is this:

    return input.stream()
        .filter(Foo.class)
        .someTerminalOperation();

For info, Guava's FluentIterable has such a method.

The new method signature would be something like:

    public Stream<R> filter(Class<R> cls);

As far as I can see, there is no problem in implementing this in both serial and parallel modes, as it is essentially just a convenience.

Thoughts?
Stephen
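For readers who want the requested behaviour today, the filter-then-cast idiom from the message can be wrapped once in a static helper. This is a minimal sketch, not a JDK API: the class name TypeFilters and the method name filterByType are hypothetical. Note that a generic method has to declare its type parameter, so a real signature would read `<R> Stream<R> filter(Class<R> cls)`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical utility class; not part of the JDK or of the original proposal.
final class TypeFilters {
    private TypeFilters() {}

    // Keeps only the elements that are instances of cls and returns them
    // typed as Stream<R>. Equivalent to filter(cls::isInstance).map(cls::cast).
    static <R> Stream<R> filterByType(Stream<?> input, Class<R> cls) {
        return input.filter(cls::isInstance).map(cls::cast);
    }

    public static void main(String[] args) {
        List<Object> mixed = Arrays.asList(1, "two", 3, "four");
        List<String> strings = filterByType(mixed.stream(), String.class)
                .collect(Collectors.toList());
        System.out.println(strings); // prints [two, four]
    }
}
```

The helper cannot be chained fluently in the middle of a pipeline, which is the ergonomic gap the request is about.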


Paul Sandoz

27 Apr

1:41 p.m.

Hi Stephen,

You can do this:

    static <T> Function<Object, Stream<T>> casting(Class<T> c) { // bike shed for clearer name
        return o -> Stream.ofNullable(c.isInstance(o) ? c.cast(o) : null);
    }

    Object[] s = Stream.of(1, 2, "3", 4).toArray();
    Stream.of(s).flatMap(casting(Integer.class)).
        forEach(System.out::println);

I am a bit reluctant to add such a specific kind of filter method to Stream when one can do the above.

In general my preference is to keep the stream operation methods as general as possible.

Paul.

On Apr 27, 2015, at 3:22 PM, Stephen Colebourne <scolebourne@joda.org> wrote:

...

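Stream.ofNullable, used in the casting function of the reply above, only became available in Java 9. A minimal sketch of the same flatMap pattern that also compiles on Java 8 is shown here; the class name Casting is an assumption made for illustration.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical holder class for the helper.
final class Casting {
    private Casting() {}

    // Java 8 compatible variant: a one-element stream when the object is an
    // instance of c, otherwise an empty stream that flatMap simply drops.
    static <T> Function<Object, Stream<T>> casting(Class<T> c) {
        return o -> c.isInstance(o) ? Stream.of(c.cast(o)) : Stream.empty();
    }

    public static void main(String[] args) {
        Object[] s = Stream.of(1, 2, "3", 4).toArray();
        List<Integer> ints = Stream.of(s)
                .flatMap(casting(Integer.class))
                .collect(Collectors.toList());
        System.out.println(ints); // prints [1, 2, 4]
    }
}
```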

Stephen Colebourne

2:56 p.m.

Obviously, this is yet another possible workaround. But it is a workaround.

There really aren't that many rough edges with the set of methods added with lambdas, but this is definitely one. That Guava handled it specially is another good indication.

BTW, I waited months before making this request to see if it really was a common enough pattern, but I'm confident that it is now.

Stephen

On 27 April 2015 at 14:41, Paul Sandoz <paul.sandoz@oracle.com> wrote:

...


Paul Sandoz

3:23 p.m.

On Apr 27, 2015, at 4:56 PM, Stephen Colebourne <scolebourne@joda.org> wrote:

> Obviously, this is yet another possible workaround. But it is a workaround.

I don't consider it "just a workaround" :-)

> There really aren't that many rough edges with the set of methods added with lambdas, but this is definitely one. That Guava handled it specially is another good indication.

'Tis conjecture, but perhaps it might have been different in a post-lambda world?

One issue is there are zillions of possible more specific convenience operations we could add. Everyone has their own favourite. Some static methods were recently added to Stream and Optional in preference to such operations.

There has to be a really good reason to add new operations. I realize this use-case might be more common than others but I am still yet to be convinced that it has sufficient weight given flatMap + lambda + static method.

> BTW, I waited months before making this request to see if it really was a common enough pattern, but I'm confident that it is now.

Were you aware of the pattern using flatMap during those months?

Paul.

Peter Levart

7:04 p.m.

On 04/27/2015 05:23 PM, Paul Sandoz wrote:

> On Apr 27, 2015, at 4:56 PM, Stephen Colebourne <scolebourne@joda.org> wrote:
>
>> Obviously, this is yet another possible workaround. But it is a workaround.
>
> I don't consider it "just a workaround" :-)
>
>> There really aren't that many rough edges with the set of methods added with lambdas, but this is definitely one. That Guava handled it specially is another good indication.
>
> Tis conjecture, but perhaps it might have been different in post-lambda world?
>
> One issue is there are zillions of possible more specific convenience operations we could add. Everyone has their own favourite. Some static methods were recently added to Stream and Optional in preference to such operations.
>
> There has to be a really good reason to add new operations. I realize this use-case might be more common than others but i am still yet to be convinced that it has sufficient weight given flatMap + lambda + static method.

One reason might be that the workaround creates at least two new objects per included element of the stream, plus the overhead involved in executing the flat-map logic.

A more general operation might be something like the following:

    /**
     * Returns a stream consisting of the non-null results of applying the given
     * function to the elements of this stream.
     */
    <R> Stream<R> filterMap(Function<? super T, ? extends R> mapper);

Stephen's example would then read:

    return input.stream()
        .filterMap(t -> t instanceof Foo ? (Foo) t : null)
        .someTerminalOperation();

Combining filtering and mapping in one operation might often be desirable to avoid duplicate work (for example when filtering and mapping need to compute some common but costly intermediate result for each element). flatMap is admittedly suitable for that too, but has its overhead. At what per-operation cost this overhead pays off can be seen at the end...

I know that null values were a controversial topic when this API was being designed and that the decision was made to basically "ignore" their presence in stream elements. So making null part of the API contract might be out of the question, right?

So what about Optional? Could it be used to make flatMap a little more efficient for the combined filter/map case? For example, could the following composition be written in a more concise way?

    input.stream()
        .map(t -> t instanceof Foo ? Optional.of((Foo) t) : Optional.empty())
        .filter(Optional::isPresent)
        .map(Optional::get)

Maybe with an operation like:

    /**
     * Returns a stream consisting of the "present" unwrapped results of applying the given
     * function to the elements of this stream.
     */
    <R> Stream<R> mapOptionally(Function<? super T, Optional<? extends R>> mapper);

But that's not what Stephen would like to see, and I personally don't mind being a little more verbose if it makes code execute faster. I would be pretty confident writing the following:

    input.stream()
        .map(t -> t instanceof Foo ? (Foo) t : null)
        .filter(f -> f != null)

To quantify the overheads involved with various approaches, I created a little benchmark that shows the following results:

    Benchmark                        (opCost)  Mode  Samples   Score  Score error  Units
    j.t.StreamBench.filterThenMap           0  avgt       10   1.186        0.010  ms/op
    j.t.StreamBench.filterThenMap          10  avgt       10   2.642        0.205  ms/op
    j.t.StreamBench.filterThenMap          20  avgt       10   5.254        0.011  ms/op
    j.t.StreamBench.filterThenMap          30  avgt       10   8.187        0.165  ms/op
    j.t.StreamBench.filterThenMap          40  avgt       10  11.525        0.295  ms/op
    j.t.StreamBench.flatMap                 0  avgt       10   2.015        0.188  ms/op
    j.t.StreamBench.flatMap                10  avgt       10   3.287        0.224  ms/op
    j.t.StreamBench.flatMap                20  avgt       10   5.275        0.638  ms/op
    j.t.StreamBench.flatMap                30  avgt       10   7.033        0.209  ms/op
    j.t.StreamBench.flatMap                40  avgt       10   9.146        0.281  ms/op
    j.t.StreamBench.mapToNullable           0  avgt       10   1.185        0.006  ms/op
    j.t.StreamBench.mapToNullable          10  avgt       10   2.120        0.392  ms/op
    j.t.StreamBench.mapToNullable          20  avgt       10   3.677        0.210  ms/op
    j.t.StreamBench.mapToNullable          30  avgt       10   5.526        0.126  ms/op
    j.t.StreamBench.mapToNullable          40  avgt       10   7.884        0.202  ms/op
    j.t.StreamBench.mapToOptional           0  avgt       10   1.144        0.121  ms/op
    j.t.StreamBench.mapToOptional          10  avgt       10   2.322        0.146  ms/op
    j.t.StreamBench.mapToOptional          20  avgt       10   4.371        0.270  ms/op
    j.t.StreamBench.mapToOptional          30  avgt       10   6.215        0.536  ms/op
    j.t.StreamBench.mapToOptional          40  avgt       10   8.471        0.554  ms/op

Comparing .filter(op).map(op) with .flatMap(op) where each operation has its cost, we see there is a tipping point at opCost=20 where flatMap() starts to pay off if we can merge the two ops into one with equal cost. But we can also see that flatMap has its cost too, compared to the other two approaches (mapToNullable/mapToOptional), which is most obvious when the operation cost is low.

So the conclusion? No, I don't think we need a new Stream method. I just wanted to show that flatMap() is maybe the most universal but not always the best (fastest) answer for each problem.

Regards, Peter

P.S. The benchmark source:

    package jdk.test;

    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.infra.Blackhole;

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Optional;
    import java.util.concurrent.ThreadLocalRandom;
    import java.util.concurrent.TimeUnit;
    import java.util.function.Function;
    import java.util.function.Predicate;
    import java.util.stream.Stream;

    /**
     * Created by peter on 4/27/15.
     */
    @BenchmarkMode(Mode.AverageTime)
    @Fork(value = 1, warmups = 0)
    @Warmup(iterations = 5)
    @Measurement(iterations = 10)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    @State(Scope.Benchmark)
    public class StreamBench {

        @Param({"0", "10", "20", "30", "40"})
        public long opCost;

        List<Object> objects;

        @Setup
        public void setup() {
            objects = new ArrayList<>(100000);
            ThreadLocalRandom tlr = ThreadLocalRandom.current();
            for (int i = 0; i < 100000; i++) {
                objects.add(tlr.nextBoolean() ? "123" : 123);
            }
        }

        <F, T> Function<F, T> withCost(Function<F, T> function) {
            return f -> {
                Blackhole.consumeCPU(opCost);
                return function.apply(f);
            };
        }

        <T> Predicate<T> withCost(Predicate<T> predicate) {
            return t -> {
                Blackhole.consumeCPU(opCost);
                return predicate.test(t);
            };
        }

        @Benchmark
        public long filterThenMap() {
            return objects.stream()
                .filter(withCost((Object o) -> o instanceof String))
                .map(withCost((Object o) -> (String) o))
                .count();
        }

        @Benchmark
        public long flatMap() {
            return objects.stream()
                .flatMap(withCost((Object o) -> o instanceof String ? Stream.of((String) o) : Stream.empty()))
                .count();
        }

        @Benchmark
        public long mapToOptional() {
            return objects.stream()
                .map(withCost((Object o) -> o instanceof String ? Optional.of((String) o) : Optional.empty()))
                .filter(Optional::isPresent)
                .map(Optional::get)
                .count();
        }

        @Benchmark
        public long mapToNullable() {
            return objects.stream()
                .map(withCost((Object o) -> o instanceof String ? (String) o : null))
                .filter(s -> s != null)
                .count();
        }
    }

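The filterMap operation in Peter's message is a proposal, not an existing Stream method. Its described contract (keep the non-null results of the mapper) can be approximated with a static helper built on map and filter, which is essentially the mapToNullable variant of the benchmark. The names StreamOps and filterMap below are hypothetical, and the sketch makes no claim about matching the performance of a built-in operation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical utility class; filterMap is not a JDK method.
final class StreamOps {
    private StreamOps() {}

    // Applies mapper to every element and keeps only the non-null results.
    static <T, R> Stream<R> filterMap(Stream<T> stream,
                                      Function<? super T, ? extends R> mapper) {
        return stream.<R>map(mapper).filter(Objects::nonNull);
    }

    public static void main(String[] args) {
        List<Object> input = Arrays.asList("a", 1, "b", 2);
        List<String> strings = filterMap(input.stream(),
                o -> o instanceof String ? (String) o : null)
                .collect(Collectors.toList());
        System.out.println(strings); // prints [a, b]
    }
}
```

Because null is the "skip this element" signal, the helper cannot be used when null is a legitimate mapped value, which is the API-contract concern raised in the message.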

Paul Sandoz

28 Apr

10:57 a.m.

Hi Peter,

You are correct in stating that flatMap has some overhead.

There are optimizations in place for operating on one element and on the head of the stream that reduce the overhead.

Escape analysis sometimes works, it would be nice if that were more reliable, and of course similar things apply to creation of Optional instances. I hope that will improve post 9. There are also much more important performance issues, especially related to profile pollution, that we may need to solve via loop specialization.

Ideally I really want to avoid explicit fusing of operations in the API for performance reasons. It should be up to the implementation/compiler to work that out. It's helpful to quantify, as you have done, for the performance-sensitive among us.

We did ponder those methods you suggested. Note that the primitive variants need to be considered too. The first, filterMap, is not value type friendly. Rather than adding the second, mapOptionally, we added Optional.stream to be used with flatMap :-)

Paul.
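The combination mentioned at the end of this message, Optional.stream with flatMap, is available from Java 9 onward. A minimal sketch of how it stands in for the proposed mapOptionally follows; the class name OptionalFlatMap and the helper asString are assumptions made for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical example class; requires Java 9 or later for Optional.stream().
final class OptionalFlatMap {
    private OptionalFlatMap() {}

    // Hypothetical mapper: present only when the element is a String.
    static Optional<String> asString(Object o) {
        return o instanceof String ? Optional.of((String) o) : Optional.empty();
    }

    // Map each element to an Optional, then let flatMap(Optional::stream)
    // unwrap the present values and drop the empty ones.
    static List<String> strings(List<?> input) {
        return input.stream()
                .map(OptionalFlatMap::asString)
                .flatMap(Optional::stream)
                .collect(Collectors.toList());
    }
}
```

Compared with the filter(Optional::isPresent).map(Optional::get) composition, this needs one operation fewer, at the price of the flatMap overhead discussed in the thread.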

Paul Sandoz

12:30 p.m.

New subject: flatMap performance <was> Re: Additional method on Stream

On Apr 28, 2015, at 12:57 PM, Paul Sandoz <Paul.Sandoz@oracle.com> wrote:

> Hi Peter,
>
> You are correct in stating that flatMap has some overhead.
>
> There are optimizations in place for operating on one element and on the head of the stream that reduce the overhead.

I believe at least in the micro-benchmark cases I can reduce the flatMap time by about 60% with special stream impls for none and one element, rather than leveraging the stream builder directly. They are not that difficult if one uses a pattern like the following:

    static final class StreamOfOne<T> implements Stream<T> {
        boolean consumed;
        final T t;

        public StreamOfOne(T t) {
            this.t = t;
        }

        void consumed() {
            if (consumed) throw new IllegalStateException();
            consumed = true;
        }

        Stream<T> fork() {
            consumed();
            return StreamSupport.stream(new Streams.StreamBuilderImpl<>(t), false);
        }

        ...

        @Override
        public Stream<T> filter(Predicate<? super T> predicate) {
            return fork().filter(predicate);
        }

        ...

        @Override
        public void forEach(Consumer<? super T> action) {
            consumed();
            action.accept(t);
        }

        ...
    }

There is a cost if intermediate operations and certain terminal operations (like collect) are invoked (that cost can be reduced by merging in a Spliterator implementation). For the common cases of simpler terminal operations, with flatMap, it's a win.

So I am pondering adding such implementations.

Also, FWIW, returning null rather than Stream.empty() is slightly faster. The latter results in more profiling effects.

Looking at generated code there is still a lot of "ceremony" that one would think HotSpot would just be able to do away with given the temporary Stream objects (me of course not understanding the intricate details of the C2 compiler).

Paul.

Stephen Colebourne

9:18 a.m.

On 27 April 2015 at 16:23, Paul Sandoz <paul.sandoz@oracle.com> wrote:

> One issue is there are zillions of possible more specific convenience operations we could add. Everyone has their own favourite. Some static methods were recently added to Stream and Optional in preference to such operations.
>
> There has to be a really good reason to add new operations. I realize this use-case might be more common than others but i am still yet to be convinced that it has sufficient weight given flatMap + lambda + static method.
>
>> BTW, I waited months before making this request to see if it really was a common enough pattern, but I'm confident that it is now.
>
> Were you aware of the pattern using flatMap during those months?

No, but if I had, I would not have used it. It's a rubbish workaround. It creates extra objects for no value. It involves statically importing another random utility. And it is less expressive: the desire is to filter by type, so I want the filter method.

I do understand the desire to control methods, but API design isn't just about minimalism, it is also about meeting common use cases in a natural way. The parallel is of course a standard if (obj instanceof Foo) statement in Java, where developers often curse that they have to additionally cast obj after the check. Why can't the compiler just do it? (Kotlin does, for example.) While changing the Java language to do this is hard, changing the Stream API would be easy, and good value given a year's coding with streams.

Stephen

Paul Sandoz

1:05 p.m.

On Apr 28, 2015, at 11:18 AM, Stephen Colebourne <scolebourne@joda.org> wrote:

> On 27 April 2015 at 16:23, Paul Sandoz <paul.sandoz@oracle.com> wrote:
>
>> One issue is there are zillions of possible more specific convenience operations we could add. Everyone has their own favourite. Some static methods were recently added to Stream and Optional in preference to such operations.
>>
>> There has to be a really good reason to add new operations. I realize this use-case might be more common than others but i am still yet to be convinced that it has sufficient weight given flatMap + lambda + static method.
>>
>>> BTW, I waited months before making this request to see if it really was a common enough pattern, but I'm confident that it is now.
>>
>> Were you aware of the pattern using flatMap during those months?
>
> No, but if I had, I would not have used it. It's a rubbish workaround. It creates extra objects for no value. It involves statically importing another random utility. And it is less expressive: the desire is to filter by type, so I want the filter method.

It's not just a filter method, it's a specific, fused filter and map method.

> I do understand the desire to control methods, but API design isn't just about minimalism, it is also about meeting common use cases in a natural way. The parallel is of course a standard if (obj instanceof Foo) statement in Java, where developers often curse that they have to additionally cast obj after the check.

I believe Ceylon might also do contextual type narrowing. I dunno how easy it would be to modify the Java language and javac to do similar things. A separate discussion to be had...

A library analogue could be:

    Optional<T> Class.castIf(Object obj)

At the risk of increasing your wrath about flatMap, you could then do:

    s.flatMap(e -> Foo.castIf(e).stream())

Paul.

> Why can't the compiler just do it? (Kotlin does, for example.) While changing the Java language to do this is hard, changing the Stream API would be easy, and good value given a year's coding with streams.
>
> Stephen
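Class.castIf as floated in this reply is a suggestion only; java.lang.Class has no such method. A minimal sketch of the same idea as a static helper, assuming the hypothetical names Casts, castIf and onlyInstancesOf, and Java 9 or later for Optional.stream:

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical utility class standing in for the suggested Class.castIf.
final class Casts {
    private Casts() {}

    // Returns the object cast to T when it is an instance of type,
    // otherwise an empty Optional (also for null input).
    static <T> Optional<T> castIf(Class<T> type, Object obj) {
        return type.isInstance(obj) ? Optional.of(type.cast(obj)) : Optional.empty();
    }

    // The flatMap usage from the message, written against the helper.
    static <T> List<T> onlyInstancesOf(Class<T> type, Stream<?> s) {
        return s.flatMap(e -> castIf(type, e).stream())
                .collect(Collectors.toList());
    }
}
```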

Remi Forax

2 May

2:41 p.m.

On 04/28/2015 03:05 PM, Paul Sandoz wrote: [...]

>> I do understand the desire to control methods, but API design isn't just about minimalism, it is also about meeting common use cases in a natural way. The parallel is of course a standard if (obj instanceof Foo) statement in Java, where developers often curse that they have to additionally cast obj after the check.
>
> I believe Ceylon might also do contextual type narrowing. I dunno how easy it would to modify the Java language and javac to do similar things. A separate discussion to be had...

I think this was discussed on the coin project mailing list when talking about the precise re-throw of exceptions. The main issue is that a more precise type will select a different overload than the one that javac currently selects:

    class A {}
    class B extends A {}

    class Bar {
        void foo(A a) { ... }
        void foo(B b) { ... }
    }

Let's suppose we have this code:

    A a = ...
    if (a instanceof B) {
        foo(a);
    }

Without narrowing, it will call foo(A); with narrowing, it will call foo(B).

Rémi
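The overload behaviour Rémi describes can be observed directly. The following is a small runnable sketch using the class names from his message; the wrapper class OverloadDemo and the String return values are additions for illustration.

```java
// Demonstrates that overload resolution uses the static (compile-time) type.
final class OverloadDemo {
    private OverloadDemo() {}

    static class A {}
    static class B extends A {}

    static String foo(A a) { return "foo(A)"; }
    static String foo(B b) { return "foo(B)"; }

    public static void main(String[] args) {
        A a = new B();
        if (a instanceof B) {
            // The variable is declared as A, so javac binds this call to foo(A)
            // even though the instanceof check has just succeeded.
            System.out.println(foo(a));     // prints foo(A)
            // An explicit cast changes the static type and selects foo(B).
            System.out.println(foo((B) a)); // prints foo(B)
        }
    }
}
```

Automatic narrowing inside the if block would silently turn the first call into the second, which is the compatibility problem raised above.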

Kasper Nielsen

27 Apr

8:34 p.m.

On Mon, Apr 27, 2015 at 3:22 PM, Stephen Colebourne <scolebourne@joda.org> wrote:

...


I second this suggestion. I have thought about suggesting this for a long time as well, as it really bothers me that the method is missing. I have been using a pre-8.0 stream-like library, and in many situations this method has really saved some ugly boilerplate code.

The other default function I would like to see is stream.toList() (I can live with collectToList), which is short for s.collect(Collectors.toList()). 50% of my terminal functions are s.collect(Collectors.toList()). And I took a look at libraries on GitHub that use the Stream interface, and it is roughly the same usage.

- Kasper

Paul Sandoz

28 Apr

7:16 a.m.

On Apr 27, 2015, at 10:34 PM, Kasper Nielsen <kasperni@gmail.com> wrote:

> The other default function I would like to see is stream.toList() (I can live with collectToList) which is short for s.collect(Collectors.toList()). 50 % of my terminal functions are s.collect(Collectors.toList()).

Can you live with a static import and:

    s.collect(toList())

? which is rather close to "collectToList".

When designing j.u.s.Stream we made a conscious decision to not bind it to j.u collection types. A Stream could be integrated with other forms of collections (e.g. GS Collections).

Paul.

> And I took a look at libraries that uses the Stream interface GitHub and it is roughly the same usage.
>
> - Kasper
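The static-import form suggested in this reply looks as follows in a complete compilation unit. This is a minimal sketch; the class ToListDemo and the method upper are hypothetical names used only to give the call a context.

```java
import static java.util.stream.Collectors.toList;

import java.util.List;
import java.util.stream.Stream;

// Hypothetical example class showing s.collect(toList()) with a static import.
final class ToListDemo {
    private ToListDemo() {}

    // The static import lets the terminal operation read collect(toList())
    // instead of collect(Collectors.toList()).
    static List<String> upper(Stream<String> s) {
        return s.map(String::toUpperCase).collect(toList());
    }
}
```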

Kasper Nielsen

8:22 a.m.

On Tue, Apr 28, 2015 at 9:16 AM, Paul Sandoz <paul.sandoz@oracle.com> wrote:

> On Apr 27, 2015, at 10:34 PM, Kasper Nielsen <kasperni@gmail.com> wrote:
>
>> The other default function I would like to see is stream.toList() (I can live with collectToList) which is short for s.collect(Collectors.toList()). 50 % of my terminal functions are s.collect(Collectors.toList()).
>
> Can you live with a static import and:
>
>     s.collect(toList())
>
> ? which is rather close to "collectToList".
>
> When designing j.u.s.Stream we made a conscious decision to not bind it to j.u collection types. A Stream could be integrated with other forms of collections (e.g. GS Collections).

First, if you are using any kind of modern IDE you have some kind of intelligent completion which will suggest collectToList the moment you press the 'c' in stream.c... With the other one you have to create an additional import and invoke a static method. (Yes, I know most IDEs make this easy as well.) But we are still talking about something like 1 second vs 5 seconds.

Second, s.collect(toList()) is just not natural for most users. Sure, readers on this list understand how the collect method works. But I don't think many novice/intermediate users will.

Third, yes, there are a lot of different collections that a stream can be integrated with. But we are talking about roughly 50% of the usage.

Fourth, this hasn't actually anything to do with ease of use but performance. But I have a very fast stream implementation where I would like to provide a fast (and easy) way to return the stream elements as a list. This is mainly in situations where I know the number of elements in the result (which is quite often if you don't use filters). By having a toList() method I can implement, I can avoid the array list resizings in s.collect(toList()). This is actually also why I would prefer if it was called toList() and not collectToList(), as I think it is an implementation detail how the list generation is done.

- Kasper

Kasper Nielsen

8:43 a.m.

A couple of blog posts with the same issue.

http://winterbe.com/posts/2015/03/05/fixing-java-8-stream-gotchas-with-intel...
http://benjiweber.co.uk/blog/2015/03/06/adding-tolist-to-java-streams/
http://javarevisited.blogspot.dk/2015/03/5-ways-to-convert-java-8-stream-to-...

It is also on the top 10 questions on stackoverflow on Java 8
http://stackoverflow.com/questions/tagged/java-8

- Kasper

On Tue, Apr 28, 2015 at 10:22 AM, Kasper Nielsen <kasperni@gmail.com> wrote:

...


Paul Sandoz

9:06 a.m.

On Apr 28, 2015, at 10:22 AM, Kasper Nielsen <kasperni@gmail.com> wrote:

> On Tue, Apr 28, 2015 at 9:16 AM, Paul Sandoz <paul.sandoz@oracle.com> wrote:
>
>> On Apr 27, 2015, at 10:34 PM, Kasper Nielsen <kasperni@gmail.com> wrote:
>>
>>> The other default function I would like to see is stream.toList() (I can live with collectToList) which is short for s.collect(Collectors.toList()). 50 % of my terminal functions are s.collect(Collectors.toList()).
>>
>> Can you live with a static import and:
>>
>>     s.collect(toList())
>>
>> ? which is rather close to "collectToList".
>>
>> When designing j.u.s.Stream we made a conscious decision to not bind it to j.u collection types. A Stream could be integrated with other forms of collections (e.g. GS Collections).
>
> First, if you are using any kind of modern IDE you have some kind of intelligent completion which will suggest collectToList the moment you press the 'c' in stream.c... The other one you have to create an additional import, and invoke a static method (Yes I know most IDEs makes this easy as well) But we are still talking about something like 1 second vs 5 seconds.

Not much in it then :-) Perhaps IDEs might even be enhanced to be able to suggest collect(toList()) upfront? (After writing the above I saw your email with a link to IntelliJ Collector templates, neat!)

> Second, s.collect(toList()) is just not naturally for most users. Sure readers on this list understands how the collect method works. But I don't think many novice/intermediate users will.

It's a trick that only has to be learnt once.

> Third, Yes there are a lot of different collections that a stream can be integrated with. But we are talking about roughly 50 % of the usage.

And then someone wants Set, then Map etc. etc. We are also designing Stream for the future where perhaps we might have alternative collection types (not committing to that... :-)) and then someone wants that list type etc. etc.

> Fourth, this hasn't actually anything to do with ease of use but performance.

Ah, so this is your actual motivation! Masquerading in an ease-of-use disguise :-)

You raise a good point about the current limitations of Collectors not being able to leverage more information of the pipeline, such as known size.

> But I have a very fast stream implementation where I would like to provide a fast (and easy) way to return the stream elements as a list. This is mainly in situations where I know the number of elements in the result (which is quite often if you don't use filters). By having a toList() method I can implement, I can avoid the array list resizings in s.collect(toList()). This is actually also why I would prefer if it was called toList() and not collectToList() as I think it is implementation detail how the list generation is done.

This does not mean we need to expose a Stream.toList. How about we improve Collectors instead? [*]. For parallel execution we need to work out how better to merge lists and maps. Propagating size information at the root and for each leaf is important for pre-sizing. We can use growable arrays to avoid resizing.

--

Separately, the best way to expose a mechanism where you want to do your own stuff is to provide an operation SPI. We are just not ready to do that for 9 given what post 9 will bring.

Paul.

[*] We could of course take advantage of internal details, that might be a reasonable short-cut for 9, but it would be nice to solve this via a public API.
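When the caller already knows the result size, the ArrayList resizing Kasper mentions can be avoided with the existing Collectors.toCollection, without any new Stream method. This is a minimal sketch for sequential streams only; the names PresizedCollect and toListOfSize are hypothetical, and the size has to be supplied by the caller because a Collector cannot ask the pipeline for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical utility class; illustrates pre-sizing with a public API only.
final class PresizedCollect {
    private PresizedCollect() {}

    // Collects into an ArrayList allocated with the expected capacity up front,
    // so a sequential collect does not need to grow the backing array.
    static <T> List<T> toListOfSize(Stream<T> stream, int expectedSize) {
        return stream.collect(
                Collectors.toCollection(() -> new ArrayList<T>(expectedSize)));
    }
}
```

In a parallel stream the supplier may be invoked once per partial result, so every partial list would be allocated at full capacity; that is the merging and size-propagation problem the message says still needs to be worked out.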


participants (5)

Kasper Nielsen
Paul Sandoz
Peter Levart
Remi Forax
Stephen Colebourne
