RFR: JDK-8133680 add Stream.foldLeft() terminal operation

Tagir Valeev amaembo at gmail.com
Sun Aug 20 05:36:04 UTC 2017


Hello, John!

> I think s.foldLeft(op) is the same as s.sequential().reduce(op).
> If that's the case, there's no need for it, except in documentation.

This implementation could be worse for parallel stream. First, as
Brian already mentioned, forEachOrdered (thus my proposed
implementation) could still benefit from parallelization, e.g. if
upstream contains expensive mapping operation. Second, upstream may
contain full stop operation (e.g. sorted()). In this case everything
before sorting including sorting itself would be parallelized nicely.
When you use sequential(), you force all the stream to be sequential,
killing the parallelism on all the stages.

Also note that we are speaking about extending the interface, not the
implementation. It's not specified how exactly reduce(op) must behave
and third-party implementations in wild may reduce in another order,
actually relying on op assotiativity. E.g. for stream of four elements
the implementation is free to reduce like op(op(e1, e2), op(e3, e4)).
As long as associativity holds, the result will be indistinguishable
from op(op(op(e1, e2), e3), e4). Using forEachOrdered is much more
reliable as its behavior is more constrained by documentation.

Finally this does not solve the problem with more popular version of
foldLeft which has seed element (possibly of another type). In this
case you cannot delegate the call to existing reduce without providing
a bogus combiner (and hoping that it's never called for sequential
stream, even though it's not specified).

> A quick, inexpert read of package-info.java suggests to me that
> the encounter order of a sequential ordered stream is stable,
> and therefore the result of executing a reduce on such a stream
> should always produce the same order of operations.  Which
> order?  Well of course, the one given in the javadoc.  Which
> is an implementation of your foldLeft, minus the unnecessary
> buffering.

My implementation has no unnecessary buffering. It just stores the
accumulator inside the forEachOrdered consumer. Practically the same
is made inside reduce implementation. The result of executing a reduce
could be the same even if operation order is different, because spec
requires that op must be associative, so een sequential reduce may
rely on associativity.

As for foldRight, some considerations about reverse() operation are
mentioned in Stuart's comment in the issue:
https://bugs.openjdk.java.net/browse/JDK-8133680?focusedCommentId=13861795&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13861795
Indeed creating reverse() operation would be nice, but it will require
much more effort to support in existing spliterators. Also we should
provide default implementation for Stream interface as well as more
optimal specific implementation for JDK Stream. This is doable, but of
course I cannot write it by myself without support from other OpenJDK
developers (or at least without strong evidence that such feature
could be accepted).

With best regards,
Tagir Valeev.


More information about the core-libs-dev mailing list