RFR: JDK-8133680 add Stream.foldLeft() terminal operation
Paul Sandoz
paul.sandoz at oracle.com
Mon Aug 21 16:57:06 UTC 2017
Hi,
My preference, naming-wise, is to stick with the current naming scheme and refer to such operations as reduceLeft and reduceRight. We have used reduce both for the seed accepting, and optional returning methods, and we don’t use different names to distinguish those.
I don’t think we can rely on the sequential() trick, since as Tagir points out it is global.
My inclination would be to focus on reduceLeft, and spin off reduceRight into a separate issue and in addition ponder support for a reverse operation, since reduceRight can be built upon that.
It’s possible to develop a reduceRight or reserve operation that does something very similar to forEachOrdered when operating in parallel (the last FJ task has to complete before given the ok for it’s direct sibling to the left to proceed when ready). For good splitters like ArrayList this might work well, for bad splitters like Iterator it will be terrible. It may be possible in some cases to fuse reverse().toArray(). IMO pushing reserve further down into a characteristic of a Spliterator is going too far at this moment, as that is potentially a lot more work.
(Note naming-wise one could consider reduceOrdered, and then add reverse with no need for reduceRight, but i think the names and operations reduceRight/Left are likely familiar and i think makes it clear in the source the intent of the terminal operation).
So i am supportive of a reduceLeft, and may be supportive of a reduceRight+reverse.
In hindsight when we added forEachOrdered i wish i recognised that this was a void-returning reduceLeft.
Paul.
> On 19 Aug 2017, at 22:36, Tagir Valeev <amaembo at gmail.com> wrote:
>
> Hello, John!
>
>> I think s.foldLeft(op) is the same as s.sequential().reduce(op).
>> If that's the case, there's no need for it, except in documentation.
>
> This implementation could be worse for parallel stream. First, as
> Brian already mentioned, forEachOrdered (thus my proposed
> implementation) could still benefit from parallelization, e.g. if
> upstream contains expensive mapping operation. Second, upstream may
> contain full stop operation (e.g. sorted()). In this case everything
> before sorting including sorting itself would be parallelized nicely.
> When you use sequential(), you force all the stream to be sequential,
> killing the parallelism on all the stages.
>
> Also note that we are speaking about extending the interface, not the
> implementation. It's not specified how exactly reduce(op) must behave
> and third-party implementations in wild may reduce in another order,
> actually relying on op assotiativity. E.g. for stream of four elements
> the implementation is free to reduce like op(op(e1, e2), op(e3, e4)).
> As long as associativity holds, the result will be indistinguishable
> from op(op(op(e1, e2), e3), e4). Using forEachOrdered is much more
> reliable as its behavior is more constrained by documentation.
>
> Finally this does not solve the problem with more popular version of
> foldLeft which has seed element (possibly of another type). In this
> case you cannot delegate the call to existing reduce without providing
> a bogus combiner (and hoping that it's never called for sequential
> stream, even though it's not specified).
>
>> A quick, inexpert read of package-info.java suggests to me that
>> the encounter order of a sequential ordered stream is stable,
>> and therefore the result of executing a reduce on such a stream
>> should always produce the same order of operations. Which
>> order? Well of course, the one given in the javadoc. Which
>> is an implementation of your foldLeft, minus the unnecessary
>> buffering.
>
> My implementation has no unnecessary buffering. It just stores the
> accumulator inside the forEachOrdered consumer. Practically the same
> is made inside reduce implementation. The result of executing a reduce
> could be the same even if operation order is different, because spec
> requires that op must be associative, so een sequential reduce may
> rely on associativity.
>
> As for foldRight, some considerations about reverse() operation are
> mentioned in Stuart's comment in the issue:
> https://bugs.openjdk.java.net/browse/JDK-8133680?focusedCommentId=13861795&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13861795
> Indeed creating reverse() operation would be nice, but it will require
> much more effort to support in existing spliterators. Also we should
> provide default implementation for Stream interface as well as more
> optimal specific implementation for JDK Stream. This is doable, but of
> course I cannot write it by myself without support from other OpenJDK
> developers (or at least without strong evidence that such feature
> could be accepted).
>
> With best regards,
> Tagir Valeev.
More information about the core-libs-dev
mailing list