hg: lambda/lambda/jdk: Modify the conc-Node spliterator forEach and tryAdvance from using the call
Edward Harned
edharned at gmail.com
Tue Feb 19 11:20:16 PST 2013
Paul, thank you for the reply.
I’m looking for the root cause of this problem. When you talk about “a
conc-Node is an intermediate result produced from a parallel computation …”,
it makes me think of the join() in the F/J framework. Is this where it all
starts?
The intermediate join() is a known problem in F/J in both Java 7 and Java 8.
Bypassing join() with CountedCompleter doesn’t solve the excessive use of
resources caused by recursive decomposition.
For conc-Nodes: recursive decomposition only works well on small, balanced
trees. When the recursion becomes deep, there can be a resource failure
(such as a stack overflow). It makes perfect sense – do more, do more, do
more… At some point, it fails. The same problem as F/J join().
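To illustrate the failure mode (a hypothetical sketch with a made-up node class, not the JDK's actual Nodes.java), a recursive forEach over a conc-style tree needs one call-stack frame per level, so a degenerate right-leaning tree that is deep enough overflows the stack:

```java
import java.util.function.IntConsumer;

// Hypothetical minimal conc node; the names and structure are illustrative only.
final class ConcNode {
    final ConcNode left, right; // null for a leaf
    final int value;            // payload at a leaf
    ConcNode(int value) { this.left = null; this.right = null; this.value = value; }
    ConcNode(ConcNode left, ConcNode right) { this.left = left; this.right = right; this.value = 0; }

    // Recursive traversal: call-stack depth grows with tree depth.
    void forEach(IntConsumer action) {
        if (left == null) { action.accept(value); return; }
        left.forEach(action);
        right.forEach(action);
    }
}

public class RecursiveDemo {
    // Returns true if traversing the tree blows the call stack.
    static boolean overflows(ConcNode root) {
        try { root.forEach(v -> {}); return false; }
        catch (StackOverflowError e) { return true; }
    }

    public static void main(String[] args) {
        // Degenerate right-leaning tree (leaf on the left, long spine on the
        // right), similar in shape to a spliterator built from an iterator.
        ConcNode root = new ConcNode(0);
        for (int i = 1; i <= 1_000_000; i++) {
            root = new ConcNode(new ConcNode(i), root);
        }
        System.out.println(overflows(root) ? "stack overflow" : "ok");
    }
}
```

With a default JVM stack size, the million-level spine overflows long before traversal completes.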
The use of an ArrayDeque works for now by bypassing the recursive calls, but
as you say filter or flatMap can still cause problems, and it doesn’t address
the base problem as I see it: one thread is doing too much work. Recursive
decomposition doesn’t spread the work out to as many threads as possible;
it walks down the leaves of a balanced tree.
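The explicit-stack traversal being discussed can be sketched like this (again a hypothetical node class, not the actual Nodes.java code): a Deque-driven depth-first search keeps the call stack flat, so even a degenerate tree traverses without overflow:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.IntConsumer;

// Hypothetical minimal conc node for illustration only.
final class Conc {
    final Conc left, right;   // null for a leaf
    final int value;          // payload at a leaf
    Conc(int value) { this.left = null; this.right = null; this.value = value; }
    Conc(Conc left, Conc right) { this.left = left; this.right = right; this.value = 0; }
    boolean isLeaf() { return left == null; }
}

public class IterativeTraversal {
    // Depth-first traversal using an explicit Deque instead of the call
    // stack: heap usage grows with pending subtrees, but the call stack
    // stays flat regardless of tree depth.
    static void forEach(Conc root, IntConsumer action) {
        Deque<Conc> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Conc node = stack.pop();
            if (node.isLeaf()) {
                action.accept(node.value);
            } else {
                // Push right first so the left subtree is visited first.
                stack.push(node.right);
                stack.push(node.left);
            }
        }
    }

    public static void main(String[] args) {
        // A degenerate right-leaning tree a million levels deep traverses fine.
        Conc root = new Conc(0);
        for (int i = 1; i <= 1_000_000; i++) root = new Conc(new Conc(i), root);
        long[] count = {0};
        forEach(root, v -> count[0]++);
        System.out.println(count[0]); // prints 1000001 (one value per leaf)
    }
}
```

Note the tradeoff Ed raises still stands: this fixes the stack-depth failure in a single thread but does not by itself distribute the work across more threads.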
Am I on the right path here? The streams/function packages are quite
difficult to follow sometimes.
If so, as I see it there needs to be more individual units-of-work spread
out to other threads and perhaps recursive decomposition isn’t the best
solution.
ed
On Tue, Feb 19, 2013 at 4:24 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
> On Feb 18, 2013, at 7:44 PM, Edward Harned <edharned at gmail.com> wrote:
> > Is the stack problem for serial, parallel or both?
> >
>
> It can be both. A conc-Node is an intermediate result produced from a
> parallel computation which can be input to another parallel or serial
> computation. It all depends on which Node in the conc-Node tree the
> traversal occurs at and on the distance from that Node to the leaf Nodes.
>
>
> > Do you have an example for where the problem lies? (a little stack trace
> maybe)
> >
>
> The code to traverse the Spliterator of a conc-Node was recursive for both
> tryAdvance and forEach. For trees of large depth this can result in a stack
> overflow, especially on smaller memory configurations.
>
> Consider a right-balanced tree such as that of a spliterator from an
> iterator.
>
> For large input, say 10^6 elements, this can result in a tree depth of at
> least 1024 (it's gonna be marginally higher soon when we change the way
> that spliterator works) and the stack depth will be proportional to that,
> for tryAdvance it was 4x e.g.:
>
> java.lang.StackOverflowError
>     at java.util.stream.Nodes$InternalNodeSpliterator.internalTryAdvance(Nodes.java:559)
>     at java.util.stream.Nodes$InternalNodeSpliterator$OfDouble.tryAdvance(Nodes.java:732)
>     at java.util.stream.Nodes$InternalNodeSpliterator$OfDouble.tryAdvance(Nodes.java:737)
>     at java.util.stream.Nodes$InternalNodeSpliterator$OfDouble.tryAdvance(Nodes.java:723)
>     at java.util.stream.Nodes$InternalNodeSpliterator.internalTryAdvance(Nodes.java:566)
>     at java.util.stream.Nodes$InternalNodeSpliterator$OfDouble.tryAdvance(Nodes.java:732)
>     at java.util.stream.Nodes$InternalNodeSpliterator$OfDouble.tryAdvance(Nodes.java:737)
>     at java.util.stream.Nodes$InternalNodeSpliterator$OfDouble.tryAdvance(Nodes.java:723)
>     at java.util.stream.Nodes$InternalNodeSpliterator.internalTryAdvance(Nodes.java:562)
>
> It is possible to reduce that 4x factor but that would not solve the
> underlying problem. Also, tryAdvance was rather inefficient, creating n
> wrapping Spliterators. So overall I think it is a reasonable tradeoff.
>
> The general aim is that a conc-Node represents the shape of the
> computation producing output elements i.e. it is already split for you for
> the next computation. The expectation is that splitting should occur all
> the way down to, or very close to, the leaf nodes. Note that operations
> like filter or flatMap (replace an element with zero or more elements) can
> distort the shape of the tree, so that will not always be the case, and it
> may be possible to create edge cases, especially with right-balanced trees,
> where things could blow up [*].
>
> So this is really fixing for edge cases where those expectations do not
> hold, while still being efficient for the expected fast path of using
> forEach traversal close to or at the leaf nodes.
>
> Hth,
> Paul.
>
> [*] In fact I should also update the count implementation to pre-compute
> the size when creating conc-Nodes; if it ever needs to be used,
> pre-calculating is more efficient.
>
> >
> > On Mon, Feb 18, 2013 at 9:10 AM, <paul.sandoz at oracle.com> wrote:
> > Changeset: 66c0dd07e288
> > Author: psandoz
> > Date: 2013-02-18 15:09 +0100
> > URL: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/66c0dd07e288
> >
> > Modify the conc-Node spliterator forEach and tryAdvance from using the call
> > stack to using an explicit (Deque) stack and a depth-first search to
> > find leaf nodes. This avoids stack overflow exceptions with tryAdvance
> > and forEach for right-balanced or degenerate trees. This also avoids
> > the internal creation of n spliterators when using tryAdvance, where n is
> > the depth of the tree.
> >
> > ! src/share/classes/java/util/stream/Nodes.java
More information about the lambda-dev mailing list