Exceptional behavior of parallel stream
Edward Harned
edharned at gmail.com
Tue Aug 23 22:20:10 UTC 2016
An enhancement issue is nice, but what can you really do?
The framework cannot handle exceptions in related threads because it
doesn’t have a central thread management facility. It spits out threads
like the ExecutorService it inherits from, without regard for controlling
those threads.
I gave Doug Lea a Data Parallel engine that properly managed threads back
in 2010. He ignored it. Here is a link to the updated project since I’m
using some code from that project below.
http://sourceforge.net/projects/tymeacdse
The management concept is that every request gets a Request Object. This
way the request can be timed, cancelled, inquired about, etc. and work
threads can know about exceptions in other work threads.
com.tymeac.dse.base.WorkThread::syncRequest
    RequestItem local_request = getReq();

    // When backing out (there was a problem somewhere else)
    if (local_request.isBackout()) {

        // decrement number of forked tasks and
        //   when number remaining is none
        //     say is complete with no completion object (failure)
        if (local_request.decrementPendingWork() == 0)
            local_request.setCompleted(null);

        return;
    }
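For readers without the project at hand, here is a minimal, self-contained
sketch of the same idea. It is not the Tymeac code: the names Request, fail,
taskFinished and awaitAllTasks are hypothetical, and a plain fixed thread
pool stands in for a real engine. Every task checks the shared request
before working, and the caller only gets control back once every task has
checked in, so nothing is left running after the exception is reported.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class RequestSketch {

    /** Hypothetical per-request state shared by every task of one request. */
    static final class Request {
        private final AtomicBoolean backout = new AtomicBoolean(false);
        private final AtomicInteger pending;
        private final CountDownLatch done = new CountDownLatch(1);
        private volatile RuntimeException failure;

        Request(int taskCount) {
            pending = new AtomicInteger(taskCount);
        }

        boolean isBackout() { return backout.get(); }

        /** Records the first failure and tells sibling tasks to back out. */
        void fail(RuntimeException e) {
            if (backout.compareAndSet(false, true)) failure = e;
        }

        /** Called exactly once per task, whether it worked, failed or backed out. */
        void taskFinished() {
            if (pending.decrementAndGet() == 0) done.countDown();
        }

        void awaitAllTasks() throws InterruptedException { done.await(); }

        RuntimeException failure() { return failure; }
    }

    /** Parses and sums the strings; 'processed' counts items actually parsed. */
    static int sum(String[] data, AtomicInteger processed) {
        if (data.length == 0) return 0;
        Request request = new Request(data.length);
        AtomicInteger total = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            for (String s : data) {
                pool.execute(() -> {
                    try {
                        if (request.isBackout()) return; // a sibling failed: skip the work
                        total.addAndGet(Integer.parseInt(s));
                        processed.incrementAndGet();
                    } catch (RuntimeException e) {
                        request.fail(e);
                    } finally {
                        request.taskFinished();
                    }
                });
            }
            // Return to the caller only after every task has checked in.
            request.awaitAllTasks();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            IllegalStateException ise =
                new IllegalStateException("interrupted while waiting", e);
            request.fail(ise); // remaining tasks back out
            throw ise;
        } finally {
            pool.shutdown();
        }
        if (request.failure() != null) throw request.failure();
        return total.get();
    }

    public static void main(String[] args) {
        String[] data = new String[100];
        for (int i = 0; i < data.length; i++) data[i] = String.valueOf(i);
        data[20] = "oops";
        AtomicInteger processed = new AtomicInteger();
        try {
            System.out.println("Sum is " + sum(data, processed));
        } catch (NumberFormatException e) {
            System.out.println("Non-number appeared! Items parsed before backout: "
                + processed.get());
        }
    }
}
```

The design choice that matters is the pending counter: the failure is
reported only after the last task has decremented it, which is what the
isBackout()/decrementPendingWork() pair in the excerpt above is doing.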
Having an API that creates unmanaged threads is acceptable; really, what
else could a general-purpose API do? Having a service that controls all
parallel processing in Java SE create unmanaged threads is not
acceptable.
Ed
On Tue, Aug 23, 2016 at 5:59 PM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
> Hi Tagir,
>
> After some discussion with Doug I logged an issue:
>
> https://bugs.openjdk.java.net/browse/JDK-8164690
>
> but as of yet undecided whether we should do anything about it at the
> level of ForkJoinTask or specific to Stream (or even ConcurrentHashMap bulk
> ops).
>
> We could mention, as an implementation detail (@implNote), on Stream and
> CHM about the current behaviour. I am reluctant to specify normatively.
>
> Paul.
>
> > On 17 Aug 2016, at 06:01, Tagir F. Valeev <amaembo at gmail.com> wrote:
> >
> > Hello!
> >
> > I found no information in the Stream API documentation on how it
> > behaves when an exception occurs during stream processing. While it's
> > quite evident for a sequential stream (stream processing is terminated
> > and the exception is propagated to the caller as is), the behavior for
> > parallel streams differs from what one might expect. Consider the
> > following test:
> >
> > String[] data = IntStream.range(0, 100).mapToObj(String::valueOf)
> >         .toArray(String[]::new);
> > data[20] = "oops";
> > try {
> >     int sum = Arrays.stream(data).parallel().mapToInt(Integer::valueOf)
> >             .peek(System.out::println).sum();
> >     System.out.println("Sum is "+sum);
> > } catch (NumberFormatException e) {
> >     System.out.println("Non-number appeared!");
> > }
> >
> > This parses integers stored in a string array and sums them, printing
> > every number to stdout once it is processed. As the data set contains
> > a non-number, the stream obviously fails with NumberFormatException.
> > The typical output looks like this:
> >
> > 62
> > 63
> > 12
> > 31
> > 87
> > ...
> > 28
> > 92
> > 29
> > 8
> > Non-number appeared!
> > 9
> > 30
> >
> > So as you can see, the stream is not actually terminated when the
> > exception is thrown and caught: even after that some parallel tasks
> > continue running, and you see more numbers printed after the catch
> > block is executed.
> >
> > I consider such behavior confusing and unexpected. Given the fact
> > that a stream may produce side-effects (e.g. if the terminal op is
> > forEach) this might lead to unforeseen consequences in user programs,
> > as left-over parallel stream tasks may continue to mutate shared state
> > after the main stream thread exceptionally returns control to the
> > caller.
> >
> > So I suggest that such behavior should be either fixed or explicitly
> > documented. What do you think?
> >
> > With best regards,
> > Tagir Valeev.
> >
>
>