Potential bug in hotspot occasionally resulting in non-termination of parallel stream execution

Daniel D. Daugherty daniel.daugherty at oracle.com
Thu Apr 23 12:02:19 UTC 2015


On 4/22/15 10:31 AM, Paul Sandoz wrote:
> Hi,
>
> Amy and I think we have identified an issue in hotspot that very occasionally results in non-termination of parallel stream execution, specifically non-termination of stream fork/join tasks. When running the jtreg stream tests, such failures manifest themselves as timeouts with jstack trace output like the following:
>
> "MainThread" #23 prio=5 os_prio=0 tid=0x00007f10a4183800 nid=0x5a6e in Object.wait() [0x00007f103e2a0000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
> 	at java.lang.Object.wait(Native Method)
> 	at java.util.concurrent.ForkJoinTask.externalAwaitDone(ForkJoinTask.java:334)
> 	- locked <0x00000000fc1c1aa8> (a java.util.stream.Nodes$SizedCollectorTask$OfRef)
> 	at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:405)
> 	at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734)
> 	at java.util.stream.Nodes.collect(Nodes.java:325)
> 	at java.util.stream.ReferencePipeline.evaluateToNode(ReferencePipeline.java:109)
> 	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:564)
> 	at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:255)
> 	at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:438)
> 	at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:444)
> 	at java.util.stream.StreamTestScenario$12._run(StreamTestScenario.java:144)
> 	at java.util.stream.StreamTestScenario.run(StreamTestScenario.java:220)
> 	at java.util.stream.OpTestCase$ExerciseDataStreamBuilder.exercise(OpTestCase.java:349)
> 	at java.util.stream.OpTestCase.exerciseOpsMulti(OpTestCase.java:114)
> 	at java.util.stream.OpTestCase.exerciseOpsInt(OpTestCase.java:136)
> 	at org.openjdk.tests.java.util.stream.MapOpTest.testOps(MapOpTest.java:74)
>
> i.e. the main f/j task is waiting for its descendants to complete.
>
> Amy has been doing a lot of testing (since the failure happens very occasionally) and can provide more details on that and the results. I will provide some specific details below.
>
> By a process of elimination we could reproduce the failure in JDK 9 b53 but not in b52. From the changesets that were integrated into b53 we identified JDK-8061553 as a possible cause:
>
>    Contended Locking fast enter bucket
>    https://bugs.openjdk.java.net/browse/JDK-8061553
>    http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/30137e7eef47
>
> We tested the latest dev build both with that changeset (which it naturally includes) and without it. So far we can reproduce the issue with the former, but not with the latter.
>
> This indicates that the changeset for JDK-8061553 is the likely cause; however, I really don't know why this would be the case. Expert advice would be very much appreciated!

Very nice job in narrowing this down. I concur that it is very
likely that JDK-8061553:

- directly introduced the hang as part of the optimization

or

- exposed a pre-existing problem due to the optimization
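
For anyone who wants to look for the hang outside of jtreg, here is a minimal stress sketch. It is not the failing test and is not known to reproduce the problem; it only assumes that repeatedly running a parallel map().toArray() from a non-pool thread exercises the same ForkJoinTask.invoke() path seen in the stack trace. The class name, method names, iteration count, stream size and timeout are all hypothetical choices.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.stream.IntStream;

// Hypothetical probe; all names and numbers here are illustrative assumptions.
final class ParallelStreamHangProbe {

    // One parallel map().toArray() evaluation, the shape seen in the trace.
    static Object[] evaluateOnce(int size) {
        return IntStream.range(0, size).boxed().parallel().map(i -> i + 1).toArray();
    }

    // Returns the number of completed iterations, or -1 if one evaluation
    // did not finish within timeoutSeconds (a possible hang).
    static int probe(int iterations, int size, long timeoutSeconds) throws Exception {
        // Daemon thread so a stuck evaluation cannot keep the JVM alive.
        ExecutorService runner = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "probe");
            t.setDaemon(true);
            return t;
        });
        try {
            Callable<Object[]> task = () -> evaluateOnce(size);
            for (int i = 0; i < iterations; i++) {
                Future<Object[]> f = runner.submit(task);
                try {
                    Object[] result = f.get(timeoutSeconds, TimeUnit.SECONDS);
                    if (result.length != size) {
                        throw new AssertionError("unexpected length " + result.length);
                    }
                } catch (TimeoutException e) {
                    return -1;
                }
            }
            return iterations;
        } finally {
            runner.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        int done = probe(1000, 10_000, 30);
        System.out.println(done < 0
                ? "possible hang: take a jstack dump now"
                : "completed " + done + " iterations");
    }
}
```

Since the failure is reported to be very occasional, a clean run of such a loop says little; a timeout, on the other hand, would be a good moment to capture jstack output for comparison with the trace above.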

Dan


>
> --
>
> Separately there is another issue with Fork/Join:
>
>   http://cs.oswego.edu/pipermail/concurrency-interest/2015-April/014240.html
>
> At the moment I don't think the two are connected (the latter issue has been present since 8u40), but perhaps there is a combination of factors here. So we will also run some tests with a workaround:
>
>    http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ForkJoinPool.java?r1=1.240&r2=1.241
>
> just to rule this out.
>
> Paul.
>