Potential bug in hotspot occasionally resulting in non-termination of parallel stream execution
Paul Sandoz
paul.sandoz at oracle.com
Wed Apr 22 16:31:08 UTC 2015
Hi,
Amy and I think we have identified an issue in hotspot that very occasionally results in non-termination of parallel stream execution, specifically non-termination of stream fork/join tasks. When running the jtreg stream tests, such failures manifest themselves as timeouts, with jstack output like the following:
"MainThread" #23 prio=5 os_prio=0 tid=0x00007f10a4183800 nid=0x5a6e in Object.wait() [0x00007f103e2a0000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.util.concurrent.ForkJoinTask.externalAwaitDone(ForkJoinTask.java:334)
        - locked <0x00000000fc1c1aa8> (a java.util.stream.Nodes$SizedCollectorTask$OfRef)
        at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:405)
        at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734)
        at java.util.stream.Nodes.collect(Nodes.java:325)
        at java.util.stream.ReferencePipeline.evaluateToNode(ReferencePipeline.java:109)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:564)
        at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:255)
        at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:438)
        at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:444)
        at java.util.stream.StreamTestScenario$12._run(StreamTestScenario.java:144)
        at java.util.stream.StreamTestScenario.run(StreamTestScenario.java:220)
        at java.util.stream.OpTestCase$ExerciseDataStreamBuilder.exercise(OpTestCase.java:349)
        at java.util.stream.OpTestCase.exerciseOpsMulti(OpTestCase.java:114)
        at java.util.stream.OpTestCase.exerciseOpsInt(OpTestCase.java:136)
        at org.openjdk.tests.java.util.stream.MapOpTest.testOps(MapOpTest.java:74)
i.e. a main f/j task is waiting for its descendants to complete.
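For anyone wanting to poke at this outside of jtreg, here is a minimal sketch of the kind of pipeline that should follow a similar path: a parallel map followed by toArray(), invoked from a non-pool thread so that the caller has to wait for the root task. This is not the actual MapOpTest code; the class name, sizes and iteration counts are made up for illustration, and given how rarely the failure occurs it may well never hang.

```java
import java.util.stream.IntStream;

// Hypothetical stress loop, NOT the jtreg test itself.
class ParallelToArrayLoop {

    // One parallel map + toArray over a sized source. When called from a
    // thread outside the common pool, the caller waits for the root task.
    static Object[] runOnce(int size) {
        return IntStream.range(0, size)
                .boxed()
                .parallel()
                .map(i -> i * 2)
                .toArray();
    }

    // Repeats the pipeline and returns the number of completed iterations.
    // A hang here (rather than a wrong result) is what we would look for.
    static int runMany(int iterations, int size) {
        int completed = 0;
        for (int i = 0; i < iterations; i++) {
            Object[] result = runOnce(size);
            if (result.length != size) {
                throw new IllegalStateException("unexpected length " + result.length);
            }
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) {
        // Iteration count and size are arbitrary assumptions.
        System.out.println("completed iterations: " + runMany(10_000, 1_000));
    }
}
```

If the loop ever stops making progress, a jstack of the process can be compared against the trace above.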
Amy has been doing a lot of testing (since the failure happens very occasionally) and can provide more details on that and the results. I will provide some specific details below.
By a process of elimination we found that the failure reproduces in JDK 9 b53 but not in b52. From the changesets that were integrated into b53 we identified JDK-8061553 as a possible cause:
Contended Locking fast enter bucket
https://bugs.openjdk.java.net/browse/JDK-8061553
http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/30137e7eef47
We tested the latest dev build with (naturally) and without that changeset. So far we can reproduce the issue with the changeset, but not without it.
This indicates that the changeset for JDK-8061553 is the likely cause; however, I really don't know why this would be the case. Expert advice would be very much appreciated!
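A locking change being involved is at least plausible, since the trace shows the caller sitting in Object.wait() on the task's monitor. Roughly, the external caller and the completing worker hand off through the task's monitor as in the sketch below. This is a simplified illustration of the wait/notify pattern only, not the actual ForkJoinTask code (which tracks completion in a status field), and the class and method names are hypothetical.

```java
// Hypothetical, simplified model of an external caller waiting on a task's
// monitor. It does not reproduce the failure; it only shows the handoff.
class MonitorAwaitSketch {
    private volatile boolean done;

    // Caller side: block on this object's monitor until completion is seen.
    void awaitDone() throws InterruptedException {
        synchronized (this) {
            while (!done) {
                wait();
            }
        }
    }

    // Worker side: publish completion and wake any waiters.
    void complete() {
        synchronized (this) {
            done = true;
            notifyAll();
        }
    }

    // Runs one handoff; returns true if the waiter observed completion.
    static boolean demo() {
        MonitorAwaitSketch task = new MonitorAwaitSketch();
        Thread worker = new Thread(task::complete);
        worker.start();
        try {
            task.awaitDone();
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return task.done;
    }

    public static void main(String[] args) {
        System.out.println("completed: " + demo());
    }
}
```

If a wakeup or monitor handoff were ever lost in a pattern like this, the waiter would stay parked indefinitely, which would look like the timeout we see. Whether that is what actually happens here is exactly the open question.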
--
Separately there is another issue with Fork/Join:
http://cs.oswego.edu/pipermail/concurrency-interest/2015-April/014240.html
At the moment I don't think the two are connected (the latter issue has been present since 8u40), but perhaps there is a combination of factors here. So we will also run some tests with a workaround:
http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ForkJoinPool.java?r1=1.240&r2=1.241
just to rule this out.
Paul.
More information about the hotspot-runtime-dev mailing list