Potential bug in hotspot occasionally resulting in non-termination of parallel stream execution

Daniel D. Daugherty daniel.daugherty at oracle.com
Thu Apr 23 12:09:27 UTC 2015


On 4/22/15 11:19 PM, Amy Lu wrote:
> Here I’m providing the test results in detail.

Thanks for the details!


> We picked an Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 32-processor 
> machine and one stream test, ToArrayOpTest, for this testing. Normally 
> this test takes ~22 seconds to complete. We used a long enough timeout, 
> so we believe the “timeout” shown in the testing is a real hang.

I have a similar machine in my lab in Colorado so I'll initially
investigate the failure there. Can you attach a zip archive of
a standalone copy of the ToArrayOpTest test with a script to run
it to JDK-8077392? That would help me get up and running on this
issue more quickly.
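
For illustration, a standalone driver of the kind requested above might look roughly like the following. This is only a sketch, not the actual ToArrayOpTest: the class name ParallelToArrayRepro, the stream size, the 60-second timeout and the default run count are all assumptions chosen for the example. It calls a parallel toArray() from a plain (non-fork/join) thread, which is the external-caller situation that the externalAwaitDone frame in the quoted stack trace points to, and treats a run that exceeds the timeout as a suspected hang.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.stream.IntStream;

public class ParallelToArrayRepro {

    // Runs one parallel toArray() on a plain helper thread and reports
    // whether it completed, with the expected length, before the timeout.
    static boolean runOnce(int size, long timeoutSeconds) {
        CompletableFuture<Object[]> result = new CompletableFuture<>();
        Thread worker = new Thread(() -> {
            try {
                result.complete(IntStream.range(0, size)
                        .parallel()
                        .boxed()
                        .map(i -> i + 1)
                        .toArray());
            } catch (Throwable t) {
                result.completeExceptionally(t);
            }
        }, "MainThread");
        worker.setDaemon(true); // a hung worker must not keep the JVM alive
        worker.start();
        try {
            return result.get(timeoutSeconds, TimeUnit.SECONDS).length == size;
        } catch (TimeoutException e) {
            return false; // suspected hang
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical defaults; pass the run count as the first argument.
        int runs = args.length > 0 ? Integer.parseInt(args[0]) : 100;
        for (int run = 1; run <= runs; run++) {
            if (!runOnce(10_000, 60)) {
                System.out.println("run #" + run + " timed out");
                System.exit(1);
            }
        }
        System.out.println(runs + " runs all pass");
    }
}
```

Given how rarely the failure shows up in the quoted results, a driver like this would need thousands of iterations, and there is no guarantee that this simplified pipeline hits the same code path as the real test.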


> * JDK9/b52: 3000 runs all pass.
> * JDK9/b53: Reproduced the issue, test timed out 4 times (runs #596, 
> #978, #988, #1290) in 1568 runs total.
>
> From the changesets that were integrated into b53 we identified 
> JDK-8061553 as a possible cause, and tested the latest dev build:
>
> * Latest dev build: Reproduced the issue, test timed out 4 times (runs 
> #48, #143, #1877, #2231) in 3000 runs total.
> * Backout 8061553 changeset from above build: 3000 runs all pass.
>
> The testing was done on two machines, Linux and Solaris, with similar 
> results. Before drilling down to b52/b53, we also tested b55 and b59, 
> and both reproduced the issue.
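
As a rough sanity check on the run counts quoted above, the following sketch estimates how likely 3000 clean runs would be if the hang were still present. It assumes every run is independent and hangs with a fixed probability equal to the observed rate; that model is an assumption made for this example, not something established in this thread, and the class name HangRateCheck is hypothetical.

```java
public class HangRateCheck {

    // Probability of zero hangs in `runs` independent runs when each run
    // hangs with probability `rate`.
    static double probAllPass(double rate, int runs) {
        return Math.pow(1.0 - rate, runs);
    }

    public static void main(String[] args) {
        double rateB53 = 4.0 / 1568; // observed on JDK9/b53
        double rateDev = 4.0 / 3000; // observed on the latest dev build

        System.out.printf("b53 hang rate: %.4f%n", rateB53);
        System.out.printf("dev hang rate: %.4f%n", rateDev);
        System.out.printf("P(3000 clean runs at dev rate): %.4f%n",
                probAllPass(rateDev, 3000));
        System.out.printf("P(3000 clean runs at b53 rate): %.6f%n",
                probAllPass(rateB53, 3000));
    }
}
```

Under that assumption, 3000 passing runs at the dev-build rate would happen by chance a little under 2% of the time, so the b52 and backout results are reasonably strong evidence but not proof.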

Glad that it reproduces on Solaris X64 since that's what I'm
running on my big server in my lab. Does this reproduce on
Solaris SPARC or MacOS X? It's OK if you haven't tried it
there since I think I have enough info to get started.

Dan



>
> Thanks,
> Amy
>
> On 4/23/15 12:31 AM, Paul Sandoz wrote:
>> Hi,
>>
>> Amy and I think we have identified an issue in hotspot that only very 
>> occasionally results in non-termination of parallel stream execution. 
>> Specifically non-termination of stream fork/join tasks. Such 
>> failures, when running jtreg stream tests, manifest themselves as 
>> timeouts with jstack trace output like the following:
>>
>> "MainThread" #23 prio=5 os_prio=0 tid=0x00007f10a4183800 nid=0x5a6e in Object.wait() [0x00007f103e2a0000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>     at java.lang.Object.wait(Native Method)
>>     at java.util.concurrent.ForkJoinTask.externalAwaitDone(ForkJoinTask.java:334)
>>     - locked <0x00000000fc1c1aa8> (a java.util.stream.Nodes$SizedCollectorTask$OfRef)
>>     at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:405)
>>     at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734)
>>     at java.util.stream.Nodes.collect(Nodes.java:325)
>>     at java.util.stream.ReferencePipeline.evaluateToNode(ReferencePipeline.java:109)
>>     at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:564)
>>     at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:255)
>>     at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:438)
>>     at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:444)
>>     at java.util.stream.StreamTestScenario$12._run(StreamTestScenario.java:144)
>>     at java.util.stream.StreamTestScenario.run(StreamTestScenario.java:220)
>>     at java.util.stream.OpTestCase$ExerciseDataStreamBuilder.exercise(OpTestCase.java:349)
>>     at java.util.stream.OpTestCase.exerciseOpsMulti(OpTestCase.java:114)
>>     at java.util.stream.OpTestCase.exerciseOpsInt(OpTestCase.java:136)
>>     at org.openjdk.tests.java.util.stream.MapOpTest.testOps(MapOpTest.java:74)
>>
>> i.e. a main f/j task is waiting for descendants to complete.
>>
>> Amy has been doing a lot of testing (since the failure happens very 
>> occasionally) and can provide more details on that and the results. I 
>> will provide some specific details below.
>>
>> By a process of elimination we could reproduce the failure in JDK 9 
>> b53 but not in b52. From the changesets that were integrated into b53 
>> we identified JDK-8061553 as a possible cause:
>>
>>    Contended Locking fast enter bucket
>>    https://bugs.openjdk.java.net/browse/JDK-8061553
>>    http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/30137e7eef47
>>
>> We tested a recent dev build with (naturally) and without that 
>> changeset. So far we can reproduce the issue with the former, but not 
>> with the latter.
>>
>> This indicates the changeset for JDK-8061553 is the likely cause; 
>> however, I really don't know why this would be the case. Expert advice 
>> very much appreciated!
>>
>> -- 
>>
>> Separately there is another issue with Fork/Join:
>>
>> http://cs.oswego.edu/pipermail/concurrency-interest/2015-April/014240.html
>>
>> At the moment I don't think the two are connected (the latter issue 
>> has been present since 8u40), but perhaps there is a combination of 
>> factors here. So we will also run some tests with a workaround:
>>
>> http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ForkJoinPool.java?r1=1.240&r2=1.241
>>
>> just to rule this out.
>>
>> Paul.
>
>


