Code Review - 6957683 TEST_BUG: test/java/util/concurrent/ThreadPoolExecutor/Custom.java failing
David Holmes
david.holmes at oracle.com
Tue Nov 22 02:33:10 UTC 2011
Gary,
On 22/11/2011 6:26 AM, Gary Adams wrote:
> Not quite as sure about the internal timeouts in this regression test.
>
> Here's a proposed change that cranks up the timeouts. Since the test
> harness defaults to 2 minutes there's no reason the service termination
> should timeout more quickly. For the thread startup, I added a countdown
> latch so the main thread waits til the worker threads have started.
>
> Also, after the service termination completes increased the delay to
> ensure the thread count check will be accurate.
>
> http://cr.openjdk.java.net/~gadams/6957683/
Sorry but I think my initial analysis of this problem was incorrect. The
initial test failure reported 3 failed cases:
1 not equal to 0
11 not equal to 10
1 not equal to 0
which corresponded to failures at the marked lines:
public static void main(String[] args) throws Throwable {
CustomTPE tpe = new CustomTPE();
equal(tpe.getCorePoolSize(), threadCount);
equal(countExecutorThreads(), 0);
for (int i = 0; i < threadCount; i++)
tpe.submit(new Runnable() { public void run() {}});
equal(countExecutorThreads(), threadCount);
equal(CustomTask.births.get(), threadCount);
tpe.shutdown();
tpe.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
Thread.sleep(10);
/*1*/ equal(countExecutorThreads(), 0);
CustomSTPE stpe = new CustomSTPE();
for (int i = 0; i < threadCount; i++)
stpe.submit(new Runnable() { public void run() {}});
equal(CustomSTPE.decorations.get(), threadCount);
/*2*/ equal(countExecutorThreads(), threadCount);
stpe.shutdown();
stpe.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
Thread.sleep(10);
/*3/ equal(countExecutorThreads(), 0);
System.out.printf("%nPassed = %d, failed = %d%n%n", passed,
failed);
if (failed > 0) throw new Exception("Some tests failed");
}
However, I think the failure at /*2*/ is actually seeing the same
"extra" thread as reported at /*1*/. It could even be the same thread
again at /*3*/ but we have other independent failure cases for /*3*/.
When we create the STPE each submit will create and start a worker
thread until we reach the core pool size. These worker threads will be
immediately visible to the countExecutorThreads() logic because it
simply enumerates the ThreadGroup. I was mistakenly thinking that these
threads need to actually get a chance to execute to become visible to
the counting logic, but that is not the case.
Consequently you can elide the change to add the latch and just leave
the defensive timeouts on the awaitTermination with the increased sleep
delays to give the worker threads a chance to really terminate.
Sorry for sending you down the wrong path on this one.
David
-----
More information about the core-libs-dev
mailing list