JDK 9 RFR of JDK-8132548: java/lang/ThreadGroup/Stop.java fails with "RuntimeException: Failure"

David Holmes david.holmes at oracle.com
Mon Jul 11 10:55:44 UTC 2016



On 11/07/2016 6:14 PM, Amy Lu wrote:
> Thank you David, though the email crossed-by, I hope all the concerns
> have been resolved in the updated webrev:
> http://cr.openjdk.java.net/~amlu/8132548/webrev.02/

Sorry but no, the changes are neither necessary nor sufficient. With the 
new code the ThreadDeath can hit anytime after countDown(), so it can 
hit before you enter the try block.

> I had rewrote the test with CountDownLatch and join(long millis). Also
> unlike the old version, test thread 'first' and 'second' do different
> things, and 'second' thread will keep alive for waiting to be killed by
> group.stop() from 'frist' thread in the same thread group. I think this
> could avoid the second call to second.stop() (group.stop()) issue.

Simply using join() with no timeout and relying on the test framework to 
kill the test allows you to do away with the second stop().

David
-----

> The time LONG_DELAY_MS used in join(long millis) is an adjusted timeout:
>
> private static final long LONG_DELAY_MS = Utils.adjustTimeout(1000L);
> ...
>
>         second.join(LONG_DELAY_MS);
>         boolean failed = second.isAlive();
>
> This gives second thread a more "reasonable" time to die, with taking
> account of the test harness timeout.
> This almost same as second.join()// test pass or timeout, but with
> additional chance of killing all threads in a bad case.
>
> Thanks,
> Amy
>
> On 7/11/16 3:25 PM, David Holmes wrote:
>> Simplification ...
>>
>> On 11/07/2016 5:12 PM, David Holmes wrote:
>>> Hi Amy,
>>>
>>> Thanks for tackling this.
>>>
>>> On 8/07/2016 4:01 PM, Amy Lu wrote:
>>>> Thank you Joe for your review.
>>>>
>>>> The intent is to give it more chance "for the thread group stop to be
>>>> issued", not to extend the whole test execution timeout.
>>>>
>>>> I updated the webrev to make this in a retry, limit to 5 times of
>>>> retry:
>>>> http://cr.openjdk.java.net/~amlu/8132548/webrev.01/
>>>
>>> The retry serves no purpose here. groupStopped is being set to true and
>>> the waiting thread has been notified, so the loop will terminate after
>>> the first sleep. The failure happens when the main thread checks the
>>> isAlive() status of the second thread before the ThreadGroup.stop has
>>> actually had a chance to stop and terminate it - such that isAlive is
>>> now false. That is why I suggested waiting a bit longer by extending the
>>> sleep.
>>>
>>> I agree that making the test last at least 5 seconds is not ideal, but I
>>> didn't think it would be an issue in the big picture of testing.
>>>
>>> Ideally explicit "synchronization" is better than sleeps but that would
>>> again be missing the point with this test. We expect the thread to
>>> terminate, if it hasn't terminated in a "reasonable" time we consider
>>> the stop() to have failed and the test to fail. To that end we could
>>> remove the sleep altogether and change:
>>>
>>> boolean failed = second.isAlive();
>>>
>>> to
>>>
>>> try {
>>>   second.join(1000);
>>> } catch (InterruptedException shouldNotHappen) {}
>>> boolean failed = second.isAlive();
>>>
>>> Now we use explicit event synchronization - great! But the test has the
>>> same failure issue now as it did with the sleep. Putting in a
>>> CountDownLatch would have exactly the same problem: we expect the second
>>> thread to signal the latch as it terminates, but if that doesn't happen
>>> within a "reasonable" amount of time, then we deem the stop() to have
>>> failed and the test to have failed.
>>>
>>> Also note that the more code we add the more likely the second call to
>>> second.stop() triggers an async exception in code that can't handle it
>>> and results in an even worse failure mode!
>>>
>>> The only thing I can suggest is to get rid of the explicit sleep (or
>>> join, or latch.await() etc) and simply recode as an infinite loop and
>>> rely on the test harness to tear things down when the overall test
>>> timeout kicks in. That way the test either passes or is timed-out, there
>>> is no explicit failure - but a busy loop is also bad so you would want a
>>> small sleep in there anyway:
>>>
>>> while (second.isAlive()) {
>>>   Thread.sleep(10);
>>> }
>>> // Test passed - if we never get here the test times out and
>>> // we implicitly fail.
>>
>> Of course that was silly all you need is:
>>
>> second.join();
>> // Test passed - if we never get here the test times out and
>> // we implicitly fail.
>>
>> David
>>
>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Amy
>>>>
>>>> On 7/8/16 12:15 PM, joe darcy wrote:
>>>>> Hi Amy,
>>>>>
>>>>> I'm a bit uncomfortable with the fix as-is.
>>>>>
>>>>> Rather than hard-coding sleep values, if sleep values are needed I
>>>>> think it is a better practice to use ones that are scaled with the
>>>>> jtreg timeout factors, etc. used to run the tests. Please instead use
>>>>> something like the adjustTimeout method of
>>>>>
>>>>> $JDK_FOREST_ROOT/test/lib/share/classes/jdk/test/lib/Utils
>>>>>
>>>>> As a general comment, I'd prefer we don't just up timeout values for
>>>>> tests. That can cause the whole test suite run to slow down, which is
>>>>> undesirable especially if the condition in question may actually be
>>>>> satisfied in many cases much faster than the timeout value.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -Joe
>>>>>
>>>>>
>>>>> On 7/7/2016 7:01 PM, Amy Lu wrote:
>>>>>> Please review this trivial fix for
>>>>>> test:java/lang/ThreadGroup/Stop.java
>>>>>>
>>>>>> Though this is a test for a deprecated API, failed with very very low
>>>>>> frequency and hard to reproduce (I got no luck to reproduce it), I’d
>>>>>> like to patch it as suggested: extend the sleep in the main thread
>>>>>> from one second to five seconds. Also added 'volatile' to the boolean
>>>>>> variable 'groupStopped'.
>>>>>>
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8132548
>>>>>> webrev: http://cr.openjdk.java.net/~amlu/8132548/webrev.00/
>>>>>>
>>>>>> Thanks,
>>>>>> Amy
>>>>>>
>>>>>>
>>>>>> --- old/test/java/lang/ThreadGroup/Stop.java    2016-07-04
>>>>>> 14:53:59.000000000 +0800
>>>>>> +++ new/test/java/lang/ThreadGroup/Stop.java    2016-07-04
>>>>>> 14:53:58.000000000 +0800
>>>>>> @@ -1,5 +1,5 @@
>>>>>>  /*
>>>>>> - * Copyright (c) 1999, 2011, Oracle and/or its affiliates. All
>>>>>> rights reserved.
>>>>>> + * Copyright (c) 1999, 2016, Oracle and/or its affiliates. All
>>>>>> rights reserved.
>>>>>>   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>>>>>>   *
>>>>>>   * This code is free software; you can redistribute it and/or
>>>>>> modify it
>>>>>> @@ -29,7 +29,7 @@
>>>>>>   */
>>>>>>
>>>>>>  public class Stop implements Runnable {
>>>>>> -    private static boolean groupStopped = false ;
>>>>>> +    private static volatile boolean groupStopped = false ;
>>>>>>      private static final Object lock = new Object();
>>>>>>
>>>>>>      private static final ThreadGroup group = new ThreadGroup("");
>>>>>> @@ -70,7 +70,7 @@
>>>>>>              while (!groupStopped) {
>>>>>>                  lock.wait();
>>>>>>                  // Give the other thread a chance to stop
>>>>>> -                Thread.sleep(1000);
>>>>>> +                Thread.sleep(5000);
>>>>>>              }
>>>>>>          }
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>


More information about the core-libs-dev mailing list