JDK 9 RFR of JDK-8132548: java/lang/ThreadGroup/Stop.java fails with "RuntimeException: Failure"

Amy Lu amy.lu at oracle.com
Tue Jul 12 07:11:14 UTC 2016


Updated on the busy-loops, also removed the unnecessary statics, here 
comes the updated webrev:

http://cr.openjdk.java.net/~amlu/8132548/webrev.04/

Thanks,
Amy

On 7/12/16 2:33 PM, David Holmes wrote:
> Hi Amy,
>
> On 12/07/2016 3:28 PM, Amy Lu wrote:
>> Please help to review the updated webrev, getting rid of ThreadDeath and
>> using join():
>>
>> http://cr.openjdk.java.net/~amlu/8132548/webrev.03/
>
> I still think a simple change to the original code suffices, but given 
> you have gone this far ...
>
> Busy-loops are "bad" and unnecessary:
>
> 40         while (true) { // keep it alive
> 41         }
>
> This can be changed to a long blocking operation eg:
>
> while (true) {
>   try {
>     Thread.sleep(60000);
>   }
>   catch (InterruptedException shouldNotHappen) {
>   }
> }
>
> Thanks,
> David
>
>> Thanks,
>> Amy
>>
>> On 7/11/16 6:55 PM, David Holmes wrote:
>>>
>>>
>>> On 11/07/2016 6:14 PM, Amy Lu wrote:
>>>> Thank you David, though the email crossed-by, I hope all the concerns
>>>> have been resolved in the updated webrev:
>>>> http://cr.openjdk.java.net/~amlu/8132548/webrev.02/
>>>
>>> Sorry but no, the changes are neither necessary nor sufficient. With
>>> the new code the ThreadDeath can hit anytime after countDown(), so it
>>> can hit before you enter the try block.
>>>
>>>> I had rewrote the test with CountDownLatch and join(long millis). Also
>>>> unlike the old version, test thread 'first' and 'second' do different
>>>> things, and 'second' thread will keep alive for waiting to be 
>>>> killed by
>>>> group.stop() from 'frist' thread in the same thread group. I think 
>>>> this
>>>> could avoid the second call to second.stop() (group.stop()) issue.
>>>
>>> Simply using join() with no timeout and relying on the test framework
>>> to kill the test allows you to do away with the second stop().
>>>
>>> David
>>> -----
>>>
>>>> The time LONG_DELAY_MS used in join(long millis) is an adjusted 
>>>> timeout:
>>>>
>>>> private static final long LONG_DELAY_MS = Utils.adjustTimeout(1000L);
>>>> ...
>>>>
>>>>         second.join(LONG_DELAY_MS);
>>>>         boolean failed = second.isAlive();
>>>>
>>>> This gives second thread a more "reasonable" time to die, with taking
>>>> account of the test harness timeout.
>>>> This almost same as second.join()// test pass or timeout, but with
>>>> additional chance of killing all threads in a bad case.
>>>>
>>>> Thanks,
>>>> Amy
>>>>
>>>> On 7/11/16 3:25 PM, David Holmes wrote:
>>>>> Simplification ...
>>>>>
>>>>> On 11/07/2016 5:12 PM, David Holmes wrote:
>>>>>> Hi Amy,
>>>>>>
>>>>>> Thanks for tackling this.
>>>>>>
>>>>>> On 8/07/2016 4:01 PM, Amy Lu wrote:
>>>>>>> Thank you Joe for your review.
>>>>>>>
>>>>>>> The intent is to give it more chance "for the thread group stop 
>>>>>>> to be
>>>>>>> issued", not to extend the whole test execution timeout.
>>>>>>>
>>>>>>> I updated the webrev to make this in a retry, limit to 5 times of
>>>>>>> retry:
>>>>>>> http://cr.openjdk.java.net/~amlu/8132548/webrev.01/
>>>>>>
>>>>>> The retry serves no purpose here. groupStopped is being set to true
>>>>>> and
>>>>>> the waiting thread has been notified, so the loop will terminate 
>>>>>> after
>>>>>> the first sleep. The failure happens when the main thread checks the
>>>>>> isAlive() status of the second thread before the ThreadGroup.stop 
>>>>>> has
>>>>>> actually had a chance to stop and terminate it - such that 
>>>>>> isAlive is
>>>>>> now false. That is why I suggested waiting a bit longer by
>>>>>> extending the
>>>>>> sleep.
>>>>>>
>>>>>> I agree that making the test last at least 5 seconds is not ideal,
>>>>>> but I
>>>>>> didn't think it would be an issue in the big picture of testing.
>>>>>>
>>>>>> Ideally explicit "synchronization" is better than sleeps but that
>>>>>> would
>>>>>> again be missing the point with this test. We expect the thread to
>>>>>> terminate, if it hasn't terminated in a "reasonable" time we 
>>>>>> consider
>>>>>> the stop() to have failed and the test to fail. To that end we could
>>>>>> remove the sleep altogether and change:
>>>>>>
>>>>>> boolean failed = second.isAlive();
>>>>>>
>>>>>> to
>>>>>>
>>>>>> try {
>>>>>>   second.join(1000);
>>>>>> } catch (InterruptedException shouldNotHappen) {}
>>>>>> boolean failed = second.isAlive();
>>>>>>
>>>>>> Now we use explicit event synchronization - great! But the test has
>>>>>> the
>>>>>> same failure issue now as it did with the sleep. Putting in a
>>>>>> CountDownLatch would have exactly the same problem: we expect the
>>>>>> second
>>>>>> thread to signal the latch as it terminates, but if that doesn't
>>>>>> happen
>>>>>> within a "reasonable" amount of time, then we deem the stop() to 
>>>>>> have
>>>>>> failed and the test to have failed.
>>>>>>
>>>>>> Also note that the more code we add the more likely the second 
>>>>>> call to
>>>>>> second.stop() triggers an async exception in code that can't 
>>>>>> handle it
>>>>>> and results in an even worse failure mode!
>>>>>>
>>>>>> The only thing I can suggest is to get rid of the explicit sleep (or
>>>>>> join, or latch.await() etc) and simply recode as an infinite loop 
>>>>>> and
>>>>>> rely on the test harness to tear things down when the overall test
>>>>>> timeout kicks in. That way the test either passes or is timed-out,
>>>>>> there
>>>>>> is no explicit failure - but a busy loop is also bad so you would
>>>>>> want a
>>>>>> small sleep in there anyway:
>>>>>>
>>>>>> while (second.isAlive()) {
>>>>>>   Thread.sleep(10);
>>>>>> }
>>>>>> // Test passed - if we never get here the test times out and
>>>>>> // we implicitly fail.
>>>>>
>>>>> Of course that was silly all you need is:
>>>>>
>>>>> second.join();
>>>>> // Test passed - if we never get here the test times out and
>>>>> // we implicitly fail.
>>>>>
>>>>> David
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> Thanks,
>>>>>>> Amy
>>>>>>>
>>>>>>> On 7/8/16 12:15 PM, joe darcy wrote:
>>>>>>>> Hi Amy,
>>>>>>>>
>>>>>>>> I'm a bit uncomfortable with the fix as-is.
>>>>>>>>
>>>>>>>> Rather than hard-coding sleep values, if sleep values are needed I
>>>>>>>> think it is a better practice to use ones that are scaled with the
>>>>>>>> jtreg timeout factors, etc. used to run the tests. Please instead
>>>>>>>> use
>>>>>>>> something like the adjustTimeout method of
>>>>>>>>
>>>>>>>> $JDK_FOREST_ROOT/test/lib/share/classes/jdk/test/lib/Utils
>>>>>>>>
>>>>>>>> As a general comment, I'd prefer we don't just up timeout 
>>>>>>>> values for
>>>>>>>> tests. That can cause the whole test suite run to slow down,
>>>>>>>> which is
>>>>>>>> undesirable especially if the condition in question may 
>>>>>>>> actually be
>>>>>>>> satisfied in many cases much faster than the timeout value.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> -Joe
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/7/2016 7:01 PM, Amy Lu wrote:
>>>>>>>>> Please review this trivial fix for
>>>>>>>>> test:java/lang/ThreadGroup/Stop.java
>>>>>>>>>
>>>>>>>>> Though this is a test for a deprecated API, failed with very
>>>>>>>>> very low
>>>>>>>>> frequency and hard to reproduce (I got no luck to reproduce it),
>>>>>>>>> I’d
>>>>>>>>> like to patch it as suggested: extend the sleep in the main 
>>>>>>>>> thread
>>>>>>>>> from one second to five seconds. Also added 'volatile' to the
>>>>>>>>> boolean
>>>>>>>>> variable 'groupStopped'.
>>>>>>>>>
>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8132548
>>>>>>>>> webrev: http://cr.openjdk.java.net/~amlu/8132548/webrev.00/
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Amy
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --- old/test/java/lang/ThreadGroup/Stop.java 2016-07-04
>>>>>>>>> 14:53:59.000000000 +0800
>>>>>>>>> +++ new/test/java/lang/ThreadGroup/Stop.java 2016-07-04
>>>>>>>>> 14:53:58.000000000 +0800
>>>>>>>>> @@ -1,5 +1,5 @@
>>>>>>>>>  /*
>>>>>>>>> - * Copyright (c) 1999, 2011, Oracle and/or its affiliates. All
>>>>>>>>> rights reserved.
>>>>>>>>> + * Copyright (c) 1999, 2016, Oracle and/or its affiliates. All
>>>>>>>>> rights reserved.
>>>>>>>>>   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>>>>>>>>>   *
>>>>>>>>>   * This code is free software; you can redistribute it and/or
>>>>>>>>> modify it
>>>>>>>>> @@ -29,7 +29,7 @@
>>>>>>>>>   */
>>>>>>>>>
>>>>>>>>>  public class Stop implements Runnable {
>>>>>>>>> -    private static boolean groupStopped = false ;
>>>>>>>>> +    private static volatile boolean groupStopped = false ;
>>>>>>>>>      private static final Object lock = new Object();
>>>>>>>>>
>>>>>>>>>      private static final ThreadGroup group = new 
>>>>>>>>> ThreadGroup("");
>>>>>>>>> @@ -70,7 +70,7 @@
>>>>>>>>>              while (!groupStopped) {
>>>>>>>>>                  lock.wait();
>>>>>>>>>                  // Give the other thread a chance to stop
>>>>>>>>> -                Thread.sleep(1000);
>>>>>>>>> +                Thread.sleep(5000);
>>>>>>>>>              }
>>>>>>>>>          }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>>



More information about the core-libs-dev mailing list