JDK 9 RFR of JDK-8132548: java/lang/ThreadGroup/Stop.java fails with "RuntimeException: Failure"

David Holmes david.holmes at oracle.com
Tue Jul 12 10:03:05 UTC 2016


This looks fine to me.

Thanks,
David

On 12/07/2016 5:11 PM, Amy Lu wrote:
> Updated on the busy-loops, also removed the unnecessary statics, here
> comes the updated webrev:
>
> http://cr.openjdk.java.net/~amlu/8132548/webrev.04/
>
> Thanks,
> Amy
>
> On 7/12/16 2:33 PM, David Holmes wrote:
>> Hi Amy,
>>
>> On 12/07/2016 3:28 PM, Amy Lu wrote:
>>> Please help to review the updated webrev, getting rid of ThreadDeath and
>>> using join():
>>>
>>> http://cr.openjdk.java.net/~amlu/8132548/webrev.03/
>>
>> I still think a simple change to the original code suffices, but given
>> you have gone this far ...
>>
>> Busy-loops are "bad" and unnecessary:
>>
>> 40         while (true) { // keep it alive
>> 41         }
>>
>> This can be changed to a long blocking operation eg:
>>
>> while (true) {
>>   try {
>>     Thread.sleep(60000);
>>   }
>>   catch (InterruptedException shouldNotHappen) {
>>   }
>> }
>>
>> Thanks,
>> David
>>
>>> Thanks,
>>> Amy
>>>
>>> On 7/11/16 6:55 PM, David Holmes wrote:
>>>>
>>>>
>>>> On 11/07/2016 6:14 PM, Amy Lu wrote:
>>>>> Thank you David, though the email crossed-by, I hope all the concerns
>>>>> have been resolved in the updated webrev:
>>>>> http://cr.openjdk.java.net/~amlu/8132548/webrev.02/
>>>>
>>>> Sorry but no, the changes are neither necessary nor sufficient. With
>>>> the new code the ThreadDeath can hit anytime after countDown(), so it
>>>> can hit before you enter the try block.
>>>>
>>>>> I had rewrote the test with CountDownLatch and join(long millis). Also
>>>>> unlike the old version, test thread 'first' and 'second' do different
>>>>> things, and 'second' thread will keep alive for waiting to be
>>>>> killed by
>>>>> group.stop() from 'frist' thread in the same thread group. I think
>>>>> this
>>>>> could avoid the second call to second.stop() (group.stop()) issue.
>>>>
>>>> Simply using join() with no timeout and relying on the test framework
>>>> to kill the test allows you to do away with the second stop().
>>>>
>>>> David
>>>> -----
>>>>
>>>>> The time LONG_DELAY_MS used in join(long millis) is an adjusted
>>>>> timeout:
>>>>>
>>>>> private static final long LONG_DELAY_MS = Utils.adjustTimeout(1000L);
>>>>> ...
>>>>>
>>>>>         second.join(LONG_DELAY_MS);
>>>>>         boolean failed = second.isAlive();
>>>>>
>>>>> This gives second thread a more "reasonable" time to die, with taking
>>>>> account of the test harness timeout.
>>>>> This almost same as second.join()// test pass or timeout, but with
>>>>> additional chance of killing all threads in a bad case.
>>>>>
>>>>> Thanks,
>>>>> Amy
>>>>>
>>>>> On 7/11/16 3:25 PM, David Holmes wrote:
>>>>>> Simplification ...
>>>>>>
>>>>>> On 11/07/2016 5:12 PM, David Holmes wrote:
>>>>>>> Hi Amy,
>>>>>>>
>>>>>>> Thanks for tackling this.
>>>>>>>
>>>>>>> On 8/07/2016 4:01 PM, Amy Lu wrote:
>>>>>>>> Thank you Joe for your review.
>>>>>>>>
>>>>>>>> The intent is to give it more chance "for the thread group stop
>>>>>>>> to be
>>>>>>>> issued", not to extend the whole test execution timeout.
>>>>>>>>
>>>>>>>> I updated the webrev to make this in a retry, limit to 5 times of
>>>>>>>> retry:
>>>>>>>> http://cr.openjdk.java.net/~amlu/8132548/webrev.01/
>>>>>>>
>>>>>>> The retry serves no purpose here. groupStopped is being set to true
>>>>>>> and
>>>>>>> the waiting thread has been notified, so the loop will terminate
>>>>>>> after
>>>>>>> the first sleep. The failure happens when the main thread checks the
>>>>>>> isAlive() status of the second thread before the ThreadGroup.stop
>>>>>>> has
>>>>>>> actually had a chance to stop and terminate it - such that
>>>>>>> isAlive is
>>>>>>> now false. That is why I suggested waiting a bit longer by
>>>>>>> extending the
>>>>>>> sleep.
>>>>>>>
>>>>>>> I agree that making the test last at least 5 seconds is not ideal,
>>>>>>> but I
>>>>>>> didn't think it would be an issue in the big picture of testing.
>>>>>>>
>>>>>>> Ideally explicit "synchronization" is better than sleeps but that
>>>>>>> would
>>>>>>> again be missing the point with this test. We expect the thread to
>>>>>>> terminate, if it hasn't terminated in a "reasonable" time we
>>>>>>> consider
>>>>>>> the stop() to have failed and the test to fail. To that end we could
>>>>>>> remove the sleep altogether and change:
>>>>>>>
>>>>>>> boolean failed = second.isAlive();
>>>>>>>
>>>>>>> to
>>>>>>>
>>>>>>> try {
>>>>>>>   second.join(1000);
>>>>>>> } catch (InterruptedException shouldNotHappen) {}
>>>>>>> boolean failed = second.isAlive();
>>>>>>>
>>>>>>> Now we use explicit event synchronization - great! But the test has
>>>>>>> the
>>>>>>> same failure issue now as it did with the sleep. Putting in a
>>>>>>> CountDownLatch would have exactly the same problem: we expect the
>>>>>>> second
>>>>>>> thread to signal the latch as it terminates, but if that doesn't
>>>>>>> happen
>>>>>>> within a "reasonable" amount of time, then we deem the stop() to
>>>>>>> have
>>>>>>> failed and the test to have failed.
>>>>>>>
>>>>>>> Also note that the more code we add the more likely the second
>>>>>>> call to
>>>>>>> second.stop() triggers an async exception in code that can't
>>>>>>> handle it
>>>>>>> and results in an even worse failure mode!
>>>>>>>
>>>>>>> The only thing I can suggest is to get rid of the explicit sleep (or
>>>>>>> join, or latch.await() etc) and simply recode as an infinite loop
>>>>>>> and
>>>>>>> rely on the test harness to tear things down when the overall test
>>>>>>> timeout kicks in. That way the test either passes or is timed-out,
>>>>>>> there
>>>>>>> is no explicit failure - but a busy loop is also bad so you would
>>>>>>> want a
>>>>>>> small sleep in there anyway:
>>>>>>>
>>>>>>> while (second.isAlive()) {
>>>>>>>   Thread.sleep(10);
>>>>>>> }
>>>>>>> // Test passed - if we never get here the test times out and
>>>>>>> // we implicitly fail.
>>>>>>
>>>>>> Of course that was silly all you need is:
>>>>>>
>>>>>> second.join();
>>>>>> // Test passed - if we never get here the test times out and
>>>>>> // we implicitly fail.
>>>>>>
>>>>>> David
>>>>>>
>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Amy
>>>>>>>>
>>>>>>>> On 7/8/16 12:15 PM, joe darcy wrote:
>>>>>>>>> Hi Amy,
>>>>>>>>>
>>>>>>>>> I'm a bit uncomfortable with the fix as-is.
>>>>>>>>>
>>>>>>>>> Rather than hard-coding sleep values, if sleep values are needed I
>>>>>>>>> think it is a better practice to use ones that are scaled with the
>>>>>>>>> jtreg timeout factors, etc. used to run the tests. Please instead
>>>>>>>>> use
>>>>>>>>> something like the adjustTimeout method of
>>>>>>>>>
>>>>>>>>> $JDK_FOREST_ROOT/test/lib/share/classes/jdk/test/lib/Utils
>>>>>>>>>
>>>>>>>>> As a general comment, I'd prefer we don't just up timeout
>>>>>>>>> values for
>>>>>>>>> tests. That can cause the whole test suite run to slow down,
>>>>>>>>> which is
>>>>>>>>> undesirable especially if the condition in question may
>>>>>>>>> actually be
>>>>>>>>> satisfied in many cases much faster than the timeout value.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> -Joe
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 7/7/2016 7:01 PM, Amy Lu wrote:
>>>>>>>>>> Please review this trivial fix for
>>>>>>>>>> test:java/lang/ThreadGroup/Stop.java
>>>>>>>>>>
>>>>>>>>>> Though this is a test for a deprecated API, failed with very
>>>>>>>>>> very low
>>>>>>>>>> frequency and hard to reproduce (I got no luck to reproduce it),
>>>>>>>>>> I’d
>>>>>>>>>> like to patch it as suggested: extend the sleep in the main
>>>>>>>>>> thread
>>>>>>>>>> from one second to five seconds. Also added 'volatile' to the
>>>>>>>>>> boolean
>>>>>>>>>> variable 'groupStopped'.
>>>>>>>>>>
>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8132548
>>>>>>>>>> webrev: http://cr.openjdk.java.net/~amlu/8132548/webrev.00/
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Amy
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --- old/test/java/lang/ThreadGroup/Stop.java 2016-07-04
>>>>>>>>>> 14:53:59.000000000 +0800
>>>>>>>>>> +++ new/test/java/lang/ThreadGroup/Stop.java 2016-07-04
>>>>>>>>>> 14:53:58.000000000 +0800
>>>>>>>>>> @@ -1,5 +1,5 @@
>>>>>>>>>>  /*
>>>>>>>>>> - * Copyright (c) 1999, 2011, Oracle and/or its affiliates. All
>>>>>>>>>> rights reserved.
>>>>>>>>>> + * Copyright (c) 1999, 2016, Oracle and/or its affiliates. All
>>>>>>>>>> rights reserved.
>>>>>>>>>>   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>>>>>>>>>>   *
>>>>>>>>>>   * This code is free software; you can redistribute it and/or
>>>>>>>>>> modify it
>>>>>>>>>> @@ -29,7 +29,7 @@
>>>>>>>>>>   */
>>>>>>>>>>
>>>>>>>>>>  public class Stop implements Runnable {
>>>>>>>>>> -    private static boolean groupStopped = false ;
>>>>>>>>>> +    private static volatile boolean groupStopped = false ;
>>>>>>>>>>      private static final Object lock = new Object();
>>>>>>>>>>
>>>>>>>>>>      private static final ThreadGroup group = new
>>>>>>>>>> ThreadGroup("");
>>>>>>>>>> @@ -70,7 +70,7 @@
>>>>>>>>>>              while (!groupStopped) {
>>>>>>>>>>                  lock.wait();
>>>>>>>>>>                  // Give the other thread a chance to stop
>>>>>>>>>> -                Thread.sleep(1000);
>>>>>>>>>> +                Thread.sleep(5000);
>>>>>>>>>>              }
>>>>>>>>>>          }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>
>>>
>


More information about the core-libs-dev mailing list