RFR: JDK-8216528: test/jdk/java/rmi/transport/runtimeThreadInheritanceLeak/RuntimeThreadInheritanceLeak.java failing with Xcomp

Jie Fu fujie at loongson.cn
Fri Feb 1 00:51:51 UTC 2019


Thanks Roger.

On 2019/1/31 下午11:16, Roger Riggs wrote:
> Pushed
>
> On 01/30/2019 10:12 PM, Jie Fu wrote:
>> Hi Roger,
>>
>> I really hope you can still sponsor this.
>> Could you help me, please?
>> Thanks again.
>>
>> Best regards,
>> Jie
>>
>>
>> On 2019/1/31 上午10:59, David Holmes wrote:
>>> On 31/01/2019 12:44 pm, Jie Fu wrote:
>>>> Hi David,
>>>>
>>>> I prefer the original patch[1].
>>>>
>>>> Could you please sponsor this issue or help me find a sponsor.
>>>> I really appreciate it. Thank you very much.
>>>
>>> Hopefully Roger will still sponsor this.
>>>
>>> Thanks,
>>> David
>>>
>>>> Also thanks Roger. We had a pleasant discussion offlist.
>>>>
>>>> Best regards,
>>>> Jie
>>>>
>>>> [1] 
>>>> https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-January/057852.html 
>>>>
>>>>
>>>>
>>>> On 2019/1/31 上午10:09, David Holmes wrote:
>>>>> Hi Jie, Roger,
>>>>>
>>>>> I think this has now consumed far too many cycles for everyone, 
>>>>> dealing with a test that is checking for a leak that can't even 
>>>>> exist any more. Alan was fine with the original proposed patch, as 
>>>>> was I, so I think we can should proceed with that. Obviously there 
>>>>> is more than one way to tackle the Xcomp problem here, and there 
>>>>> will always be issues with any test relying on GC interaction, but 
>>>>> the proposed patch seems "good enough" to me.
>>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>> On 29/01/2019 11:46 am, Jie Fu wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I agree that the simplest way to fix the issue is just adding the 
>>>>>> reachabilityFence.
>>>>>> But this patch might also fail since the VM doesn't guarantee 
>>>>>> that a GC would be performed.
>>>>>>
>>>>>> I didn't make such patch since I've learned from Sergey and Alan 
>>>>>> that calling "System.gc()" several times is unreliable to trigger 
>>>>>> a gc[1].
>>>>>> So I still prefer the original one[2].
>>>>>>
>>>>>> Thanks a lot.
>>>>>>
>>>>>> Best regards,
>>>>>> Jie
>>>>>>
>>>>>> [1] 
>>>>>> https://mail.openjdk.java.net/pipermail/beans-dev/2019-January/000396.html 
>>>>>>
>>>>>> [2] 
>>>>>> https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-January/057852.html 
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2019/1/28 下午11:39, Roger Riggs wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> The simplest fix for this failing test is to add a call to 
>>>>>>> reachabilityFence to prevent
>>>>>>> the loader and loaderRef from going out of scope early. It 
>>>>>>> maintains getting debug
>>>>>>> output on a failure.
>>>>>>>
>>>>>>> Offlist, Jie and I explored some alternate ways to write the 
>>>>>>> test and settled on this one.
>>>>>>>
>>>>>>> Please review:
>>>>>>>
>>>>>>> diff --git 
>>>>>>> a/test/jdk/java/rmi/transport/runtimeThreadInheritanceLeak/RuntimeThreadInheritanceLeak.java 
>>>>>>> b/test/jdk/java/rmi/transport/runtimeThreadInheritanceLeak/RuntimeThreadInheritanceLeak.java 
>>>>>>>
>>>>>>> --- 
>>>>>>> a/test/jdk/java/rmi/transport/runtimeThreadInheritanceLeak/RuntimeThreadInheritanceLeak.java 
>>>>>>>
>>>>>>> +++ 
>>>>>>> b/test/jdk/java/rmi/transport/runtimeThreadInheritanceLeak/RuntimeThreadInheritanceLeak.java 
>>>>>>>
>>>>>>> @@ -113,11 +113,13 @@ public class RuntimeThreadInheritanceLea
>>>>>>>              Reference dequeued = refQueue.remove(TIMEOUT);
>>>>>>>              if (dequeued == null) {
>>>>>>>                  System.err.println(
>>>>>>> -                    "TEST FAILED: loader not deteced weakly 
>>>>>>> reachable");
>>>>>>> +                    "TEST FAILED: loader not detected weakly 
>>>>>>> reachable");
>>>>>>>                  dumpThreads();
>>>>>>>                  throw new RuntimeException(
>>>>>>>                      "TEST FAILED: loader not detected weakly 
>>>>>>> reachable");
>>>>>>>              }
>>>>>>> +            Reference.reachabilityFence(loader);
>>>>>>> +            Reference.reachabilityFence(loaderRef);
>>>>>>>
>>>>>>>              System.err.println(
>>>>>>>                  "TEST PASSED: loader detected weakly reachable");
>>>>>>>
>>>>>>> Thanks, Roger
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 01/11/2019 07:25 PM, Jie Fu wrote:
>>>>>>>> Thanks David and Roger.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019年01月12日 06:52, David Holmes wrote:
>>>>>>>>> Hi Roger,
>>>>>>>>>
>>>>>>>>> On 12/01/2019 2:22 am, Roger Riggs wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> The proposed patch changes the test in a way that is unintended.
>>>>>>>>>>
>>>>>>>>>> Adding the infinite loop of gc() and sleep, will change the 
>>>>>>>>>> timeout behavior
>>>>>>>>>> from the existing timeout of TIMEOUT to the jtreg default 
>>>>>>>>>> timeout of the whole test.
>>>>>>>>>
>>>>>>>>> Partially true. If the new loop gets stuck then yes the jtreg 
>>>>>>>>> default timeout will apply - I don't see that is necessarily a 
>>>>>>>>> bad thing. The existing timeout only applies to the 
>>>>>>>>> refQueue.remove operation itself, you don't know how much time 
>>>>>>>>> was spent before you got there, nor how much will be spent 
>>>>>>>>> after in the dumpThreads() calls - so the jtreg timeout can 
>>>>>>>>> still come into affect.
>>>>>>>>>
>>>>>>>>>> Further, it renders the check at lines 114-120 irrelevant 
>>>>>>>>>> since loaderRef.get()
>>>>>>>>>> will have returned null and the ref will have been enqueued 
>>>>>>>>>> by then.
>>>>>>>>>
>>>>>>>>> I wouldn't say irrelevant as it double-checks the interaction 
>>>>>>>>> between the ref.get() and the queue.remove() - the result of 
>>>>>>>>> one should imply the result of the other, but if enqueuing had 
>>>>>>>>> a bug ....
>>>>>>>>>
>>>>>>>>>> While it is true that calling gc() only once is unreliable, a 
>>>>>>>>>> better fix is to
>>>>>>>>>> put the code from 108-120 in a loop with a fixed number of 
>>>>>>>>>> durations
>>>>>>>>>
>>>>>>>>> That would also work - say 5 loops and reduce TIMEOUT to 4000.
>>>>>>>>>
>>>>>>>>>> and add Reachability.reachabilityFence(loaderRef) to ensure 
>>>>>>>>>> the ref is not ignored.
>>>>>>>>>
>>>>>>>>> Adding ReachabilityFence, alone, may solve the observed 
>>>>>>>>> problem given one gc() seems to be working in practice (and 
>>>>>>>>> because we don't actually have the leaked loaders anymore 
>>>>>>>>> because those threads (sun.misc.GC threads) don't exist anymore).
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> Regards, Roger
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>
>



More information about the core-libs-dev mailing list