RFR (S): 8220451: jdi/EventQueue/remove/remove004 failed due to "ERROR: thread2 is not alive"

Daniel D. Daugherty daniel.daugherty at oracle.com
Thu Mar 21 18:29:03 UTC 2019


On 3/21/19 2:02 PM, serguei.spitsyn at oracle.com wrote:
> On 3/21/19 09:17, Daniel D. Daugherty wrote:
>> On 3/21/19 2:58 AM, Nick Gasson wrote:
>>> Hi,
>>>
>>> Please review this small fix to a bug that causes the following 
>>> tests to fail when run with jtreg -timeoutFactor > 10 after the 
>>> changes in 8207367:
>>>
>>> vmTestbase/nsk/jdi/EventQueue/remove_l/remove_l004/TestDescription.java
>>> vmTestbase/nsk/jdi/EventQueue/remove/remove004/TestDescription.java
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8220451
>>> Webrev: http://cr.openjdk.java.net/~ngasson/8220451/webrev.0
>>
>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventQueue/remove/remove004/TestDescription.java 
>>
>>     No comments.
>>
>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventQueue/remove_l/remove_l004/TestDescription.java 
>>
>>     No comments.
>>
>> Thumbs up. You should wait to hear from someone on the current
>> Serviceability team for your second review.
>>
>> JDK-8220451 tracks the failure in EventQueue/remove/remove004.
>> JDK-8220456 tracks the failure in EventQueue/remove_l/remove_l004.
>>
>> You can list both bug IDs in the same changeset like this:
>>
>> 8220451: jdi/EventQueue/remove/remove004 failed due to "ERROR: 
>> thread2 is not alive"
>> 8220456: jdi/EventQueue/remove_l/remove_l004 failed due to "TIMEOUT 
>> while waiting for event"
>>
>> or you can use 8220451 and close 8220456 as a duplicate of 8220451.
>
>
> The 8220456 is assigned to Gary.

Yup.

> We can leave it up to Gary to close this issue (or not).

The reason we have two bugs is that the failure modes are
different so the same root cause is not obvious to the
casual log reader. I do recommend listing both bugs in
the same changeset so both bugs show up as fixed.


>
> Also, I've checked that both bugs are not problem listed.

Correct. I filed these based on testing in my lab. I don't
think anyone doing CI analysis has spotted them (other than me).

Dan


>
> Thanks,
> Serguei
>
>
>>
>> Thanks for fixing this issue.
>>
>> Dan
>>
>>
>>>
>>> This test creates a debugee process that sleeps for 5 * 
>>> timeoutFactor * 10000 ms or until it is signalled to stop, and the 
>>> parent sleeps for 5 * timeoutFactor * 1000 ms then signals the child 
>>> and checks no unexpected events were received in that time. However 
>>> the jtreg timeout factor is not passed from parent to debugee so the 
>>> debugee uses the default value 1.0. So if the jtreg timeout factor 
>>> is 12 the parent will sleep for 5 * 12 * 1000 = 60000 ms and the 
>>> debugee will sleep for 5 * 1 * 10000 = 50000 ms and exit before the 
>>> parent wakes up. The debugee exiting causes an unexpected event and 
>>> the test fails.
>>>
>>> Fix by passing the timeout factor system property to the debugee.
>>>
>>> Tested with `make test TEST="vmTestbase/nsk/jdi/EventQueue" 
>>> JTREG="TIMEOUT_FACTOR=12"' and JTREG="TIMEOUT_FACTOR=4".
>>>
>>> Thanks,
>>> Nick
>>>
>>
>
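
For anyone skimming the thread later, the timing arithmetic in Nick's
description can be sketched roughly as below. This is only an
illustration of the two formulas quoted above, not the test's actual
code: the class and method names are made up for this sketch, and the
"unfixed" case simply assumes the debuggee falls back to a factor of 1.0
as described.

```java
public class TimeoutFactorSketch {
    // Parent (debugger side) wait, per the description: 5 * timeoutFactor * 1000 ms.
    public static long parentSleepMs(double timeoutFactor) {
        return (long) (5 * timeoutFactor * 1000);
    }

    // Debuggee sleep, per the description: 5 * timeoutFactor * 10000 ms.
    public static long debuggeeSleepMs(double timeoutFactor) {
        return (long) (5 * timeoutFactor * 10000);
    }

    // True if the debuggee would finish sleeping (and exit) before the parent wakes up.
    public static boolean debuggeeExitsEarly(double parentFactor, double debuggeeFactor) {
        return debuggeeSleepMs(debuggeeFactor) < parentSleepMs(parentFactor);
    }

    public static void main(String[] args) {
        double[] factors = {4.0, 10.0, 12.0};
        for (double f : factors) {
            // Unfixed: the debuggee never sees the factor and uses the default 1.0.
            boolean unfixed = debuggeeExitsEarly(f, 1.0);
            // Fixed: the same factor is passed through to the debuggee.
            boolean fixed = debuggeeExitsEarly(f, f);
            System.out.println("factor=" + f
                + " parent=" + parentSleepMs(f) + " ms"
                + " debuggee(default)=" + debuggeeSleepMs(1.0) + " ms"
                + " earlyExit(unfixed)=" + unfixed
                + " earlyExit(fixed)=" + fixed);
        }
    }
}
```

With the debuggee stuck at 1.0 it sleeps 50000 ms, so the parent only
outlasts it once the parent's factor goes above 10, which matches the
"-timeoutFactor > 10" condition in the original report. When both sides
use the same factor the debuggee always sleeps ten times longer than the
parent.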


