RFR: 8230781: Add JTREG_FAILURE_HANDLER_TIMEOUT to control timeout handler timeout

Leonid Mesnik leonid.mesnik at oracle.com
Tue Sep 10 02:42:05 UTC 2019



> On Sep 9, 2019, at 5:55 PM, David Holmes <david.holmes at oracle.com> wrote:
> 
> On 10/09/2019 10:43 am, Leonid Mesnik wrote:
>> Hi
>> On 9/9/19 5:12 PM, David Holmes wrote:
>>> Hi Leonid,
>>> 
>>> On 10/09/2019 10:03 am, Leonid Mesnik wrote:
>>>> Hi
>>>> 
>>>> Could you please review the following fix, which adds a JTREG_FAILURE_HANDLER_TIMEOUT option to customize the timeout handler's timeout.
>>>> 
>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8230781/webrev.00/
>>> 
>>> In terms of adding the flag the change itself appears okay to me.
>>> 
>>> But what exactly is this timeout-handler timeout? We have enough issues with the timeout-handler running in response to test timeouts, let alone an additional timeout on top of that.
>>> 
>> This timeout defines how long jtreg should wait for the timeout handler to complete. It helps to identify whether the timeout handler hangs during processing. It also helps jtreg complete in a guaranteed time. By default this feature is disabled. I want to enable it for some stress tests.
>> http://openjdk.java.net/jtreg/faq.html#what-do-i-need-to-know-about-test-timeouts 
> 
> Okay but what exactly happens with this timeout processing? Let's say the timeout handler is trying to run jstack and has hung, and this new timeout elapses - what happens? Will the attempt to execute jstack abort, or will something try to kill it? Can we end up with an orphaned hung jstack process on the machine?
The exact process of test timeout handling is:
When a test times out, jtreg starts the timeout handler in a separate thread and waits for its completion.
The current implementation of the timeout handler runs a bunch of tools as separate processes. Each tool is executed with its own timeout and should be killed if it hangs.
This "timeout handler" timeout doesn't kill any processes by itself. It only reports an error and continues execution. Indeed, such behavior might leave stray processes and might look useless to some.
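
The waiting behavior described above can be illustrated with a minimal sketch. This is not jtreg's actual code; the class HandlerTimeoutSketch, the method runWithTimeout, and the thread name are hypothetical and exist only for illustration. The sketch runs a handler in a separate thread, waits a bounded time for it, and on expiry only reports an error and continues, without killing anything.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; jtreg's real implementation may differ.
public class HandlerTimeoutSketch {

    /**
     * Runs the handler in a separate thread and waits at most timeoutMillis.
     * Returns true if the handler finished in time, false otherwise.
     */
    public static boolean runWithTimeout(Runnable handler, long timeoutMillis)
            throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "timeout-handler");
            t.setDaemon(true); // a stuck handler must not keep the JVM alive
            return t;
        });
        Future<?> result = executor.submit(handler);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Report only: nothing is interrupted or killed here, so a hung
            // handler (and any tool it started) may be left behind.
            System.err.println("timeout handler did not finish in "
                    + timeoutMillis + " ms; continuing");
            return false;
        } finally {
            executor.shutdown(); // stop accepting work; does not interrupt
        }
    }

    public static void main(String[] args) throws Exception {
        // A handler that completes quickly.
        boolean fast = runWithTimeout(() -> { }, 1_000);
        // A handler that simulates a hang.
        boolean hung = runWithTimeout(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // not expected in this sketch
            }
        }, 200);
        System.out.println("fast handler completed: " + fast);
        System.out.println("hung handler completed: " + hung);
    }
}
```

In this sketch the caller gets control back after the timeout even though the handler thread is still running, which mirrors the trade-off above: execution completes, but cleanup of whatever is left behind has to happen elsewhere.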

However, this "timeout handler" timeout allows jtreg to complete test execution if the timeout handler gets stuck. I am going to use it for stress test execution where each task contains only a single test. The task then completes with a test failure instead of failing as a whole because of a task timeout. As you know, our task execution system doesn't provide any test results if the task failed. So this is needed to somehow complete jtreg execution and get all results uploaded.

Also, it is the responsibility of the task execution system to clean hosts between tasks.

Basically, this change is needed only to complete jtreg execution and get test failures instead of a hung execution with a task failure.

Leonid
              
> 
> These are really jtreg questions though :)
> 
> Cheers,
> David
> 
>> Leonid
>>> Thanks,
>>> David
>>> 
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8230781
>>>> 
>>>> I verified fix locally and with mach5.
>>>> 
>>>> Leonid




More information about the build-dev mailing list