RFR: 7903217: jtreg could try killing descendants of stuck test, before timing out the test [v4]

Gerard Ziemski gziemski at openjdk.org
Mon Nov 28 16:50:17 UTC 2022


On Fri, 18 Nov 2022 06:40:37 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> Hi Gerard,
> 
> What you are trying to do is useful and appreciated. I often was missing more info. But I'm unsure too if handling this at jtreg level is the best thing. But I see two sides here and therefore keep out of the discussion.
> 
> But another thing, in order for this to be useful, we would need thread dumps from hanging children too, if the children happen to be JVMs. Just hoping that abort(3) will nudge the children enough to vomit some output will not often work. E.g. if jcmd hangs, it is usually innocent: it waits for an answer from the attachee, and that one is stuck. It would be perfectly able to react to a thread dump and tell me as much.
> 
> So, before killing them, send each of them a SIGQUIT to get thread dumps and give them a bit of time to respond. And that raises more questions. If you do this, especially wholesale for all children, you could absolutely flood the jtr files with thread dumps from children, and analyzing them gets really confusing.
> 
> Not sure what a good solution could be. Let's see what others think.
> 
> Cheers, Thomas

Thank you Thomas for your feedback.

I really like your idea of sending SIGQUIT to the child processes before force-quitting them. We can also kill them intelligently to try to avoid leaving zombies behind.
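
To make the idea concrete, here is a rough sketch of the "SIGQUIT first, then force-kill" sequence. It is not the code in the PR. The names DescendantReaper, sendSigquit, reap and graceMillis are made up for illustration. It assumes a POSIX system with a `kill` utility on the PATH, because as far as I know the Java API cannot send an arbitrary signal and so the sketch shells out. Note that SIGQUIT only produces a thread dump from a HotSpot JVM; by default it terminates a non-JVM child.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

// Hypothetical helper, not part of jtreg.
final class DescendantReaper {

    // Ask one process for a thread dump by sending SIGQUIT.
    // Assumption: a POSIX `kill` utility is available on the PATH.
    static boolean sendSigquit(ProcessHandle p) {
        try {
            Process kill = new ProcessBuilder("kill", "-QUIT", Long.toString(p.pid()))
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .redirectError(ProcessBuilder.Redirect.DISCARD)
                    .start();
            return kill.waitFor(5, TimeUnit.SECONDS) && kill.exitValue() == 0;
        } catch (IOException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Send SIGQUIT to every descendant of root, wait graceMillis so that
    // JVMs can print their thread dumps, then force-kill whatever is left.
    // Returns the number of descendants that had to be force-killed.
    static int reap(ProcessHandle root, long graceMillis) throws InterruptedException {
        // Snapshot first, so the helper `kill` processes are not included.
        List<ProcessHandle> descendants = root.descendants().collect(Collectors.toList());
        for (ProcessHandle p : descendants) {
            if (p.isAlive()) {
                sendSigquit(p);
            }
        }
        Thread.sleep(graceMillis);
        int killed = 0;
        for (ProcessHandle p : descendants) {
            if (p.isAlive() && p.destroyForcibly()) {
                killed++;
            }
        }
        return killed;
    }
}
```

A caller would do something like DescendantReaper.reap(testProcess.toHandle(), 2000) shortly before the timeout handler runs, where testProcess is the stuck test's Process. The sketch does nothing special about kill order or zombies; that part would still need to be worked out.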

There has been no progress on many (all?) of the related bugs that I listed earlier. My hope is that logging more info from all the processes involved will lead to a breakthrough that unblocks the analysis. I believe more info is the key to those issues, so if the cost is some extra noise, I think that is a fair price to pay.

I'm not sure what you mean by the "flooding the jtr" issue. Do you mean that jtr log files that grow too long get trimmed? I really dislike it when that happens. I'm unsure when we decided to do this, but nowadays disk space should be plentiful enough to keep full log files without cutting them down, I would hope.

-------------

PR: https://git.openjdk.org/jtreg/pull/97


More information about the jtreg-dev mailing list