RFR: 8209595: MonitorVmStartTerminate.java timed out

Kevin Walls kevinw at openjdk.org
Fri Oct 20 13:30:54 UTC 2023


On Fri, 6 Oct 2023 19:10:50 GMT, Kevin Walls <kevinw at openjdk.org> wrote:

> From studying test failures, it looks like the way the test identifies its related processes is failing.
> It checks the mainArgs of a process by attaching, and it looks like it occasionally fails to get a valid match.  The hasMainArgs method ignores exceptions because it expects some: it will also probe unrelated Java processes that happen to start.
> 
> It should retry this main args check on failure, but not so many times that it becomes a burden on other valid unrelated processes, and it should also log the PIDs that have an issue so we can see whether this plays a part in any future failure.
> 
> Also other small logging changes, so that progress through the test is easier to follow.

Eventually I did reproduce a further failure with these test changes, where hasMainArgs is the issue.
For a valid test pid, we got the main args and they did not match, although the additional logging shows that they SHOULD have.

MonitoredVmUtil.mainArgs(target) can return "Unknown" or null, so we need to handle this and not presume that the PID is NOT a test process.  We should retry the main args fetch if (monitoredArgs == null || monitoredArgs.equals("Unknown")).
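
A rough sketch of that retry, not the actual test code: the Supplier stands in for the MonitoredVmUtil.mainArgs(target) call so the example runs without attaching to a VM, and the class name, MAX_ATTEMPTS, the Match enum and the contains() comparison are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

class MainArgsRetry {
    /** INCONCLUSIVE means we never got usable main args, so we cannot rule the PID out. */
    enum Match { MATCH, NO_MATCH, INCONCLUSIVE }

    // Hypothetical limit: small, so unrelated processes are not probed repeatedly.
    static final int MAX_ATTEMPTS = 3;

    /**
     * fetchMainArgs is a stand-in for MonitoredVmUtil.mainArgs(target).
     * expected is the marker string the test looks for (assumed to be a substring match).
     */
    static Match checkMainArgs(Supplier<String> fetchMainArgs, String expected) {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            String monitoredArgs = fetchMainArgs.get();
            if (monitoredArgs == null || monitoredArgs.equals("Unknown")) {
                // No usable answer: retry rather than treating it as a mismatch.
                System.out.println("Attempt " + attempt + ": main args not available");
                continue;
            }
            return monitoredArgs.contains(expected) ? Match.MATCH : Match.NO_MATCH;
        }
        return Match.INCONCLUSIVE;
    }

    public static void main(String[] args) {
        // Simulated fetches: null, then "Unknown", then the real main args.
        Iterator<String> replies = Arrays.asList(null, "Unknown", "marker-1234").iterator();
        System.out.println(checkMainArgs(replies::next, "marker"));
    }
}
```

Returning a separate INCONCLUSIVE value keeps "could not read the args" apart from "read them and they differ", so the caller can log the PID instead of silently discarding it.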

Also, takeNap and the 100ms delay:
This can thrash and fill the logs with 10,000 lines of messages when failing.
Maybe it was kept short to reduce latency, but that does not seem critical.  Make it longer.
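
To illustrate the point about the delay, here is a minimal polling sketch. It is not the test's takeNap; pollUntil, NAP_MILLIS and the log message are hypothetical. One message is logged per poll, so the nap length directly bounds how many lines a failing run can emit.

```java
import java.util.function.BooleanSupplier;

class NapPolling {
    // Hypothetical value: longer than 100 ms, so far fewer polls (and log lines) per run.
    static final long NAP_MILLIS = 1_000;

    /**
     * Polls until done reports true or maxPolls is reached.
     * Returns the number of polls taken, or -1 on giving up or interruption.
     */
    static int pollUntil(BooleanSupplier done, long napMillis, int maxPolls) {
        for (int polls = 1; polls <= maxPolls; polls++) {
            if (done.getAsBoolean()) {
                return polls;
            }
            System.out.println("Poll " + polls + ": not ready, napping " + napMillis + " ms");
            try {
                Thread.sleep(napMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Condition becomes true after roughly 2.5 seconds.
        int polls = pollUntil(() -> System.currentTimeMillis() - start > 2_500, NAP_MILLIS, 10);
        System.out.println("Finished after " + polls + " polls");
    }
}
```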

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16077#issuecomment-1772494516

