RFR: 8328716: [TestBug] Screen capturing utility for failed tests [v3]

Tue Apr 1 20:14:37 UTC 2025

On Mon, 31 Mar 2025 16:59:31 GMT, Andy Goryachev <angorya at openjdk.org> wrote:

>> Hmm. That isn't what I thought we were doing. I thought the idea was to annotate most (if not all) screen capture tests, qualified by a system property, and enable that property in our CI test runs, so that when we do get intermittent failures, we'll be able to take a look at them.
>> 
>> We _could_ do what you propose, but in that case I wouldn't want _any_ tests annotated as part of this or any other PR. Once we have this annotation on any test in our repo, then my objection about not having it on by default needs to be addressed, at which point you might as well annotate more tests, since we have seen occasional failures on many of them.
>
> There is more than one way to sk^H^H pet the cat.
> 
> We could use a property to disable (or rather, enable) the screenshots, and only enable the capture during the debugging session. This will prevent us from catching those hard-to-reproduce tests that fail only intermittently.
> 
> The other concern is that some infrastructure misconfiguration (for example, a sudden loss of screen capture permission) would result in a log file size explosion, and I don't think we have a `tailwatch`-like facility to halt the test run when this happens.
> 
> I think the debug-only annotation is a meaningful compromise, but I would very much welcome other ideas.
> 
> (I am ok with removing the annotation from two tests that currently use it).

Yes, there are two main approaches we could take:

Option 1: A utility that is available for developers to add to one or more specific tests in a branch in their personal fork when debugging failures in those tests. We would not add it to any test in the mainline repo in this case. Since it is a developer-only utility, it isn't necessary to have a system property to enable it.

Option 2: A utility that is added to all robot tests in the repo, with a flag to enable it (off by default).

There are pros and cons to each approach. The second approach is more likely to catch a test that fails only rarely, and it would make it easy to turn the capture on when running on a new platform (e.g., when testing a new version of macOS or Ubuntu). It is also more prone to the "runaway" screenshot problem, though, unless we can come up with a good way to limit it.
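
To make the "off by default, with a limit" idea in Option 2 concrete, here is a rough sketch of a gate that a capture utility could consult before taking a screenshot. This is only an illustration, not what the PR implements: the class name `ScreenshotGate`, the property names `test.screenshot.enabled` and `test.screenshot.limit`, and the default cap of 10 are all made up for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch: decides whether a failed test may capture a screenshot.
 * Capturing is off unless a system property enables it, and the number of
 * captures per test run is capped to avoid runaway output.
 */
final class ScreenshotGate {
    // Hypothetical property names, not taken from the PR.
    static final String ENABLE_PROP = "test.screenshot.enabled";
    static final String LIMIT_PROP = "test.screenshot.limit";

    private final boolean enabled;
    private final int limit;
    private final AtomicInteger taken = new AtomicInteger();

    ScreenshotGate(boolean enabled, int limit) {
        this.enabled = enabled;
        this.limit = Math.max(0, limit);
    }

    /** Reads the (hypothetical) properties; disabled when the flag is unset. */
    static ScreenshotGate fromSystemProperties() {
        boolean enabled = Boolean.getBoolean(ENABLE_PROP); // false if unset
        int limit = Integer.getInteger(LIMIT_PROP, 10);    // assumed default cap
        return new ScreenshotGate(enabled, limit);
    }

    /** Returns true if the caller may take one more screenshot. */
    boolean tryAcquire() {
        if (!enabled) {
            return false;
        }
        // Claim a slot atomically so the count never exceeds the limit,
        // even if failures are reported from more than one thread.
        while (true) {
            int n = taken.get();
            if (n >= limit) {
                return false;
            }
            if (taken.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }
}
```

With something like this, a CI run would pass `-Dtest.screenshot.enabled=true` (again, a made-up name), and a misconfigured machine on which every test fails would produce at most the capped number of screenshots rather than one per failure.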

I left a couple of usability questions that are independent of this choice. It probably makes sense to address those first and then come back to this.

-------------

PR Review Comment: https://git.openjdk.org/jfx/pull/1746#discussion_r2023612881