RFR: 7903580: Allow for re-attempting agent creation when an attempt fails [v2]

Mon Nov 13 18:51:31 UTC 2023

On Mon, 13 Nov 2023 14:02:51 GMT, Jaikiran Pai <jpai at openjdk.org> wrote:

>> Jaikiran Pai has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   update the tests to match the log message change
>
> src/share/classes/com/sun/javatest/regtest/exec/RegressionScript.java line 1241:
> 
>> 1239:                     p.log("Re-attempting agent creation, attempt number " + i);
>> 1240:                 }
>> 1241:                 Agent agent = p.getAgent(absTestScratchDir().toFile(), jdk, vmOpts.toList(),
> 
> I initially wanted to do this re-attempt logic within the `Pool.getAgent(...)` method, but the way the pool is implemented, its methods are all `synchronized`. The `Agent` instance creation which happens in `pool.getAgent(...)` involves a socket connection attempt (which can wait for a long time). This effectively is a stop-the-world operation for the jtreg process, because every other method on the pool, such as returning an agent from an already completed test (action), stalls until this method returns. The synchronization constructs in the pool aren't something introduced in this PR; they are already present (and may need some improvement, but that's for a separate discussion/fix). Adding this re-attempt logic within `pool.getAgent(...)` is going to extend that "stall the world" operation. So I decided to have this implemented outside of that method. Since this retry logic is an internal detail anyway, we can move it into the `Pool` once/if we improve the synchronization constructs in it.

Can you not make this blob of code be a static or unsynchronized method in `Agent.Pool`?

Approximately speaking, if you can do this code here, you should be able to do it there.
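
As a rough illustration of that suggestion, here is a minimal sketch, not the actual jtreg code. `AgentRetrySketch`, its nested `Agent` and `Pool`, the `getAgentWithRetries` helper, and the simulated failure counter are all hypothetical stand-ins; the real `getAgent(...)` takes the scratch directory, JDK and VM options shown in the quoted diff. The point is only that a static, unsynchronized method takes the pool lock once per attempt and releases it between attempts, which is the same locking behaviour as running the loop in `RegressionScript`.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for jtreg's Agent and Agent.Pool; not the real classes.
class AgentRetrySketch {

    static class Agent {
        final int id;
        Agent(int id) { this.id = id; }
    }

    static class Pool {
        private final List<Agent> idle = new ArrayList<>();
        private int created = 0;
        private int failuresToSimulate; // assumption: stands in for a failed socket connection

        Pool(int failuresToSimulate) { this.failuresToSimulate = failuresToSimulate; }

        // Synchronized, like the existing pool methods: the lock is held
        // for the duration of a single creation attempt.
        synchronized Agent getAgent() throws IOException {
            if (!idle.isEmpty()) {
                return idle.remove(idle.size() - 1);
            }
            if (failuresToSimulate > 0) {
                failuresToSimulate--;
                throw new IOException("simulated agent creation failure");
            }
            return new Agent(++created);
        }

        // Returning an agent also needs the lock, so it can only stall
        // while one attempt is in progress, not across all retries.
        synchronized void save(Agent agent) { idle.add(agent); }

        // Static and NOT synchronized: the pool lock is acquired only inside
        // each getAgent() call and released between attempts.
        static Agent getAgentWithRetries(Pool pool, int maxAttempts) throws IOException {
            if (maxAttempts < 1) {
                throw new IllegalArgumentException("maxAttempts must be at least 1");
            }
            IOException last = null;
            for (int i = 1; i <= maxAttempts; i++) {
                if (i > 1) {
                    System.err.println("Re-attempting agent creation, attempt number " + i);
                }
                try {
                    return pool.getAgent();
                } catch (IOException e) {
                    last = e; // remember the failure and try again
                }
            }
            throw last; // every attempt failed; report the most recent cause
        }
    }

    public static void main(String[] args) throws IOException {
        Pool pool = new Pool(2); // the first two attempts fail
        Agent agent = Pool.getAgentWithRetries(pool, 3);
        System.out.println("Got agent " + agent.id);
    }
}
```

With the retry loop living in the pool class, the caller in `RegressionScript` would reduce to a single call, and the loop could later move inside `getAgent(...)` itself if the pool's synchronization is reworked.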

-------------

PR Review Comment: https://git.openjdk.org/jtreg/pull/173#discussion_r1391523349