RFR: 8170896: TEST_BUG: java/rmi/server/Unreferenced/leaseCheckInterval/LeaseCheckInterval.java failed with unreferenced() not invoked after 20.0 seconds [v6]
Stuart Marks
smarks at openjdk.org
Tue Feb 3 06:00:07 UTC 2026
On Thu, 29 Jan 2026 10:25:34 GMT, Jaikiran Pai <jpai at openjdk.org> wrote:
>> Jaikiran Pai has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains nine additional commits since the last revision:
>>
>> - Stuart's review - no need to write start time in a file
>> - merge latest from master branch
>> - merge latest from master branch
>> - merge latest from master branch
>> - Mark's suggestion - move the duration check to separate method
>> - merge latest from master branch
>> - Mark's review - use CountDownLatch
>> - merge latest from master branch
>> - 8170896: TEST_BUG: java/rmi/server/Unreferenced/leaseCheckInterval/LeaseCheckInterval.java failed with unreferenced() not invoked after 20.0 seconds
>
> Thank you Stuart for the review.
>
>> The observation is that the timeout logic is entirely within the parent JVM. The child's role is simply to obtain a lease and halt immediately. The parent JVM could be changed to call JavaVM.waitFor() to await the child's termination and to obtain a timestamp. This would remove the need to create a temporary file into which the child writes a timestamp from which the parent reads the timestamp. In addition, waitFor() allows the parent to check the child's exit status and signal an error if the status is nonzero.
>>
>> I guess one could argue that having the child obtain the timestamp immediately after the registry call is closer to the beginning of the actual lease. However, I don't think this makes much of a practical difference.
>
> What you suggest is reasonable and simpler as well. I've updated the PR to remove the part where we used to write the start time to a file and then read it.
>
> With the updated log messages in this PR, I think if this continues to fail intermittently, we will have some useful log messages which can then help us decide if we have to track the start time more closely in the SelfTerminator process.
>
>
>> Another small cleanup item: is it necessary to have a @modules declaration in the test? I don't see anything in the test that uses RMI internals. (The Test Library might need this stuff, though, not sure.)
>
> I gave that a try, but turns out the call to a test library class in this test:
>
>> TestLibrary.getRegistryPort(localRegistry);
>
> requires the use of these explicit "--add-exports" (through the `@modules`) because `TestLibrary` references those internal classes.
@jaikiran Thanks for the updates.
Thanks for checking the `@modules` declaration. Yeah, the RMI test library does require some of that stuff after all.
Regarding the timestamp cleanup, yes, this looks good. One thing to consider is when to get the start timestamp. The current code gets it after the subprocess was started. An alternative is to get the timestamp after the subprocess has exited. Which is better? Not sure.
The start timestamp is trying to represent the time at which the DGC lease was obtained. To do this accurately we'd have to instrument the DGC code in the parent JVM. This seems like it's too intrusive though. Getting the timestamp (a) after the child has started or (b) after the child has exited are only approximations to this. The timestamp (a) is potentially too early, as we don't know how long it will actually take for the child to start up and obtain the lease. Thus, the measured time (a) might be too long. The timestamp (b) is potentially too late, as the child had previously obtained the timestamp and has exited, and the parent has been notified, before it gets the timestamp. Thus, the measured time (b) might be too short.
Maybe this makes no practical difference, but our CI environment is hostile enough that sometimes I'm startled by how long it takes to do something like forking a child process.
If the problem is that the test times out spuriously, it may be reasonable to consider moving the timestamp to (b) after the child has exited, so that the chance of spurious timeouts is reduced. However it's probably not worth fiddling around with this more at the moment. Let's just integrate this and if it works, great. But if the test continues to fail intermittently, maybe consider moving the start timestamp along with other possible adjustments.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/28919#issuecomment-3839226368
More information about the core-libs-dev
mailing list